Question: Sample-splitting for logistic regression

Sample-splitting can be done for logistic regression too, but it requires us to be a bit more careful about how we choose the training and testing sets. In particular, we want to ensure that the ratio of 0s to 1s (assuming, without loss of generality, that the response variable takes these two values) ends up being roughly equal in the training and testing sets.

For the current wage dataset, consider as the response variable the indicator that `wage` is above 250 (recall, `wage` is measured in thousands of dollars!). Discard, for the moment, all observations that correspond to an education level of less than HS graduate (recall, the reason for this important step was explored in Lab 12 __4(c)__ and __(d)__). Then split the remaining observations into training and testing sets, but do so in a way that maintains equal ratios of 0s to 1s in the two sets, as best as possible.

Once you have done this, fit two models on the training set. The first is a logistic model of `I(wage>250)` on `year`, `age`, and `education`. The second is an additive logistic model of `I(wage>250)` on `year`, `s(age)`, and `education`. Now, on
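The class-balanced split described in the question can be sketched as follows. This is a minimal illustration in plain Python, not the lab's own solution (the lab appears to use R). The function name `stratified_split`, the 50/50 split fraction, the seed, and the toy labels are all assumptions made for the example and do not come from the wage data. The idea is to shuffle the indices of the 0s and the 1s separately and send the same fraction of each class to the testing set.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.5, seed=0):
    """Split indices so each class contributes the same fraction to the test set.

    Hypothetical helper for illustration; test_fraction is an assumed value.
    """
    rng = random.Random(seed)

    # Group observation indices by their response value (0 or 1).
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)  # randomize within the class only
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])

    return sorted(train_idx), sorted(test_idx)


# Toy, made-up response: 1 stands in for "wage > 250", which is a rare event.
labels = [1] * 20 + [0] * 980

train_idx, test_idx = stratified_split(labels, test_fraction=0.5, seed=1)
train_ones = sum(labels[i] for i in train_idx)
test_ones = sum(labels[i] for i in test_idx)
print(train_ones, len(train_idx), test_ones, len(test_idx))
```

Splitting within each class matters here because the 1s are rare: a plain random split could easily leave one of the two sets with very few high-wage observations, which would distort both the fit and the test error.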
