Question: Suppose you are developing a classification procedure with 1,200 candidate predictors (features) and only 200 observations of the class labels. To reduce the number of candidate predictors and focus on the more promising subset, you select the 200 predictors with the largest absolute correlation with the observed class labels. You then fit various classification models using this subset of highly correlated predictors, and you would like to select the one model that is expected to perform best on a test sample. You decide to use 10-fold cross-validation to estimate the test-set performance of the candidate models on the subset of highly correlated predictors. Do you expect these 10-fold cross-validation estimates to be valid? Explain your answer.
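One way to probe the question is a small simulation, sketched below under stated assumptions. The predictors are pure noise and the labels are independent of them, so no classifier can truly do better than about 50% accuracy. The screening step is then run either once on all 200 observations before cross-validation (as described in the question) or separately inside each training fold. The nearest-centroid classifier, the random seed, and the helper names `top_k_by_abs_corr`, `nearest_centroid_predict`, and `cv_accuracy` are illustrative choices, not part of the original question. If the estimate from screening before cross-validation comes out well above chance on such data, that suggests the estimates are not valid: the held-out labels already influenced which predictors were kept.

```python
import numpy as np


def top_k_by_abs_corr(X, y, k):
    """Indices of the k columns of X with the largest |correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = (Xc.T @ yc) / denom
    return np.argsort(-np.abs(corr))[:k]


def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test row to the class whose training centroid is closer."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)


def cv_accuracy(X, y, k, n_folds, screen_inside, rng):
    """K-fold CV accuracy, screening either inside each fold or once up front."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    if not screen_inside:
        # Screening on the full data: uses every label, including held-out ones.
        cols = top_k_by_abs_corr(X, y, k)
    correct = 0
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        if screen_inside:
            # Screening repeated using the training part of this fold only.
            cols = top_k_by_abs_corr(X[train_mask], y[train_mask], k)
        pred = nearest_centroid_predict(
            X[train_mask][:, cols], y[train_mask], X[test_idx][:, cols]
        )
        correct += int((pred == y[test_idx]).sum())
    return correct / n


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n, p, k = 200, 1200, 200        # sizes taken from the question
X = rng.standard_normal((n, p))                  # noise predictors
y = rng.permutation(np.repeat([0, 1], n // 2))   # labels independent of X

acc_outside = cv_accuracy(X, y, k, 10, screen_inside=False, rng=rng)
acc_inside = cv_accuracy(X, y, k, 10, screen_inside=True, rng=rng)
print(f"screening before CV: {acc_outside:.2f}")
print(f"screening inside CV: {acc_inside:.2f}")
```

The only difference between the two runs is where the screening step sits relative to the fold split, which isolates the effect the question asks about. Treating the screening as part of the model-fitting procedure, and repeating it within every fold, is the usual remedy.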
