Question: With Undecided as the target (or response) variable, construct a series of k-nearest neighbor classifiers as directed by parts (a), (b), and (c) below. Evaluate a range of values of k, and standardize the input variables to adjust for their different magnitudes.

a. Use only the continuous variables (Age, HouseholdSize, Income, and Education) as input variables. For a default cutoff value of 0.5, what value of k minimizes the overall error rate on a static validation set or through a 10-fold cross-validation procedure?

b. Use all eight variables as input variables. For a default cutoff value of 0.5, what value of k minimizes the overall error rate on a static validation set or through a 10-fold cross-validation procedure?

c. Generally, caution is recommended when combining continuous and categorical input variables, because the concept of distance differs for these two types of variables. Compare the overall error rates of the models from parts (a) and (b). For these data, does combining variable types degrade performance?

Please answer in Excel and show all work.
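
The procedure the question describes (standardize the inputs, score a range of k values with 10-fold cross-validation, classify at a 0.5 cutoff, keep the k with the lowest overall error rate) can be sketched in a few lines. What follows is a minimal Python sketch rather than an Excel workbook, and it runs on synthetic data because the actual data file is not included here. The helper names (`standardize`, `knn_prob`, `cv_error`, `make_synthetic`), the range k = 1 to 15, and the rule "predict 1 when the neighbor share is at least 0.5" are all assumptions made for illustration; the error rates it prints say nothing about the real data set.

```python
import math
import random

def standardize(train, test):
    """Z-score both sets using the training set's column means and std devs."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Population std dev; fall back to 1.0 for a constant column.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)

def knn_prob(train_x, train_y, x, k):
    """Share of the k nearest training rows (Euclidean distance) labeled 1."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return sum(label for _, label in nearest) / k

def cv_error(X, y, k, folds=10, cutoff=0.5, seed=0):
    """Overall error rate of k-NN estimated by k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    errors = 0
    for f in range(folds):
        test_idx = idx[f::folds]          # every folds-th shuffled row
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        tr_x, te_x = standardize([X[i] for i in train_idx],
                                 [X[i] for i in test_idx])
        tr_y = [y[i] for i in train_idx]
        for row, i in zip(te_x, test_idx):
            # Assumed cutoff rule: class 1 when the neighbor share >= cutoff.
            pred = 1 if knn_prob(tr_x, tr_y, row, k) >= cutoff else 0
            errors += pred != y[i]
    return errors / len(X)

def make_synthetic(n=200, seed=1):
    """Synthetic stand-in for the real data (illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        age = rng.uniform(18, 80)
        household_size = rng.randint(1, 6)
        income = rng.uniform(20_000, 150_000)
        education = rng.randint(8, 20)
        score = (age - 49) / 18 + (income - 85_000) / 37_000 + rng.gauss(0, 1)
        X.append([age, household_size, income, education])
        y.append(1 if score > 0 else 0)
    return X, y

X, y = make_synthetic()
results = {k: cv_error(X, y, k) for k in range(1, 16)}
best_k = min(results, key=results.get)
print("best k:", best_k, "cross-validated error rate:", results[best_k])
```

For part (b), the same functions could be reused after appending the categorical inputs to each row as 0/1 indicator columns; comparing the two minimum error rates then addresses part (c). Standardizing inside each fold, using only that fold's training rows, keeps the held-out rows from influencing the scaling.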
