Question:
Consider a synthetic data set generated as follows. The explanatory variable follows a standard normal distribution. The response label is 0 if the explanatory variable lies between the 0.05 and 0.95 quantiles of the standard normal distribution, and 1 otherwise. The data set was generated using the following code.

Compare the \(K\)-nearest neighbors classifier with \(K=5\) and the logistic regression classifier. Without computation, which classifier is likely to perform better on these data? Verify your answer by coding both classifiers and printing the corresponding training 0-1 loss.
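import numpy as np
import scipy.stats

# generate data
np.random.seed(12345)
N = 100
X = np.random.randn(N)
q = scipy.stats.norm.ppf(0.95)  # 0.95 quantile of the standard normal
y = np.zeros(N)
y[X >= q] = 1
y[X <= -q] = 1  # source is cut off after "y[X"; completed from the problem statement (the 0.05 quantile is -q by symmetry)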
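One way to check the comparison is sketched below. The label is 1 in both tails and 0 in the middle, so the true decision rule needs two thresholds on X. Logistic regression on the single raw feature can only place one threshold, while K-nearest neighbors adapts to local structure, so KNN is the likely better fit here. The sketch assumes scikit-learn is available (the original code uses only NumPy and SciPy) and uses default settings for both classifiers. The variable names `X2d`, `knn`, `logreg`, `knn_loss`, and `logreg_loss` are illustrative, and the lower-tail labeling follows the problem statement because the original code is cut off at that line.

```python
import numpy as np
import scipy.stats
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import zero_one_loss
from sklearn.neighbors import KNeighborsClassifier

# Regenerate the data as described in the problem statement.
np.random.seed(12345)
N = 100
X = np.random.randn(N)
q = scipy.stats.norm.ppf(0.95)  # 0.95 quantile; the 0.05 quantile is -q by symmetry
y = np.zeros(N)
y[np.abs(X) >= q] = 1  # label 1 in both tails (assumed completion of the truncated source line)

# scikit-learn expects a 2-D feature matrix of shape (N, 1).
X2d = X.reshape(-1, 1)

# Fit both classifiers on the full data set (default hyperparameters apart from K=5).
knn = KNeighborsClassifier(n_neighbors=5).fit(X2d, y)
logreg = LogisticRegression().fit(X2d, y)

# Training 0-1 loss: the fraction of training points that are misclassified.
knn_loss = zero_one_loss(y, knn.predict(X2d))
logreg_loss = zero_one_loss(y, logreg.predict(X2d))

print("KNN (K=5) training 0-1 loss:", knn_loss)
print("Logistic regression training 0-1 loss:", logreg_loss)
```

Note that both numbers are training losses, as the question requests. They show how well each model can represent the two-sided rule, not how well it would generalize to new data.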
