Question:

Consider a synthetic data set that was generated in the following fashion. The explanatory variable follows a standard normal distribution. The response label is 0 if the explanatory variable lies between the 0.05 and 0.95 quantiles of the standard normal distribution, and 1 otherwise. The data set was generated using the code shown below.


Compare the \(K\)-nearest neighbors classifier with \(K=5\) and the logistic regression classifier. Without computation, which classifier is likely to be better for these data? Verify your answer by coding both classifiers and printing the corresponding training 0-1 loss.

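import numpy as np
import scipy.stats

# generate data
np.random.seed(12345)
N = 100
X = np.random.randn(N)
q = scipy.stats.norm.ppf(0.95)  # 0.95 quantile; by symmetry, -q is the 0.05 quantile
y = np.zeros(N)
y[X >= q] = 1
y[X <= -q] = 1  # source line is cut off after "y[X"; completed from the problem statement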
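Without computation, the \(K\)-nearest neighbors classifier looks like the better choice. Logistic regression on the single feature splits the real line at one threshold, whereas the label 1 occurs in both tails, so no single threshold can classify both tails and the middle correctly. A local method such as \(K\)-nearest neighbors is not restricted in this way. The following is a minimal sketch of one way to check this, not a definitive solution. It assumes that the cut-off last line of the generating code is `y[X <= -q] = 1`, as the problem statement implies. The helper names `knn_predict`, `logistic_fit`, and `logistic_predict` are illustrative, and the logistic regression is fitted without regularization by Newton's method, which is one choice among several.

```python
import numpy as np
import scipy.stats

# Generate data as in the question. The last labelling line is truncated in
# the source; `X <= -q` is an assumption based on the problem statement.
np.random.seed(12345)
N = 100
X = np.random.randn(N)
q = scipy.stats.norm.ppf(0.95)
y = np.zeros(N)
y[X >= q] = 1
y[X <= -q] = 1


def knn_predict(X_train, y_train, X_new, K=5):
    """Majority vote among the K nearest training points (1-D inputs)."""
    # Pairwise absolute distances, shape (len(X_new), len(X_train)).
    dist = np.abs(X_new[:, None] - X_train[None, :])
    # Indices of the K closest training points for each query point.
    nearest = np.argsort(dist, axis=1)[:, :K]
    votes = y_train[nearest].mean(axis=1)
    return (votes > 0.5).astype(float)


def logistic_fit(X_train, y_train, n_iter=25):
    """Unregularized logistic regression fitted by Newton's method."""
    A = np.column_stack([np.ones_like(X_train), X_train])  # intercept, slope
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))          # fitted probabilities
        grad = A.T @ (p - y_train)                   # gradient of the negative log-likelihood
        hess = A.T @ (A * (p * (1.0 - p))[:, None])  # Hessian of the negative log-likelihood
        beta -= np.linalg.solve(hess, grad)
    return beta


def logistic_predict(beta, X_new):
    """Predict 1 when the linear score is positive (probability above 0.5)."""
    return (beta[0] + beta[1] * X_new > 0).astype(float)


# Training 0-1 loss: fraction of training points that are misclassified.
# For KNN, each training point counts as one of its own K neighbors.
knn_loss = np.mean(knn_predict(X, y, X) != y)
beta = logistic_fit(X, y)
logit_loss = np.mean(logistic_predict(beta, X) != y)

print("KNN (K=5) training 0-1 loss:", knn_loss)
print("Logistic regression training 0-1 loss:", logit_loss)
```

Comparing the two printed losses shows whether the reasoning above holds for this sample. Training loss favors flexible methods such as \(K\)-nearest neighbors, so a comparison on held-out data would be a stronger check.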
