Question: This question should be answered using the Ionosphere data set, which is part of the mlbench package. This radar data was collected by a system in Goose Bay, Labrador. The data frame consists of 351 observations on 35 independent variables. The last column in the data frame is a categorical variable, "Class", that labels each radar return from free electrons in the ionosphere: "good" returns are those showing evidence of some type of structure in the ionosphere; "bad" returns are those that do not.
(a) Produce some numerical and graphical summaries of the Ionosphere data. Do there appear to be any patterns?
(b) Notice that the second column contains only a single value, so remove that column and work on the rest of the questions using the new data set. Perform a K-Nearest Neighbors (KNN) algorithm with K = 1, where Class is the response and the remaining columns in the data set are the predictors.
(c) Compute the confusion matrix and the overall fraction of correct predictions. Explain what the confusion matrix is telling you about the types of mistakes made by the KNN algorithm.
(d) Split the data randomly into a training set (70%) and a test set (30%). Make sure to use set.seed(4323) for reproducible results. Fit the KNN model (K = 3).
(e) Repeat (d) using K = 5.
(f) Repeat (d) using K = 7.
(g) Which of these methods appears to provide the best results on this data?
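The original answer is not shown on this page, so what follows is only a minimal sketch of how parts (a) through (f) might be approached in R, not the posted solution. It assumes the mlbench and class packages are installed and that knn() from class is an acceptable KNN implementation. The names ion, x, y, train_idx and evaluate_k are hypothetical, and converting the two-level factor V1 to a number is an assumption made so that distances can be computed.

```r
# Sketch only; assumes the mlbench and class packages are installed.
library(mlbench)  # provides the Ionosphere data set
library(class)    # provides knn()

data(Ionosphere)

# (a) Numerical summaries; a graphical summary could use, for example,
# boxplot(V3 ~ Class, data = Ionosphere).
str(Ionosphere)
print(table(Ionosphere$Class))

# (b) Drop the constant second column. V1 is a factor with levels "0"/"1",
# so it is converted to numeric here (an assumption) before computing distances.
ion <- Ionosphere[, -2]
ion$V1 <- as.numeric(as.character(ion$V1))
x <- ion[, names(ion) != "Class"]
y <- ion$Class

# (b)-(c) KNN with K = 1, fitted and evaluated on the full data set.
pred_full <- knn(train = x, test = x, cl = y, k = 1)
conf_full <- table(Predicted = pred_full, Actual = y)
print(conf_full)
print(mean(pred_full == y))  # overall fraction of correct predictions

# (d)-(f) 70/30 split with the seed given in the question.
set.seed(4323)
n <- nrow(ion)
train_idx <- sample(n, size = round(0.7 * n))

# Hypothetical helper: fit on the training rows, evaluate on the test rows.
evaluate_k <- function(k) {
  pred <- knn(train = x[train_idx, ], test = x[-train_idx, ],
              cl = y[train_idx], k = k)
  list(confusion = table(Predicted = pred, Actual = y[-train_idx]),
       accuracy = mean(pred == y[-train_idx]))
}

ks <- c(3, 5, 7)
results <- lapply(ks, evaluate_k)
names(results) <- paste0("K=", ks)
print(sapply(results, function(r) r$accuracy))
```

With K = 1 evaluated on the same data used for fitting, each observation is typically its own nearest neighbour, so the in-sample accuracy in (c) is optimistic and is not directly comparable with the test-set accuracies from (d) through (f). For (g), comparing the test accuracies and confusion matrices in results is one reasonable basis for choosing among K = 3, 5 and 7.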
