Problem Statement: k-nearest neighbor classification for the Iris data set.

The Iris data set has three species: Setosa, Versicolor, and Virginica. Each species has 50 data points. We can call each species a class. Consider the first 40 data points of each class as training samples and the remaining 10 data points of each species as test/target data points.
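The split described above can be sketched in R as follows. This is a minimal sketch, assuming the built-in `iris` data frame (which is ordered by species, 50 rows each) is the intended data set; the names `pos`, `train`, and `test` are illustrative and not part of the problem statement.

```r
# iris ships with base R (datasets package): 150 rows, 50 per species.
data(iris)

# Position of each row within its own species (1..50), so the split does not
# depend on hard-coded row numbers.
pos <- ave(seq_len(nrow(iris)), iris$Species, FUN = seq_along)

train <- iris[pos <= 40, ]  # first 40 points of each species
test  <- iris[pos > 40, ]   # remaining 10 points of each species

table(train$Species)
table(test$Species)
```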

Use k-Nearest Neighbor (k-NN) algorithm to classify those test/target data points into proper species/classes.

Consider using different values for k (1, 3, 9).
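The voting step of k-NN can be illustrated on a toy example. This is a sketch under stated assumptions: `knn_vote` is a hypothetical helper name, the distances are made up for illustration, and ties are broken by taking the first label in sorted order, which the problem statement does not specify.

```r
# Return the majority label among the k training points with the smallest
# distance to a single test point.
knn_vote <- function(dists, labels, k) {
  nearest <- order(dists)[seq_len(k)]  # indices of the k closest points
  votes <- table(labels[nearest])      # count labels among the neighbors
  names(votes)[which.max(votes)]       # assumption: first maximum wins ties
}

# Toy data: distances from one test point to five training points.
dists  <- c(0.5, 0.1, 0.9, 0.3, 0.7)
labels <- c("a", "b", "a", "b", "a")

knn_vote(dists, labels, 1)  # "b"
knn_vote(dists, labels, 3)  # "b"
knn_vote(dists, labels, 5)  # "a"
```

With three classes, odd values of k such as 3 and 9 can still produce ties (for example one vote per class), so the tie-breaking rule is worth stating in a submitted solution.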

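For each value of k, consider a different distance metric (n = 1, 2, ∞) using the following general distance measure, or norm:

$$L_n(x, y) = \left( \sum_{i=1}^{d} |x_i - y_i|^n \right)^{1/n}$$

where x and y are two data points and d is the number of dimensions (features) of each data point. For n = ∞ the measure reduces to $\max_i |x_i - y_i|$.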
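The general norm can be written as a small R function. This is a minimal sketch; the function name `minkowski` and its argument names are illustrative, and the n = ∞ case is handled separately as the maximum absolute coordinate difference.

```r
# General L_n distance between two numeric vectors of equal length.
minkowski <- function(x, y, n) {
  diffs <- abs(x - y)
  if (is.infinite(n)) {
    max(diffs)            # n = infinity: largest coordinate difference
  } else {
    sum(diffs^n)^(1 / n)  # n = 1: Manhattan, n = 2: Euclidean
  }
}

minkowski(c(0, 0), c(3, 4), 1)    # 7
minkowski(c(0, 0), c(3, 4), 2)    # 5
minkowski(c(0, 0), c(3, 4), Inf)  # 4
```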

Submit:

RStudio code of your solution.

Fill out the following table with the number of incorrectly classified data points in each scenario (k, n, genuine species of 10 test data points).

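| Genuine species of 10 test data points | (k, n) = (1, 1) | (k, n) = (1, 2) | (k, n) = (1, ∞) | (k, n) = (3, 1) | (k, n) = (3, 2) | (k, n) = (3, ∞) | (k, n) = (9, 1) | (k, n) = (9, 2) | (k, n) = (9, ∞) |
|---|---|---|---|---|---|---|---|---|---|
| Setosa | | | | | | | | | |
| Versicolor | | | | | | | | | |
| Virginica | | | | | | | | | |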
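One possible way to compute the table entries in R is sketched below. It is not an official solution: it assumes the built-in `iris` data, uses base R's `dist()` (methods `"minkowski"` and `"maximum"`) in place of a hand-written norm, and breaks distance and vote ties by training order and level order respectively. The names `cross_dist`, `scen`, and `errors` are illustrative.

```r
data(iris)
pos <- ave(seq_len(nrow(iris)), iris$Species, FUN = seq_along)
train_x <- as.matrix(iris[pos <= 40, 1:4]); train_y <- iris$Species[pos <= 40]
test_x  <- as.matrix(iris[pos > 40, 1:4]);  test_y  <- iris$Species[pos > 40]

# Test-by-train distance matrix for norm n, taken from the full pairwise
# matrix that dist() returns for the stacked rows.
cross_dist <- function(n) {
  stacked <- rbind(test_x, train_x)
  d <- if (is.infinite(n)) {
    dist(stacked, method = "maximum")
  } else {
    dist(stacked, method = "minkowski", p = n)
  }
  as.matrix(d)[seq_len(nrow(test_x)), nrow(test_x) + seq_len(nrow(train_x))]
}

# Scenarios in table order: n varies fastest within each k.
scen <- expand.grid(n = c(1, 2, Inf), k = c(1, 3, 9))
errors <- matrix(NA_integer_, nrow = nlevels(test_y), ncol = nrow(scen),
                 dimnames = list(levels(test_y),
                                 sprintf("(%s, %s)", scen$k, scen$n)))

for (j in seq_len(nrow(scen))) {
  d <- cross_dist(scen$n[j])
  k <- scen$k[j]
  # Majority vote among the k nearest training points for each test row.
  pred <- apply(d, 1, function(row) {
    votes <- table(train_y[order(row)[seq_len(k)]])
    names(votes)[which.max(votes)]  # assumption: first maximum wins ties
  })
  wrong <- pred != as.character(test_y)
  errors[, j] <- tapply(wrong, test_y, sum)  # misclassified count per species
}

print(errors)
```

Each column of `errors` corresponds to one (k, n) scenario and each row to the genuine species, so the printed matrix has the same layout as the table.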
