Question: 1. (15 pts) Given matrix X, where each row represents a different data point. You are asked to perform k-means clustering on this dataset using

1. (15 pts) Given a matrix X, where each row represents a different data point, you are asked to perform k-means clustering on this dataset using the Euclidean distance as the distance function. Here k is chosen as 3. The Euclidean distance between a vector x = [x1, x2, ..., xp] and a vector y = [y1, y2, ..., yp], both in R^p, is defined as $d(x, y) = \sqrt{\sum_{i=1}^{p} (x_i - y_i)^2}$ (the L2 distance between x and y). All data in X are given below. Three points (x3, x5, x8) were randomly chosen as the initial centers of the three clusters: μ1 = (5.5, 3.0), μ2 = (6.5, 3.0), μ3 = (6.5, 3.5).

X = [x1, x2, x3, x4, x5, x6, x7, x8] = [(3.5, 3.5), (3.0, 3.0), (5.5, 3.0), (5.5, 2.5), (6.5, 3.5), (5.0, 4.0), (5.5, 3.5), (6.5, 3.0)]

(1) What is the center of the first cluster (μ1) after one iteration? (Hint: one iteration includes two steps, i.e., assigning the data to clusters and estimating the new cluster centers.) (3 pts)
(2) What is the center of the second cluster (μ2) after two iterations? (3 pts)
(3) How many iterations are required for the k-means algorithm to converge? Specify the final centers of the three clusters at convergence. (6 pts)
(4) Using X, perform simple k-means clustering with the scikit-learn library and visualize the final results. (3 pts)
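The two-step iteration described in the hint (assign, then re-estimate) can be traced by hand or with a short script. The following is a minimal sketch in plain Python, not a reference solution: the function names `assign`, `update`, and `kmeans_history` are made up for illustration, ties are assumed to go to the lowest-numbered cluster, and an empty cluster is assumed to keep its previous center. The data and initial centers are copied from the problem statement.

```python
import math

# Data points x1..x8 and initial centers mu1..mu3, copied from the problem statement.
X = [(3.5, 3.5), (3.0, 3.0), (5.5, 3.0), (5.5, 2.5),
     (6.5, 3.5), (5.0, 4.0), (5.5, 3.5), (6.5, 3.0)]
INIT_CENTERS = [(5.5, 3.0), (6.5, 3.0), (6.5, 3.5)]


def assign(points, centers):
    """Return, for each point, the index of its nearest center (Euclidean distance).

    Assumption: ties go to the lowest-numbered cluster, because min() keeps the first minimum.
    """
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def update(points, labels, centers):
    """Return the mean of each cluster; an empty cluster keeps its old center (assumption)."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


def kmeans_history(points, centers, max_iter=100):
    """Run assign/update iterations and return the centers obtained after each one."""
    history = []
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        history.append(new_centers)
        if new_centers == centers:  # centers unchanged, so the assignments are stable
            break
        centers = new_centers
    return history


history = kmeans_history(X, INIT_CENTERS)
for i, centers in enumerate(history, start=1):
    print(f"iteration {i}: {centers}")
```

Under these assumptions, the sketch gives μ1 ≈ (4.667, 3.25) after one iteration and μ2 = (6.0, 2.75) after two. The centers stop changing at μ1 = (3.25, 3.25), μ2 = (5.375, 3.25), μ3 = (6.5, 3.25): the fourth update is the last one that moves a center, and a fifth pass confirms that nothing changes. Whether that counts as four or five iterations depends on the convention used in the course. For part (4), scikit-learn's `KMeans` accepts an array of starting centers through its `init` parameter (with `n_init=1`), which should make its result comparable to the hand computation.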
