Question:
Implement the K-means algorithm for the four-dimensional data given.
The starting points for the K cluster means can be chosen randomly.
K = 3.
Run the algorithm 10 times with different initial points and compute the sum of squared errors (SSE) after each run. Select the solution that gives the lowest SSE over the 10 runs. For convergence, check whether any data point changes its cluster assignment relative to the previous iteration; if one or more points change cluster, continue optimizing.
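The procedure described above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not a reference solution: the function names (`kmeans_once`, `best_of_runs`), the choice of initial means as K randomly sampled data points, the fixed `seed`, the `max_iter` safety cap, and the rule that an empty cluster keeps its previous mean are all choices made here and are not specified in the question.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, means):
    """Return, for every point, the index of its nearest mean."""
    return [
        min(range(len(means)), key=lambda j: squared_distance(p, means[j]))
        for p in points
    ]


def update_means(points, labels, means):
    """Recompute each mean; an empty cluster keeps its old mean (assumption)."""
    new_means = []
    for j, old in enumerate(means):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_means.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_means.append(list(old))
    return new_means


def kmeans_once(points, k, rng, max_iter=100):
    """One K-means run starting from k randomly chosen data points."""
    means = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):  # max_iter is a safety cap (assumption)
        new_labels = assign(points, means)
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        means = update_means(points, labels, means)
    sse = sum(squared_distance(p, means[lab]) for p, lab in zip(points, labels))
    return means, labels, sse


def best_of_runs(points, k=3, n_runs=10, seed=0):
    """Run K-means n_runs times; return the lowest-SSE run and all SSE values."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(n_runs)]
    best = min(runs, key=lambda run: run[2])
    return best, [run[2] for run in runs]
```

Here `points` is expected to be a list of four-element lists, one per data row. `best_of_runs(points)` returns the `(means, labels, sse)` tuple of the best run together with the SSE of each of the 10 runs, so the per-run errors can be printed as the question asks.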
For each cluster Ci, give:
(i) its mean
(ii) its size
(iii) the list of data points assigned to it
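The three items requested per cluster can be collected from a list of cluster labels, as in the sketch below. The function name `summarize_clusters` and the dictionary layout are hypothetical, and the labels in the small example are made up purely to illustrate the output format; they are not the result of running K-means.

```python
def summarize_clusters(points, labels, names, k):
    """Return mean, size, and member names for each of the k clusters."""
    summary = []
    for j in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        members = [points[i] for i in idx]
        # The mean is the per-dimension average; None marks an empty cluster.
        mean = (
            [sum(col) / len(members) for col in zip(*members)]
            if members
            else None
        )
        summary.append(
            {
                "cluster": j + 1,
                "mean": mean,
                "size": len(members),
                "members": [names[i] for i in idx],
            }
        )
    return summary


# Illustrative input only: three rows of the data set with made-up labels.
points = [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
names = ["x1", "x2", "x6"]
labels = [0, 0, 1]

for row in summarize_clusters(points, labels, names, k=2):
    print(f"C{row['cluster']}: mean={row['mean']}, "
          f"size={row['size']}, members={row['members']}")
```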
The data set given below is to be stored in a data file.
Write a program that reads it and runs the algorithm.
| data number | dimension d1 | dimension d2 | dimension d3 | dimension d4 |
| --- | --- | --- | --- | --- |
| x1 | 5.1 | 3.5 | 1.4 | 0.2 |
| x2 | 4.9 | 3 | 1.4 | 0.2 |
| x3 | 4.7 | 3.2 | 1.3 | 0.2 |
| x4 | 4.6 | 3.1 | 1.5 | 0.2 |
| x5 | 5 | 3.6 | 1.4 | 0.2 |
| x6 | 6.7 | 3 | 5.2 | 2.3 |
| x7 | 6.3 | 2.5 | 5 | 1.9 |
| x8 | 6.5 | 3 | 5.2 | 2 |
| x9 | 6.2 | 3.4 | 5.4 | 2.3 |
| x10 | 5.9 | 3 | 5.1 | 1.8 |
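Since the question asks for the data to be read from a data file, one possible way to load it is sketched below. The file layout (one row per line, whitespace-separated, with the data number first) and the name `load_points` are assumptions; the question does not specify a format. The text is read from an in-memory `io.StringIO` stream here so that the example is self-contained, and an open file handle can be passed in the same way.

```python
import io

# The ten rows of the table, in an assumed whitespace-separated layout.
DATA_TEXT = """\
x1 5.1 3.5 1.4 0.2
x2 4.9 3 1.4 0.2
x3 4.7 3.2 1.3 0.2
x4 4.6 3.1 1.5 0.2
x5 5 3.6 1.4 0.2
x6 6.7 3 5.2 2.3
x7 6.3 2.5 5 1.9
x8 6.5 3 5.2 2
x9 6.2 3.4 5.4 2.3
x10 5.9 3 5.1 1.8
"""


def load_points(stream):
    """Parse lines of 'name d1 d2 d3 d4' into parallel name and point lists."""
    names, points = [], []
    for line in stream:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        names.append(fields[0])
        points.append([float(value) for value in fields[1:]])
    return names, points


names, points = load_points(io.StringIO(DATA_TEXT))
print(len(points), "points with", len(points[0]), "dimensions each")
```

To read from disk instead, the same function can be called as `load_points(f)` inside a `with open(path) as f:` block, where `path` is whatever file name is chosen for the data.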
