Question: Implement the K-means algorithm for the four-dimensional data given

Implement the K-means algorithm for the four-dimensional data given.

The starting points for the K cluster means can be chosen randomly.

K = 3

Run the algorithm 10 times with different initial points and compute the sum of squared error (SSE) after each run. Select the solution that gives the lowest SSE over the 10 runs. For convergence, check whether any data point changes its cluster assignment relative to its previous assignment; if one or more points change cluster, continue optimizing.
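The procedure described above can be sketched in plain Python as follows. This is only one possible reading of the assignment, not a reference solution: the function names (`kmeans_once`, `best_of_runs`), the choice of K distinct data points as random starting means, the fixed seed, and the rule that an empty cluster keeps its previous mean are all assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans_once(data, k, rng):
    """One K-means run from random starting means; returns (sse, means, labels)."""
    # Assumption: the random starting means are k distinct data points.
    means = [list(p) for p in rng.sample(data, k)]
    labels = None
    while True:
        # Assign every point to its nearest mean.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, means[j]))
                      for p in data]
        if new_labels == labels:
            break  # no point changed cluster: converged
        labels = new_labels
        # Recompute each mean from its current members.
        for j in range(k):
            members = [p for p, l in zip(data, labels) if l == j]
            if members:  # assumption: an empty cluster keeps its old mean
                means[j] = mean(members)
    sse = sum(sq_dist(p, means[l]) for p, l in zip(data, labels))
    return sse, means, labels

def best_of_runs(data, k=3, runs=10, seed=0):
    """Repeat K-means `runs` times and keep the run with the lowest SSE."""
    rng = random.Random(seed)
    return min((kmeans_once(data, k, rng) for _ in range(runs)),
               key=lambda result: result[0])
```

Calling `best_of_runs(data, k=3, runs=10)` on a list of four-dimensional points would return the lowest SSE found together with the corresponding means and the cluster label of each point.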

For each cluster Ci, give:

(i) its mean

(ii) its size

(iii) the list of data points assigned to it
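The three items requested per cluster could be collected with a small helper like the one below. It is a sketch under stated assumptions: `summarize_clusters` is a hypothetical name, the cluster labels are taken to be integers from 0 to K-1, and the labels in the usage lines are illustrative only, not the result of an actual K-means run.

```python
def summarize_clusters(names, data, labels, k):
    """Return, per cluster, its mean, its size, and the names of its members."""
    summary = []
    for j in range(k):
        idx = [i for i, label in enumerate(labels) if label == j]
        members = [data[i] for i in idx]
        # An empty cluster has no mean; report None in that case.
        mean = ([sum(col) / len(members) for col in zip(*members)]
                if members else None)
        summary.append({
            "cluster": j + 1,
            "mean": mean,
            "size": len(idx),
            "members": [names[i] for i in idx],
        })
    return summary

# Illustrative labels only (not the output of a real run).
names = ["x1", "x2", "x6"]
data = [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
for cluster in summarize_clusters(names, data, labels=[0, 0, 1], k=2):
    print(cluster)
```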

The data set given below is to be stored in a data file.

Write a program for it.

data      dimension
number    d1     d2     d3     d4
x1        5.1    3.5    1.4    0.2
x2        4.9    3      1.4    0.2
x3        4.7    3.2    1.3    0.2
x4        4.6    3.1    1.5    0.2
x5        5      3.6    1.4    0.2
x6        6.7    3      5.2    2.3
x7        6.3    2.5    5      1.9
x8        6.5    3      5.2    2
x9        6.2    3.4    5.4    2.3
x10       5.9    3      5.1    1.8
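Since the data is meant to live in a data file, a loader along these lines might be used to read it. The sketch assumes the file keeps the whitespace-separated layout shown above, including the two header rows; `load_points` is a hypothetical name, and an in-memory `io.StringIO` stands in for the real file so the example is self-contained.

```python
import io

def load_points(lines):
    """Parse rows like 'x1 5.1 3.5 1.4 0.2'; skip header and blank lines."""
    names, data = [], []
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # blank line or the 'data dimension' header row
        try:
            values = [float(v) for v in parts[1:]]
        except ValueError:
            continue  # header row such as 'number d1 d2 d3 d4'
        names.append(parts[0])
        data.append(values)
    return names, data

# Stand-in for the data file, holding the rows from the question.
sample = io.StringIO("""data dimension
number d1 d2 d3 d4
x1 5.1 3.5 1.4 0.2
x2 4.9 3 1.4 0.2
x3 4.7 3.2 1.3 0.2
x4 4.6 3.1 1.5 0.2
x5 5 3.6 1.4 0.2
x6 6.7 3 5.2 2.3
x7 6.3 2.5 5 1.9
x8 6.5 3 5.2 2
x9 6.2 3.4 5.4 2.3
x10 5.9 3 5.1 1.8
""")

names, data = load_points(sample)
print(len(data), "points loaded; first:", names[0], data[0])
```

With a real file, the same function could be given an open file object instead, for example `with open(path) as f: names, data = load_points(f)`, where `path` is whatever name the data file is saved under.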
