Question:

Suppose we run the KMeans algorithm with K = 2 on the following set of 2D data points: {(2,3),(3,3),(9,5),(10,5),(11,5),(2,4),(10,6),(3,4),(9,6),(11,6)}, starting with initial centroids at (2,3) and (9,5).
1. Perform one iteration of the KMeans algorithm (assignment and update steps). Provide the new centroids after this iteration and assign each point to the nearest centroid.
2. The KMeans algorithm aims to minimize the Within-Cluster Sum of Squares (WCSS). Explain the reasoning behind this objective function. What are the potential limitations or drawbacks of this approach?
3. Investigate the impact of initial centroid selection in the KMeans algorithm.
(a) Analyze at least two common strategies, discussing their pros and cons.
(b) Propose your own method for selecting initial centroids and validate its potential benefits over the existing methods that you studied in (a) using a challenging dataset of your choice.
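
As a starting point for parts 1 and 2, the assignment step, the update step, and the WCSS objective can be sketched in plain Python. This is a minimal sketch rather than a reference implementation: the helper names (`sq_dist`, `assign`, `update`, `wcss`) are illustrative choices that do not come from the question, and the update step assumes no cluster ends up empty, which holds for this data set.

```python
# Data and initial centroids exactly as given in the question.
points = [(2, 3), (3, 3), (9, 5), (10, 5), (11, 5),
          (2, 4), (10, 6), (3, 4), (9, 6), (11, 6)]
initial_centroids = [(2, 3), (9, 5)]


def sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def update(points, labels, k):
    """Update step: each centroid becomes the mean of its assigned points."""
    centroids = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        # Assumption: every cluster has at least one member.
        centroids.append((sum(x for x, _ in members) / len(members),
                          sum(y for _, y in members) / len(members)))
    return centroids


def wcss(points, labels, centroids):
    """Within-Cluster Sum of Squares for a given assignment and centroids."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


labels = assign(points, initial_centroids)
new_centroids = update(points, labels, k=2)

print(labels)                                    # [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
print(new_centroids)                             # [(2.5, 3.5), (10.0, 5.5)]
print(wcss(points, labels, initial_centroids))   # 17
print(wcss(points, labels, new_centroids))       # 7.5
```

With the assignment held fixed, moving each centroid to the mean of its cluster lowers the WCSS from 17 to 7.5, which illustrates why the update step never increases the objective.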
