Question:

1. Consider the following clustering method called Leader Clustering. It receives two parameters: an integer k and a real number t. Similar to k-means, it starts by selecting k instances (which will be called leaders) and assigns each training instance to the cluster of the closest leader. During the assignment step, however, if the distance of a training instance to its closest leader is greater than the input threshold t, then this training instance becomes a new leader. During the same assignment step, the remaining points can be assigned to these new leaders. After all the training instances have been assigned to a leader's cluster, new leaders are calculated by averaging each cluster. The process is then repeated until the cluster assignments do not change. [Use diagrams to illustrate the following answers]
a. Given a dataset and a value k, let t vary from 0 to a very large value. When does Leader Clustering produce more, the same number, or fewer clusters than k-means, assuming that the k initial centers are the same for both? When will the clusterings produced by Leader Clustering and k-means be identical?
b. Which of the two methods, k-means or Leader Clustering, will be better at dealing with outliers (data instances that are far away from or very different from the other instances in the dataset)? Explain.
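
The procedure described in the question can be sketched in Python as follows. This is only an illustrative reading of the description, not a reference implementation: the function name `leader_clustering`, the use of Euclidean distance, processing points in input order, dropping clusters that end up empty, and the `max_iter` safety cap are all assumptions that the question does not specify.

```python
import math


def leader_clustering(points, initial_leaders, t, max_iter=100):
    """Illustrative sketch of Leader Clustering.

    points and initial_leaders are sequences of equal-length numeric tuples.
    Assumptions (not stated in the question): Euclidean distance, points
    visited in input order, empty clusters dropped, iteration cap max_iter.
    """
    leaders = [tuple(p) for p in initial_leaders]
    assignment = None
    for _ in range(max_iter):
        new_assignment = []
        for p in points:
            # Closest leader so far, including leaders created earlier in this pass.
            dists = [math.dist(p, c) for c in leaders]
            j = min(range(len(leaders)), key=dists.__getitem__)
            if dists[j] > t:
                # Too far from every leader: this point becomes a new leader.
                leaders.append(tuple(p))
                j = len(leaders) - 1
            new_assignment.append(j)
        if new_assignment == assignment:
            break  # assignments unchanged, so the process has converged
        # Recompute each leader as the mean of its cluster.
        members = {}
        for p, j in zip(points, new_assignment):
            members.setdefault(j, []).append(p)
        kept = sorted(members)  # assumption: clusters with no members are dropped
        relabel = {old: new for new, old in enumerate(kept)}
        leaders = [
            tuple(sum(coords) / len(members[j]) for coords in zip(*members[j]))
            for j in kept
        ]
        assignment = [relabel[j] for j in new_assignment]
    return leaders, assignment


# Hypothetical toy data: two tight groups plus one far-away point.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (50, 50)]
init = [(0, 0), (10, 0)]
for t in (0, 5, math.inf):
    leaders, assignment = leader_clustering(points, init, t)
    print(t, len(leaders), assignment)
```

With `t = math.inf` the threshold test never fires, so under these assumptions the loop reduces to standard k-means started from the same k centers. Smaller values of `t` can only add leaders during an assignment pass, which is a convenient way to experiment with parts a and b on small datasets.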
