Question: Prove that k-means will produce k clusters, all non empty, or: Give an example of a set D of data points (with no repeated data

Prove that k-means will produce k clusters, all non empty, or: Give an example of a set D of data points (with no repeated data point), a value for k (k<=n, where n is the number of data objects), and a set of k data points as initial seeds, such that some cluster becomes empty. A motivation for this problem: Many or most of you will become professional programmers. The programs you write are supposed to work all the time, not just 999 times out of 1000. The pseudocode for k-means in the textbook will fail if indeed one of the clusters becomes empty.

Question: is it safe to write the code for k-means as the textbook, or will code written like that get you into some trouble with your manager (not initially, but after a while)? If you decide to try to find a malevolent data you can try the following. Try a data file of 6 to 12 data points in the plane so that when k-means is performed to k = 3 or k = 4 clusters, in iteration 2 (or iteration 3 or ...), one of the clusters becomes empty. The initial seeds must be data points; and your example should not rely on ties to accomplish its goals. (Comment: this effort might be easier if you let most of the points be collinear.) Show the step-by-step process of the clustering. Finally, Propose a solution in case of an empty cluster.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

The DirectCast cable TV company is a national chain that services the small college town that is home to Tech, with a student population of almost 30,000. The company has generally been able to...

Either: Prove that k-means will produce k clusters, allnonempty, OR: Give anexample of a set D of data points (with no repeated data point), avalue for k(k

means-hierarchical.pdf Adobe Acrobat Reader DC dow Help Assign2-Clustering The-Morgan-Kauf- abhi 3/4 100% 3. Either: Prove that k-means will produce k clusters, all nonempty, or: Give an example of a...

Please answer this asap need to submit it before 23rd feb 3. Either: Prove that k-means will produce k clusters, all nonempty, or: Give an example of a set D of data points (with no repeated data...

[35 pts] (source: Boots) In the problem, you will apply K-Means to image segmentation. Write a program named imageSegmentation.py that reads in an image, segments that image using K-Means clustering...

CANMNMM January of this year. (a) Each item will be held in a record. Describe all the data structures that must refer to these records to implement the required functionality. Describe all the...

In a Hopfield neural network configured as an associative memory, with all of its weights trained and fixed, what three possible behaviours may occur over time in configuration space as the net...

Jones & Bartlett Learning, LLC. NOT FOR RESALE OR DISTRIBUTION CHAPTER Hot Spot Analysis 10 LEARNING OBJECTIVES C A R R Provide a working definition of a \"hot spot.\" , Be able to explain different...

ANSI-SPARC6 Programming Language Compilation Write notes on each of the following topics: (a) the implementation of labels and jumps in a recursive, block structured programming language [7 marks]...

I keep having a "file not found in directory error" even though i have the file saved where the .py file is saved. please show how i am able to fix this error. Clustering is a process of identifying...

Read the above study and give summary conclusions 2. METHODOLOGY 2.1. K-means Clustering At the beginning of this paper, the author introduces that the matches took place in different Balkan cities,...

Contrast the market-value method and the equity method.

2Read Case 12 on pages 2223 of the textbook and answer the three questions related to the case Also explain how management communication would have to change for Kathy if she were to take the company...

Which enryme of glycolysis comverts dihydrory acetone phosphate to ghceraldelyde - 3 - phosphate? Wone phowhube iomerne Numendine Inclase Madoluse

You are evaluating a potential purchase of several light-duty trucks. The initial cost of the trucks will be $170,000. The trucks fall in the MACRS 5-year class that allows depreciation of 20% the...

(Appendices) Why do readers of financial statements prefer the separate disclosure of gross sales revenue and sales returns and allowances to the disclosure of a single net sales revenue amount? LO61

(Appendices) Name the two allowance methods used to compute uncollectible account expense. For each method, how is uncollectible account expense computed and what does the balance of the allowance...

(Appendices) INCOME EFFECTS OF UNCOLLECTIBLE ACCOUNTS. The credit manager and the accountant for Goldsmith Company are attempting to assess the effect on net income of writing off $100,000 of...