2. [10 marks] The k-means algorithm is widely used in cluster analysis for its simplicity. Since...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
2. [10 marks] The k-means algorithm is widely used in cluster analysis for its simplicity. Since sample means can be severely affected by outliers, one may want to replace sample means with sample medians as cluster centers and thus obtain the following k-means algorithm based on medians for univariate data (perhaps we can call it the "k-medians algorithm"): Given k initial cluster centers c₁,...,Ck for a sample 21,...,n, repeat the following two steps until cluster centers do not change: (1) Calculate dij = ₁-C₁| for i=1,...,n and j = 1,..., k. Classify x; into cluster m, if dim is the smallest of dil...., dik. (2) For j = 1,...,k, set c; = median(x) for all x; in cluster j. Implement the "k-medians algorithm" in an R. function which, given a univariate sample and an initial set of cluster centers (both in vectors), computes and returns the final cluster centers. Apply it to the variable eruptions (for eruption times of the well-known Old Faithful Geyser) in the data set faithful in R, using the following initial cluster centers, respectively: (a) (2, 4); (b) (2, 3, 4); (c) (2,3,4,5) 2. [10 marks] The k-means algorithm is widely used in cluster analysis for its simplicity. Since sample means can be severely affected by outliers, one may want to replace sample means with sample medians as cluster centers and thus obtain the following k-means algorithm based on medians for univariate data (perhaps we can call it the "k-medians algorithm"): Given k initial cluster centers c₁,...,Ck for a sample 21,...,n, repeat the following two steps until cluster centers do not change: (1) Calculate dij = ₁-C₁| for i=1,...,n and j = 1,..., k. Classify x; into cluster m, if dim is the smallest of dil...., dik. (2) For j = 1,...,k, set c; = median(x) for all x; in cluster j. Implement the "k-medians algorithm" in an R. function which, given a univariate sample and an initial set of cluster centers (both in vectors), computes and returns the final cluster centers. Apply it to the variable eruptions (for eruption times of the well-known Old Faithful Geyser) in the data set faithful in R, using the following initial cluster centers, respectively: (a) (2, 4); (b) (2, 3, 4); (c) (2,3,4,5)
Expert Answer:
Answer rating: 100% (QA)
a Final centers are 19830 and 43415 See the plot below showing ... View the full answer
Related Book For
Intermediate Accounting
ISBN: 978-1260481952
10th edition
Authors: J. David Spiceland, James Sepe, Mark Nelson, Wayne Thomas
Posted Date:
Students also viewed these accounting questions
-
The Payback method is widely used in capital budgeting because is its simple and does a good job of determining the correct accept/reject decision. 1. True 2. False
-
The molecule n-octylglucoside, shown here, is widely used in biochemical research as a nonionic detergent for "solubilizing" large hydrophobic protein molecules. What characteristics of this molecule...
-
The Heaviside function: is widely used in engineering applications. (See figure.) To print an enlarged copy of the graph, go to MathGraphs.com. Sketch the graph of each function by hand. (a) H(x) 2...
-
Plainbank has $10 million in cash and equivalents, $30 million in loans, and $15 in core deposits. a. Calculate the financing gap. b. What is the financing requirement? c. How can the financing gap...
-
1. Find the gravitational force between the sun and Pluto. 2. Explain why the gravitational force between the sun and Jupiter is greater than the gravitational force between the sun and the earth...
-
Golding Inc. just finished its second month of operations. Golding mass-produces integrated circuits. The following production information is provided for December: Units in process, December 1, 80%...
-
Explain the two major types of measure used in conventional accounting and ecological accounting. When consideration is given to environmental issues in accounting, what are the two main groups of...
-
Shown on the next page are comparative income statements for McDonald's for 2005, 2006, and 2007. 1. There are two kinds of McDonald's restaurants'restaurants that McDonald's itself owns and...
-
(a) First perform a multiple regression with all the variables, what can you say about the significance of the variables based on only the p-values. Next use the "step" function to perform backward...
-
Consider a standard 52 card deck of playing cards. In total there are four cards that are Aces, four cards that are Kings, four cards that are Queens and four cards that are Jacks. The remaining 36...
-
A summary of cash flows for Pickerel Consulting Group for the year ended March 31, 2010, is shown below. Cash receipts: Cash received from customers . . . . . . . . . . . . . . . . . . . . . $239,100...
-
Dr. Jim performed an appendectomy on Alison Thursday morning. Everything went well and Alison seemed to be recovering as expected. Saturday morning, she woke with horrible pain and an x-ray revealed...
-
Kraft Heinz is closing its plant in St. Marys, Ontario, believed to employ some 200 people. In 2013, Heinz closed its century-old plant in Leamington, Ontario, laying off around 740 people. After...
-
Hotel capacity is limited in Noshuille so even a 2-star hotel can sell its standard room at the high price of $300. The hotel has 120 rooms and the distribution of the number of x of no shows is. a 3...
-
The following statement of financial position has been prepared for Kareena Beauty Sdn. Bhd for TWO (2) consecutive years. Statement of Financial Position as at 31 March 2014 and 31 March 2015 2015...
-
On December 1, 2023, after running out of Apple iPhones in all of its 50 Southern California stores, Metta, a district manager from Best Buy, telephones James, the regional director of sales from...
-
Entrepreneurs are always looking for unique opportunities to fill needs or wants.* A. TRUE B. FALSE The easiest step in the creative endeavor is the implementation and evaluation phase.* A. TRUE B....
-
Use critical values to test the null hypothesis H0: 1 2 = 20 versus the alternative hypothesis H0: 1 2 20 by setting a equal to .10, .05, .01, and .001. How much evidence is there that the...
-
Long-term obligations usually are reclassified and reported as current liabilities when they become payable within the upcoming year (or operating cycle, if longer than a year). So, a 25-year bond...
-
Johns Specialty Store uses a periodic inventory system. The following are some inventory transactions for the month of May: 1. Johns purchased merchandise on account for $5,000. Freight charges of...
-
Performance and profitability of a company often are evaluated using the financial information provided by a firm's annual report in comparison with other firms in the same industry. Ratios are...
-
How do you allocate requirements?
-
How do you flow down requirements?
-
What is requirements flow down?
Study smarter with the SolutionInn App