Amanda Boleyn, an entrepreneur who recently sold her start-up for a multi-million-dollar sum, is looking for...
Question:
Amanda Boleyn, an entrepreneur who recently sold her start-up for a multi-million-dollar sum, is looking for alternate investments for her newfound fortune. She is considering an investment in wine, similar to how some people invest in rare coins and fine art. To educate herself on the properties of fine wine, she has collected data on 13 different characteristics of 178 wines. Amanda has applied k-means clustering to this data for k = 1, ..., 10 and generated the following plot of total sums of squared deviations.

[Figure: Sum of WithinSS Over Number of Clusters. x-axis: Number of Clusters; y-axis: Sum of WithinSS; series: Sum(WithinSS) and Diff previous Sum(WithinSS).]

After analyzing this plot, Amanda generates summaries for k = 2, 3, and 4. Which value of k is the most appropriate to categorize these wines? Justify your choice with calculations.

k = 2

Inter-Cluster Distances
             Cluster 1   Cluster 2
Cluster 1    0           5.640
Cluster 2    5.640       0

Within-Cluster Summary
             Size   Average Distance
Cluster 1    87     4.003
Cluster 2    91     4.260
Total        178    4.134

k = 3

Inter-Cluster Distances
             Cluster 1   Cluster 2   Cluster 3
Cluster 1    0           5.147       6.078
Cluster 2    5.147       0           5.432
Cluster 3    6.078       5.432       0

Within-Cluster Summary
             Size   Average Distance
Cluster 1    62     3.355
Cluster 2    65     3.999
Cluster 3    51     3.483
Total        178    3.627

k = 4

Inter-Cluster Distances
             Cluster 1   Cluster 2   Cluster 3   Cluster 4
Cluster 1    0           5.255       6.070       4.853
Cluster 2    5.255       0           5.136       4.789
Cluster 3    6.070       5.136       0           6.074
Cluster 4    4.853       4.789       6.074       0

Within-Cluster Summary
             Size   Average Distance
Cluster 1    56     3.024
Cluster 2    45     3.490
Cluster 3    49     3.426
Cluster 4    28     4.580
Total        178    3.498

Do not round intermediate calculations. If required, round your answers to two decimal places.

k = 2
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance = ____
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance = ____
Average = ____

k = 3
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance = ____
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance = ____
Cluster 1 to Cluster 3 Distance / Cluster 1 Average Distance = ____
Cluster 3 to Cluster 1 Distance / Cluster 3 Average Distance = ____
Cluster 2 to Cluster 3 Distance / Cluster 2 Average Distance = ____
Cluster 3 to Cluster 2 Distance / Cluster 3 Average Distance = ____
Average = ____

k = 4
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance = ____
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance = ____
Cluster 1 to Cluster 3 Distance / Cluster 1 Average Distance = ____
Cluster 3 to Cluster 1 Distance / Cluster 3 Average Distance = ____
Cluster 1 to Cluster 4 Distance / Cluster 1 Average Distance = ____
Cluster 4 to Cluster 1 Distance / Cluster 4 Average Distance = ____
Cluster 2 to Cluster 3 Distance / Cluster 2 Average Distance = ____
Cluster 3 to Cluster 2 Distance / Cluster 3 Average Distance = ____
Cluster 2 to Cluster 4 Distance / Cluster 2 Average Distance = ____
Cluster 4 to Cluster 2 Distance / Cluster 4 Average Distance = ____
Cluster 3 to Cluster 4 Distance / Cluster 3 Average Distance = ____
Cluster 4 to Cluster 3 Distance / Cluster 4 Average Distance = ____
Average = ____

Based on the individual ratio values and the average ratio values for each value of k, it appears that [Select your answer] is the best clustering.
Expert Answer:
k = 2: Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance = 5.640 / 4.003 = 1.41. Cluster 2 to Cluster 1 D...
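The ratio calculation the question asks for can be sketched in Python. This is only an illustration built from the figures in the question's summary tables, not the textbook's worked solution. The names `inter_dist`, `avg_dist`, `separation_ratios`, and `average_ratio` are hypothetical and chosen here for readability.

```python
from itertools import combinations

# Inter-cluster distances, transcribed from the question's tables.
# inter_dist[k][(i, j)] is the distance between clusters i and j.
inter_dist = {
    2: {(1, 2): 5.640},
    3: {(1, 2): 5.147, (1, 3): 6.078, (2, 3): 5.432},
    4: {(1, 2): 5.255, (1, 3): 6.070, (1, 4): 4.853,
        (2, 3): 5.136, (2, 4): 4.789, (3, 4): 6.074},
}

# Average within-cluster distance for each cluster, same source.
avg_dist = {
    2: {1: 4.003, 2: 4.260},
    3: {1: 3.355, 2: 3.999, 3: 3.483},
    4: {1: 3.024, 2: 3.490, 3: 3.426, 4: 4.580},
}


def separation_ratios(k):
    """Return {(i, j): distance(i, j) / average distance of cluster i}."""
    ratios = {}
    for i, j in combinations(sorted(avg_dist[k]), 2):
        d = inter_dist[k][(i, j)]
        ratios[(i, j)] = d / avg_dist[k][i]  # "Cluster i to Cluster j"
        ratios[(j, i)] = d / avg_dist[k][j]  # "Cluster j to Cluster i"
    return ratios


def average_ratio(k):
    """Unrounded mean of all ordered-pair ratios for a given k."""
    ratios = separation_ratios(k)
    return sum(ratios.values()) / len(ratios)


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(f"k = {k}")
        for (i, j), r in sorted(separation_ratios(k).items()):
            print(f"  Cluster {i} to Cluster {j}: {r:.2f}")
        print(f"  Average = {average_ratio(k):.2f}")
```

With these inputs the average ratios come out to roughly 1.37 for k = 2, 1.55 for k = 3, and 1.51 for k = 4, so k = 3 scores highest on the average. For k = 4, both ratios that use Cluster 4's average distance against Clusters 1 and 2 fall close to 1, which suggests Cluster 4 is not well separated from them. Rounding is applied only when printing, in line with the instruction not to round intermediate calculations.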
Related Book For
Essentials Of Business Analytics
ISBN: 9781337406420
3rd Edition
Authors: Jeffrey D. Camm, James J. Cochran, Michael J. Fry, Jeffrey W. Ohlmann, David R. Anderson, Dennis J. Sweeney, Thomas A. Williams