Question: Part 3: Clustering: This part is concerned with the file: /DataMining/data/arff/UCI/credit-g.

Part 3: Clustering

This part is concerned with the file /DataMining/data/arff/UCI/credit-g.arff.

Clustering of the credit-g data of part 1.

For this part use only the attributes duration, age, credit amount and job. The aim is to determine the number of clusters in the data and assess whether any of the clusters are meaningful.

1. Run the K-means clustering algorithm on this data for the following values of K: 1, 2, 3, 4, 5, 10, 20. Analyse the resulting clusters. What do you conclude? Provide your reasoning.

2. Choose a value of K and run the algorithm with different seeds. What is the effect of changing the seed? Provide your explanation.

3. Run the EM algorithm on this data with the default parameters and describe the output and your analysis.

4. The EM algorithm can be quite sensitive to whether the data is normalized or not. Use the Weka normalize filter (Preprocess --> Filter --> unsupervised --> normalize) to normalize the numeric attributes. What difference does this make to the clustering runs? Provide your reasoning.

5. The EM algorithm can be quite sensitive to the values of minLogLikelihoodImprovementCV, minStdDev and minLogLikelihoodImprovementIterating. Explore the effect of changing these values. What do you conclude?

6. How many clusters do you think are in the data? Give a plain English description of one of them.

7. Compare the use of K-means and EM for these clustering tasks. Which do you think is best? Why?

8. What golden nuggets did you find, if any?

Report length: up to one page.
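The K-means experiments in questions 1 and 2 can also be explored outside the Weka GUI. The following is a minimal sketch in plain Python, not Weka's own implementation. It runs a basic K-means (Lloyd's algorithm) for the listed values of K and for several seeds, and prints the within-cluster sum of squared errors for each run. Everything here is an assumption made for illustration: the function names are hypothetical, the rows are randomly generated placeholders rather than the real credit-g data, and job is encoded as an integer purely so that a distance can be computed.

```python
import random


def make_placeholder_data(n=200, seed=0):
    """Made-up rows [duration, age, credit_amount, job]; NOT the real credit-g data."""
    rng = random.Random(seed)
    return [
        [rng.randint(4, 72), rng.randint(19, 75),
         rng.randint(250, 18000), rng.randint(0, 3)]
        for _ in range(n)
    ]


def min_max_normalize(rows):
    """Rescale every column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(rows, k, seed, max_iter=100):
    """Basic Lloyd's algorithm; returns (centroids, within-cluster SSE)."""
    rng = random.Random(seed)
    # The seed only controls which rows are picked as starting centroids.
    centroids = [list(r) for r in rng.sample(rows, k)]
    for _ in range(max_iter):
        # Assignment step: attach each row to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for row in rows:
            nearest = min(range(k), key=lambda j: sq_dist(row, centroids[j]))
            clusters[nearest].append(row)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(min(sq_dist(row, c) for c in centroids) for row in rows)
    return centroids, sse


data = min_max_normalize(make_placeholder_data())

# Question 1: how does the error change as K grows?
sse_by_k = {k: k_means(data, k, seed=1)[1] for k in (1, 2, 3, 4, 5, 10, 20)}
for k, sse in sse_by_k.items():
    print(f"K={k:2d}  SSE={sse:.3f}")

# Question 2: same K, different seeds.
sse_by_seed = {s: k_means(data, 4, seed=s)[1] for s in range(5)}
for s, sse in sse_by_seed.items():
    print(f"K=4 seed={s}  SSE={sse:.3f}")
```

Because the error can only go down as K increases, a falling SSE does not by itself indicate meaningful clusters; it is more informative to look for the value of K after which the improvement levels off. Differences between seeds for a fixed K suggest that the algorithm is settling in different local optima.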
