Question:

1. Textbook (Data Mining Concepts and Techniques), Page 34, 1.3.

Define each of the following data mining functionalities: characterization, discrimination, association and correlation analysis, classification, regression, clustering, and outlier analysis. Give examples of each data mining functionality, using a real-life database that you are familiar with. Use your own words to describe the examples in detail.

2. Textbook (Data Mining Concepts and Techniques), Page 81, 2.6

Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):

(a) Compute the Euclidean distance between the two objects.

(b) Compute the Manhattan distance between the two objects.

(c) Compute the Minkowski distance between the two objects, using q = 3.

(d) Compute the supremum distance between the two objects.
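The four distances in parts (a) to (d) are all instances of the Minkowski distance, so one small function covers them. The following is a minimal sketch, assuming the objects are plain numeric tuples; the function names `minkowski` and `supremum` are illustrative and do not come from the textbook.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length tuples."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1 / q)


def supremum(x, y):
    """Supremum (Chebyshev) distance: the largest per-attribute difference."""
    return max(abs(a - b) for a, b in zip(x, y))


x = (22, 1, 42, 10)
y = (20, 0, 36, 8)

euclidean = minkowski(x, y, 2)   # sqrt(4 + 1 + 36 + 4) = sqrt(45), about 6.708
manhattan = minkowski(x, y, 1)   # 2 + 1 + 6 + 2 = 11
minkowski_3 = minkowski(x, y, 3) # (8 + 1 + 216 + 8) ** (1/3), about 6.153
sup = supremum(x, y)             # max(2, 1, 6, 2) = 6

print(euclidean, manhattan, minkowski_3, sup)
```

The supremum distance is the limit of the Minkowski distance as q grows, which is why it is computed with `max` rather than with a power and a root.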

3. Textbook (Data Mining Concepts and Techniques), Page 121, 3.7

Using the data for age given in Exercise 3.3, answer the following:

(a) Use min-max normalization to transform the value 35 for age onto the range [0.0, 1.0].

(b) Use z-score normalization to transform the value 35 for age, where the standard deviation of age is 12.94 years.

(c) Use normalization by decimal scaling to transform the value 35 for age.

(d) Comment on which method you would prefer to use for the given data, giving reasons as to why.

(f) Use z-score normalization with the mean absolute deviation instead of the standard deviation to transform the value 35 for age. Compare the two methods (z-score normalization and z-score normalization using mean absolute deviation).

Hint: z-score normalization using mean absolute deviation is defined on pages 114 and 115, equations 3.10 and 3.11.
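The four normalization methods named in this question can be sketched as short functions. The age values of Exercise 3.3 are not reproduced in this question, so the list `ages` below is a hypothetical placeholder used only to make the sketch runnable; substitute the real data to obtain the exercise's answers. The function names are illustrative, and the use of the population standard deviation (`pstdev`) is an assumption.

```python
from statistics import mean, pstdev


def min_max(v, data, new_min=0.0, new_max=1.0):
    """Min-max normalization of v onto [new_min, new_max]."""
    lo, hi = min(data), max(data)
    return (v - lo) / (hi - lo) * (new_max - new_min) + new_min


def z_score(v, data, std=None):
    """z-score normalization; pass std to use a given standard deviation."""
    sigma = pstdev(data) if std is None else std
    return (v - mean(data)) / sigma


def decimal_scaling(v, data):
    """Divide by the smallest power of 10 that brings every |value| below 1."""
    max_abs = max(abs(d) for d in data)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return v / 10 ** j


def z_score_mad(v, data):
    """z-score variant that divides by the mean absolute deviation."""
    m = mean(data)
    s = mean([abs(d - m) for d in data])
    return (v - m) / s


# Hypothetical placeholder values, NOT the data of Exercise 3.3.
ages = [20, 30, 35, 40, 70]

mm = min_max(35, ages)          # (35 - 20) / (70 - 20) = 0.3
zs = z_score(35, ages)          # (35 - 39) / sqrt(284)
ds = decimal_scaling(35, ages)  # 35 / 100 = 0.35
zm = z_score_mad(35, ages)      # (35 - 39) / 12.8 = -0.3125

print(mm, zs, ds, zm)
```

For part (b), the given standard deviation can be supplied directly, for example `z_score(35, real_ages, std=12.94)`, where `real_ages` stands for the Exercise 3.3 data. The mean absolute deviation does not square the differences from the mean, so an outlier such as the 70 in the placeholder list inflates it less than it inflates the standard deviation.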

4. Textbook (Data Mining Concepts and Techniques), Page 273, 6.6

A database has five transactions. Let min_sup = 60% and min_conf = 80%.

TID     items bought
T100    M, O, N, K, E, Y
T200    D, O, N, K, E, Y
T300    M, A, K, E
T400    M, U, C, K, Y
T500    C, O, O, K, I, E

(a) Find all frequent itemsets using Apriori and FP-growth, respectively. Compare the efficiency of the two mining processes.

(b) List all the strong association rules (with support s and confidence c) matching the following metarule, where X is a variable representing customers, and item_i denotes variables representing items (e.g., "A," "B,"):

$\forall X \in \text{transaction},\ \text{buys}(X, \text{item}_1) \land \text{buys}(X, \text{item}_2) \Rightarrow \text{buys}(X, \text{item}_3) \quad [s, c]$
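Both parts can be checked against a small Apriori sketch over the five transactions above. It is a minimal illustration, not an optimized implementation, and it does not implement FP-growth; the function names `support_count`, `apriori`, and `metarule_rules` are illustrative. Each transaction is treated as a set, so the repeated O in T500 counts once.

```python
from itertools import combinations

# Each transaction as a set of single-letter items (duplicates collapse).
TRANSACTIONS = {
    "T100": set("MONKEY"),
    "T200": set("DONKEY"),
    "T300": set("MAKE"),
    "T400": set("MUCKY"),
    "T500": set("COOKIE"),
}
MIN_SUP = 0.6
MIN_CONF = 0.8


def support_count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for items in TRANSACTIONS.values() if itemset <= items)


def apriori(min_sup):
    """Return {frequent itemset: support count}, built level by level."""
    n = len(TRANSACTIONS)
    items = sorted(set().union(*TRANSACTIONS.values()))
    level = [frozenset([i]) for i in items
             if support_count(frozenset([i])) / n >= min_sup]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support_count(itemset)
        k += 1
        # Join step: unite frequent (k-1)-itemsets that differ in one item.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent; then check support.
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                 and support_count(c) / n >= min_sup]
    return frequent


def metarule_rules(frequent, min_conf):
    """Strong rules of the form item1 AND item2 => item3."""
    n = len(TRANSACTIONS)
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) != 3:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            confidence = count / frequent[antecedent]
            if confidence >= min_conf:
                rules.append((tuple(sorted(antecedent)), consequent,
                              count / n, confidence))
    return sorted(rules)


frequent = apriori(MIN_SUP)
rules = metarule_rules(frequent, MIN_CONF)
for antecedent, consequent, s, c in rules:
    print(antecedent, "=>", consequent, f"[s={s:.0%}, c={c:.0%}]")
```

With these thresholds the sketch finds 11 frequent itemsets, the only frequent 3-itemset being {O, K, E}, and two rules that satisfy the metarule: O, K => E and O, E => K, each with s = 60% and c = 100%. The third candidate, K, E => O, has confidence 3/4 = 75% and is rejected.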

5. Given the training examples below, implement classification using frequent patterns to determine PlayTennis when Outlook = Rainy, Temperature = Hot, Humidity = High, Wind = Weak.

Day   Outlook    Temperature   Humidity   Wind     PlayTennis
D1    Sunny      Hot           High       Weak     No
D2    Sunny      Hot           High       Strong   No
D3    Overcast   Hot           High       Weak     Yes
D4    Rainy      Mild          High       Weak     Yes
D5    Rainy      Cool          Normal     Weak     Yes
D6    Rainy      Cool          Normal     Strong   No
D7    Overcast   Cool          Normal     Strong   Yes
D8    Sunny      Mild          High       Weak     No
D9    Sunny      Cool          Normal     Weak     Yes
D10   Rainy      Mild          Normal     Weak     Yes
D11   Sunny      Mild          Normal     Strong   Yes
D12   Overcast   Mild          High       Strong   Yes
D13   Overcast   Hot           Normal     Weak     Yes
D14   Rainy      Mild          High       Strong   No
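One possible way to approach this question is associative classification: mine class association rules of the form (attribute = value, ...) => PlayTennis from the table, then label the query with the best rule whose antecedent it satisfies. The sketch below is a simplified illustration of that idea, not a specific published algorithm. The thresholds `min_sup=0.2` and `min_conf=0.7` are assumptions, since the question gives none, and the query's Outlook value is taken to be the table's Rainy category. Patterns are enumerated exhaustively, which is feasible only because there are four attributes.

```python
from collections import Counter
from itertools import combinations

ATTRS = ("Outlook", "Temperature", "Humidity", "Wind")
ROWS = [  # D1 .. D14, last field is PlayTennis
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Strong", "No"),
]


def mine_rules(rows, min_sup=0.2, min_conf=0.7):
    """Class association rules, best first (confidence, then support)."""
    n = len(rows)
    pattern_counts = Counter()  # count of each attribute-value pattern
    rule_counts = Counter()     # count of each (pattern, class) pair
    for *values, label in rows:
        items = tuple(zip(ATTRS, values))
        for k in range(1, len(items) + 1):
            for pattern in combinations(items, k):
                pattern_counts[pattern] += 1
                rule_counts[(pattern, label)] += 1
    rules = []
    for (pattern, label), count in rule_counts.items():
        support = count / n
        confidence = count / pattern_counts[pattern]
        if support >= min_sup and confidence >= min_conf:
            rules.append((confidence, support, pattern, label))
    return sorted(rules, key=lambda r: (r[0], r[1]), reverse=True)


def classify(rules, query, default):
    """Return (label, antecedent, confidence) of the best matching rule."""
    query_items = set(query.items())
    for confidence, support, pattern, label in rules:
        if set(pattern) <= query_items:
            return label, pattern, confidence
    return default, (), None  # no rule fires: fall back to the default class


rules = mine_rules(ROWS)
majority = Counter(row[-1] for row in ROWS).most_common(1)[0][0]
query = {"Outlook": "Rainy", "Temperature": "Hot",
         "Humidity": "High", "Wind": "Weak"}
label, antecedent, confidence = classify(rules, query, majority)
print(label, antecedent, confidence)
```

Under these assumed thresholds the query is matched by the rule Outlook = Rainy, Wind = Weak => PlayTennis = Yes, which holds in D4, D5, and D10 (support 3/14, confidence 100%), so the sketch predicts Yes. Different thresholds or a different rule-ranking scheme could select another rule.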
