Question:
1. Textbook (Data Mining Concepts and Techniques), Page 34, 1.3.
Define each of the following data mining functionalities: characterization, discrimination,
association and correlation analysis, classification, regression, clustering, and outlier analysis. Give
examples of each data mining functionality, using a real-life database that you are familiar with.
Use your own words to describe the examples in detail.
2. Textbook (Data Mining Concepts and Techniques), Page 81, 2.6
Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):
(a) Compute the Euclidean distance between the two objects.
(b) Compute the Manhattan distance between the two objects.
(c) Compute the Minkowski distance between the two objects, using q = 3.
(d) Compute the supremum distance between the two objects.
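One way to check hand calculations for parts (a) through (d) is a short script. The sketch below relies on the standard fact that Manhattan and Euclidean distance are the Minkowski distance with q = 1 and q = 2, and that the supremum distance is the largest per-attribute difference. The function names are illustrative and do not come from the textbook.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length tuples."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def supremum(x, y):
    """Supremum (Chebyshev) distance: the largest per-attribute difference."""
    return max(abs(a - b) for a, b in zip(x, y))


# The two objects given in the exercise.
x = (22, 1, 42, 10)
y = (20, 0, 36, 8)

euclidean = minkowski(x, y, 2)    # part (a)
manhattan = minkowski(x, y, 1)    # part (b)
minkowski_3 = minkowski(x, y, 3)  # part (c)
sup = supremum(x, y)              # part (d)

print(f"Euclidean:       {euclidean:.4f}")
print(f"Manhattan:       {manhattan:.4f}")
print(f"Minkowski (q=3): {minkowski_3:.4f}")
print(f"Supremum:        {sup}")
```

Because q grows from 1 to infinity across the four parts, the results should come out in non-increasing order, which is a quick sanity check on a manual answer.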
3. Textbook (Data Mining Concepts and Techniques), Page 121, 3.7
Using the data for age given in Exercise 3.3, answer the following:
(a) Use min-max normalization to transform the value 35 for age onto the range [0.0, 1.0].
(b) Use z-score normalization to transform the value 35 for age, where the standard deviation of
age is 12.94 years.
(c) Use normalization by decimal scaling to transform the value 35 for age.
(d) Comment on which method you would prefer to use for the given data, giving reasons as to why.
(f) Use z-score normalization by the mean absolute deviation instead of the standard deviation to
transform the value 35 for age. Comment on how the two methods compare (z-score normalization
and z-score normalization using mean absolute deviation).
Hint: z-score normalization using mean absolute deviation is defined on pages 114 and 115,
equations 3.10 and 3.11.
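The four normalization methods in this exercise can be sketched as small functions. Note that the age values are not part of this question text: the list below is recalled from Exercise 3.3 of the textbook and should be verified against your own copy before relying on any result. The function names are illustrative, and the standard deviation of 12.94 is the value supplied in part (b).

```python
# ASSUMPTION: age data recalled from Exercise 3.3; verify against the textbook.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def min_max(v, values, new_min=0.0, new_max=1.0):
    """Linearly map v from [min(values), max(values)] onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return (v - lo) / (hi - lo) * (new_max - new_min) + new_min


def z_score(v, values, spread):
    """Center v on the mean and scale by the supplied spread measure."""
    mean = sum(values) / len(values)
    return (v - mean) / spread


def decimal_scaling(v, values):
    """Divide by the smallest power of 10 that brings every |value| below 1."""
    max_abs = max(abs(x) for x in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return v / 10 ** j


def mean_abs_deviation(values):
    """Mean absolute deviation: the average of |x - mean|."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


v = 35
z_std = z_score(v, ages, 12.94)                     # part (b), given std
z_mad = z_score(v, ages, mean_abs_deviation(ages))  # part (f)

print("min-max:        ", round(min_max(v, ages), 4))
print("z-score (std):  ", round(z_std, 4))
print("decimal scaling:", decimal_scaling(v, ages))
print("z-score (MAD):  ", round(z_mad, 4))
```

The mean absolute deviation does not square the deviations, so a single extreme age inflates it less than it inflates the standard deviation. That is the property to keep in mind when comparing the two z-score variants.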
4. Textbook (Data Mining Concepts and Techniques), Page 273, 6.6
A database has five transactions. Let min_sup = 60% and min_conf = 80%.
TID    Items bought
T100   {M, O, N, K, E, Y}
T200   {D, O, N, K, E, Y}
T300   {M, A, K, E}
T400   {M, U, C, K, Y}
T500   {C, O, O, K, I, E}
(a) Find all frequent itemsets using Apriori and FP-growth, respectively. Compare the efficiency
of the two mining processes.
(b) List all the strong association rules (with support s and confidence c) matching the following
metarule, where X is a variable representing customers, and $item_i$ denotes variables
representing items (e.g., "A," "B,"):
$\forall X \in \text{transaction},\ \text{buys}(X, item_1) \wedge \text{buys}(X, item_2) \Rightarrow \text{buys}(X, item_3) \quad [s, c]$
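For the Apriori half of part (a) and the metarule in part (b), a level-wise search can be sketched as follows. This is a minimal illustration rather than the textbook's pseudocode: it covers only Apriori (FP-growth would need a separate tree structure), it treats each transaction as a set so the repeated O in T500 counts once, and the function names `apriori` and `rules_two_to_one` are made up for this example.

```python
from itertools import combinations

transactions = {
    "T100": set("MONKEY"),
    "T200": set("DONKEY"),
    "T300": set("MAKE"),
    "T400": set("MUCKY"),
    "T500": set("COOKIE"),  # the duplicate O collapses inside a set
}
min_sup = 0.6
min_conf = 0.8


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def apriori():
    """Return {frequent itemset: support}, built level by level."""
    items = sorted(set().union(*transactions.values()))
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_sup]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must already be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))}
        level = [c for c in candidates if support(c) >= min_sup]
    return frequent


def rules_two_to_one(frequent):
    """Strong rules of the form item1 AND item2 => item3, as required by the metarule."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) != 3:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            conf = sup / frequent[antecedent]
            if conf >= min_conf:
                rules.append((tuple(sorted(antecedent)), consequent, sup, conf))
    return sorted(rules)


frequent = apriori()
rules = rules_two_to_one(frequent)

for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"support={sup:.0%}")
for antecedent, consequent, sup, conf in rules:
    print(f"{antecedent[0]} ^ {antecedent[1]} => {consequent}  [s={sup:.0%}, c={conf:.0%}]")
```

Each call to `support` rescans all transactions, which is the repeated-scan cost that the efficiency comparison with FP-growth in part (a) is asking about.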
5. Given the training examples below, implement classification using frequent patterns to determine
PlayTennis = ? when Outlook = Rain; Temperature = Hot; Humidity = High; Wind = Weak.
Day   Outlook    Temperature   Humidity   Wind     PlayTennis
D1    Sunny      Hot           High       Weak     No
D2    Sunny      Hot           High       Strong   No
D3    Overcast   Hot           High       Weak     Yes
D4    Rainy      Mild          High       Weak     Yes
D5    Rainy      Cool          Normal     Weak     Yes
D6    Rainy      Cool          Normal     Strong   No
D7    Overcast   Cool          Normal     Strong   Yes
D8    Sunny      Mild          High       Weak     No
D9    Sunny      Cool          Normal     Weak     Yes
D10   Rainy      Mild          Normal     Weak     Yes
D11   Sunny      Mild          Normal     Strong   Yes
D12   Overcast   Mild          High       Strong   Yes
D13   Overcast   Hot           Normal     Weak     Yes
D14   Rainy      Mild          High       Strong   No
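One possible reading of "classification using frequent patterns" is associative classification: mine rules of the form (attribute = value, ...) => class, then label the query with the best matching rule. The sketch below makes several assumptions that the question does not state: a minimum support count of 2, a minimum confidence of 0.8, a rule ranking by confidence, then support, then shorter antecedent, and that "Rain" in the query means the table value "Rainy". It enumerates all attribute-value combinations exhaustively, which is feasible only because the table is tiny.

```python
from itertools import combinations

ATTRS = ("Outlook", "Temperature", "Humidity", "Wind")
ROWS = [  # D1..D14 from the table; the last field is PlayTennis
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Strong", "No"),
]
MIN_SUP_COUNT = 2  # ASSUMPTION: threshold not given in the question
MIN_CONF = 0.8     # ASSUMPTION: threshold not given in the question


def mine_rules(rows):
    """Return rules as (antecedent, label, support_count, confidence)."""
    cover = {}  # antecedent -> rows containing it
    joint = {}  # (antecedent, label) -> rows containing it with that label
    for *values, label in rows:
        items = list(zip(ATTRS, values))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                ante = frozenset(combo)
                cover[ante] = cover.get(ante, 0) + 1
                joint[(ante, label)] = joint.get((ante, label), 0) + 1
    rules = []
    for (ante, label), count in joint.items():
        conf = count / cover[ante]
        if count >= MIN_SUP_COUNT and conf >= MIN_CONF:
            rules.append((ante, label, count, conf))
    return rules


def classify(query, rules):
    """Label the query with the best rule whose antecedent it satisfies."""
    items = set(query.items())
    matching = [r for r in rules if r[0] <= items]
    if not matching:
        return None  # no rule fires; a default class would be needed
    best = max(matching, key=lambda r: (r[3], r[2], -len(r[0])))
    return best[1]


# ASSUMPTION: "Rain" in the question is the table value "Rainy".
query = {"Outlook": "Rainy", "Temperature": "Hot",
         "Humidity": "High", "Wind": "Weak"}
prediction = classify(query, mine_rules(ROWS))
print("PlayTennis =", prediction)
```

Different thresholds or a different rule ranking can change which rule fires, so any submitted answer should state the thresholds it used.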
