(a) Consider the data set shown in Table 7.4. Suppose we apply the following discretization strategies to...
Question:
D1: Partition the range of each continuous attribute into 3 equal-sized bins.
D2: Partition the range of each continuous attribute into 3 bins, where each bin contains an equal number of transactions.
For each strategy, answer the following questions:
i. Construct a binarized version of the data set.
ii. Derive all the frequent itemsets having support ≥ 30%.
Table 7.4. Data set for Exercise 2.
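The difference between D1 (equal-width bins) and D2 (equal-frequency bins), and the binarization step in part i, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the sample values are hypothetical, not the contents of Table 7.4, which is not reproduced here.

```python
def equal_width_bins(values, k=3):
    """D1: split [min, max] into k intervals of equal width; return a bin index per value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)  # all values identical: everything falls in one bin
    # The maximum value would compute to index k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k=3):
    """D2: sort the values and cut them into k groups of (nearly) equal size.

    Assumption: tied values may end up in different bins; a real solution
    would have to decide how to treat ties.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def binarize(bins, k=3):
    """Turn bin indices into one asymmetric binary column per bin."""
    return [[1 if b == j else 0 for j in range(k)] for b in bins]


# Hypothetical readings (NOT the values from Table 7.4).
sample = [10, 12, 14, 20, 22, 40, 70, 80, 100]
d1 = equal_width_bins(sample)
d2 = equal_frequency_bins(sample)
d1_binary = binarize(d1)
```

On skewed data like this sample, equal-width binning crowds most rows into the first bin, while equal-frequency binning spreads them evenly, which is why the two strategies can lead to different frequent itemsets.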
(b) The continuous attribute can also be discretized using a clustering approach.
i. Plot a graph of temperature versus pressure for the data points shown in Table 7.4.
ii. How many natural clusters do you observe from the graph? Assign a label (C1, C2, etc.) to each cluster in the graph.
iii. What type of clustering algorithm do you think can be used to identify the clusters? State your reasons clearly.
v. Derive all the frequent itemsets having support ≥ 30% from the binarized data.
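For a data set this small, the frequent itemsets can be checked by brute force once the data is binarized. The sketch below is one possible way to do that, not the textbook's solution: the function name, the item labels, and the five transactions are all hypothetical, and the support threshold is passed as an integer percentage to avoid floating-point comparisons.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support_pct=30):
    """Return {itemset: count} for every itemset whose support is >= min_support_pct percent."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count * 100 >= min_support_pct * n:
                result[candidate] = count
                found = True
        if not found:
            break  # support is anti-monotone: no larger itemset can be frequent
    return result


# Hypothetical binarized transactions (NOT derived from Table 7.4).
example = [
    {"Temp=low", "Alarm1"},
    {"Temp=low", "Alarm1", "Alarm2"},
    {"Temp=high", "Alarm2"},
    {"Temp=high", "Alarm3"},
    {"Temp=low", "Alarm1"},
]
frequent = frequent_itemsets(example)
```

With five transactions, a support of 30% means an itemset must appear in at least two of them. Enumerating every candidate is exponential in the number of items, so this is only practical for exercise-sized data; Apriori-style pruning is the usual choice for anything larger.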
Related Book For
Introduction to Data Mining
ISBN: 978-0321321367
1st edition
Authors: Pang-Ning Tan, Michael Steinbach, Vipin Kumar