Question: (a) Consider the data set shown in Table 7.4. Suppose we apply the following discretization strategies to the continuous attributes of the data set.
D1: Partition the range of each continuous attribute into 3 equal-sized bins.
D2: Partition the range of each continuous attribute into 3 bins, where each bin contains an equal number of transactions.
For each strategy, answer the following questions:
i. Construct a binarized version of the data set.
ii. Derive all the frequent itemsets having support ≥ 30%.
Table 7.4. Data set for Exercise 2.
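The two strategies in part (a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the textbook's solution: the function names are hypothetical, and the sample values are the nine four-digit readings that survive in the extracted table text, assumed here to be the Pressure column.

```python
from bisect import bisect_right

def equal_width_edges(values, k=3):
    """D1: interior cut points that split [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def assign_by_edges(values, edges):
    """Bin index (0..k-1) of each value; a value equal to an edge goes to the upper bin."""
    return [bisect_right(edges, v) for v in values]

def equal_frequency_bins(values, k=3):
    """D2: bin index per value so that each bin holds about len(values)/k values.

    Ties are split by sort order, which is one of several possible conventions.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * k // len(values)
    return bins

def one_hot(bins, k=3):
    """Binarize: one 0/1 column per bin."""
    return [[1 if b == j else 0 for j in range(k)] for b in bins]

# Assumed Pressure readings (see the note above); not verified against Table 7.4.
pressure = [1105, 1040, 1090, 1084, 1038, 1080, 1025, 1030, 1100]

d1_bins = assign_by_edges(pressure, equal_width_edges(pressure))
d2_bins = equal_frequency_bins(pressure)
d1_binary = one_hot(d1_bins)
d2_binary = one_hot(d2_bins)
```

With these assumed values the equal-width split leaves the middle bin empty, while the equal-frequency split puts three readings in each bin, which is the practical difference between D1 and D2 that the exercise is probing.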
(b) The continuous attribute can also be discretized using a clustering approach.
i. Plot a graph of temperature versus pressure for the data points shown in Table 7.4.
ii. How many natural clusters do you observe from the graph? Assign a label (C1, C2, etc.) to each cluster in the graph.
iii. What type of clustering algorithm do you think can be used to identify the clusters? State your reasons clearly.
v. Derive all the frequent itemsets having support ≥ 30% from the binarized data.
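Once the data is binarized, the frequent itemsets with support ≥ 30% can be enumerated directly, since the data set is small. The sketch below is a brute-force, level-by-level search rather than a full Apriori implementation; the function name and the toy transactions are hypothetical and are not the rows of Table 7.4.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support=0.3):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            # Support = fraction of transactions containing every item in combo.
            count = sum(1 for t in transactions if set(combo) <= t)
            if count / n >= min_support:
                result[combo] = count / n
                found = True
        if not found:
            break  # no frequent itemset of this size, so none of a larger size
    return result

# Hypothetical binarized transactions: each set lists the items that are 1 in a row.
toy = [
    {"Alarm1", "Alarm3"},
    {"Alarm1", "Alarm2"},
    {"Alarm1", "Alarm2", "Alarm3"},
    {"Alarm2"},
    {"Alarm1", "Alarm3"},
]

result = frequent_itemsets(toy, min_support=0.3)
```

For the real exercise, each transaction would contain the interval items produced by the discretization (for example one item per temperature bin and pressure bin) together with the alarm items that are set.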
Table 7.4 (partially recovered). Columns: TID, Temperature, Pressure, Alarm 1, Alarm 2, Alarm 3. Surviving values: 1105 1040 1090 1084 1038 1080 1025 1030 1100 103 100 2 101
Step by Step Solution
Table 7.5 shows the discretized data using D1, where the discretized intervals are X1 Temperature betw...
