Question: Consider the data set shown in Table 7.13. Suppose we are interested in extracting the following association rule:
{α1 ≤ Age ≤ α2, Play Piano = Yes} → {Enjoy Classical Music = Yes}
Table 7.13. Data set for Exercise 6.
To handle the continuous attribute, we apply the equal-frequency approach with 3, 4, and 6 intervals. Categorical attributes are handled by introducing as many new asymmetric binary attributes as the number of categorical values. Assume that the support threshold is 10% and the confidence threshold is 70%.
(a) Suppose we discretize the Age attribute into 3 equal-frequency intervals. Find a pair of values for α1 and α2 that satisfy the minimum support and minimum confidence requirements.
(b) Repeat part (a) by discretizing the Age attribute into 4 equal-frequency intervals. Compare the extracted rules against the ones you had obtained in part (a).
(c) Repeat part (a) by discretizing the Age attribute into 6 equal-frequency intervals. Compare the extracted rules against the ones you had obtained in part (a).
(d) From the results in parts (a), (b), and (c), discuss how the choice of discretization intervals will affect the rules extracted by association rule mining algorithms.
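For reference, the two thresholds in the question are usually measured with the standard definitions of support and confidence. The symbols here are not part of the original question: N is assumed to be the number of records and σ(·) the number of records that contain an itemset.

```latex
s(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{N}, \qquad
c(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}
```

Assuming the table holds 12 records (as its 24 Yes/No entries suggest), a rule covering a single record has support 1/12 ≈ 8.3%, which falls short of the 10% threshold, so a qualifying rule must cover at least 2 records.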
Columns: Age | Play Piano | Enjoy Classical Music
Play Piano: Yes, Yes, Yes, Yes, Yes, No, No, Yes, No, No, No, No
Enjoy Classical Music: Yes, Yes, No, No, Yes, No, No, Yes, No, Yes, No, Yes
Age: 14, 19, 21, 29, 39, 41, 47
Step by Step Solution
(a) α1 = 19, α2 = 29: s = 16.7%, c = 100%.
(b) No rule satisfies the support and con...
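The support and confidence figures can be checked with a short script. The following is a minimal sketch, not the method used by the original answer. It rests on several assumptions: the 24 Yes/No entries in the table are read as two columns of 12 records each; the five ages 9, 11, 17, 25, and 33 are placeholders, because only seven Age values appear in the table here; and only single intervals are tested, not unions of adjacent intervals. The names `RECORDS`, `equal_frequency_bins`, `rule_stats`, and `qualifying_rules` are hypothetical.

```python
from fractions import Fraction

# Records as (age, plays_piano, enjoys_classical_music).
# ASSUMPTION: ages 9, 11, 17, 25 and 33 are placeholders; the table as
# extracted lists only 14, 19, 21, 29, 39, 41 and 47.
RECORDS = [
    (9, True, True), (11, True, True), (14, True, False), (17, True, False),
    (19, True, True), (21, False, False), (25, False, False), (29, True, True),
    (33, False, False), (39, False, True), (41, False, False), (47, False, True),
]


def equal_frequency_bins(ages, k):
    """Split the sorted ages into k equally sized bins; return (low, high) pairs."""
    ordered = sorted(ages)
    size = len(ordered) // k  # assumes len(ages) is divisible by k
    return [(ordered[i * size], ordered[(i + 1) * size - 1]) for i in range(k)]


def rule_stats(records, low, high):
    """Support and confidence of {low <= Age <= high, Piano=Yes} -> {Enjoy=Yes}."""
    antecedent = [r for r in records if low <= r[0] <= high and r[1]]
    both = [r for r in antecedent if r[2]]
    support = Fraction(len(both), len(records))
    # Confidence is undefined when no record matches the antecedent.
    confidence = Fraction(len(both), len(antecedent)) if antecedent else None
    return support, confidence


def qualifying_rules(records, k, min_sup=Fraction(1, 10), min_conf=Fraction(7, 10)):
    """Return (low, high, support, confidence) for each bin that meets both thresholds."""
    rules = []
    for low, high in equal_frequency_bins([r[0] for r in records], k):
        sup, conf = rule_stats(records, low, high)
        if conf is not None and sup >= min_sup and conf >= min_conf:
            rules.append((low, high, sup, conf))
    return rules


if __name__ == "__main__":
    for k in (3, 4, 6):
        print(k, qualifying_rules(RECORDS, k))
```

Under these assumptions, 3 intervals yield a single qualifying rule with α1 = 19 and α2 = 29 (support 2/12, confidence 2/2), and 4 intervals yield none. The result for 6 intervals depends on the placeholder ages and should be rechecked against the full table.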
