Given a data set with five transactions, each containing five items, as shown in the table.

TID items_bought
T1 {A, H, K, T, X}
T2 {A, H, X, T, Z}
T3 {A, B, D, R, S}
T4 {B, H, S, T, X}
T5 {B, H, G, M, S}

(a) What is the maximum number of possible frequent itemsets?
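As a starting point for (a), the sketch below counts the distinct items in the table and evaluates two upper bounds. Which bound the question intends is an assumption here: one common reading counts every non-empty subset of the item universe (2^d - 1), while a tighter reading excludes itemsets larger than any transaction. The variable names are illustrative only.

```python
from math import comb

# Transactions copied from the table in the question.
TRANSACTIONS = [
    {"A", "H", "K", "T", "X"},
    {"A", "H", "X", "T", "Z"},
    {"A", "B", "D", "R", "S"},
    {"B", "H", "S", "T", "X"},
    {"B", "H", "G", "M", "S"},
]

# Number of distinct items across all transactions.
items = set().union(*TRANSACTIONS)
d = len(items)

# Reading 1 (assumption): every non-empty subset of the d items.
all_nonempty = 2 ** d - 1

# Reading 2 (assumption): only itemsets no larger than the longest transaction,
# since a larger itemset cannot be contained in any transaction.
max_len = max(len(t) for t in TRANSACTIONS)
at_most_transaction_size = sum(comb(d, k) for k in range(1, max_len + 1))

print(d, all_nonempty, at_most_transaction_size)
```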

(b) Let min_support = 50%. Find all frequent itemsets using the Apriori algorithm. Your answer should include the key steps of the computation process.
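The level-wise process asked for in (b) can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes min_support = 50% of 5 transactions means a count of at least 3, it starts from every distinct item as a 1-itemset candidate, and it uses a simplified union-based join followed by the usual subset pruning rather than the prefix-based join found in textbook Apriori. The function name `apriori` and its return values are hypothetical.

```python
from itertools import combinations
from math import ceil

# Transactions copied from the table in the question.
TRANSACTIONS = [
    {"A", "H", "K", "T", "X"},
    {"A", "H", "X", "T", "Z"},
    {"A", "B", "D", "R", "S"},
    {"B", "H", "S", "T", "X"},
    {"B", "H", "G", "M", "S"},
]


def apriori(transactions, min_support):
    """Return (frequent itemsets with counts, number of candidates per level)."""
    # Assumption: support threshold is rounded up to a whole transaction count.
    min_count = ceil(min_support * len(transactions))
    items = sorted(set().union(*transactions))
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    candidates_per_level = []
    k = 1
    while candidates:
        candidates_per_level.append(len(candidates))
        # One pass over the database counts every candidate of size k.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent, candidates_per_level


frequent, per_level = apriori(TRANSACTIONS, 0.5)
for itemset, count in sorted(frequent.items(), key=lambda x: (len(x[0]), sorted(x[0]))):
    print(sorted(itemset), count)
print("candidates per level:", per_level)
```

In this sketch, each entry of `per_level` corresponds to one pass over the transactions, so its length and sum give one way to read off the scan and candidate counts; other counting conventions (for example, not treating single items as candidates) would give different totals.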

(c) In the computation for (b) above, how many rounds of database scan are needed? What is the total number of candidates?

(d) Let n be the total number of transactions, b be the number of items in each transaction, and m be the number of k-itemset candidates. Consider the following two approaches for counting the support values of the candidates. For each transaction, the first approach checks whether each candidate occurs in the transaction; the second approach enumerates all possible k-itemsets of the transaction and checks whether each one is among the candidates. What is the computational complexity of each approach? Is one always better than the other?
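The two counting approaches in (d) can be compared with a small sketch. It assumes candidates are stored as frozensets in a hash-based dictionary, so the first approach performs n * m subset tests and the second performs n * C(b, k) lookups; the function names and the choice of 2-itemset candidates over the items A, B, H, S, T, X are illustrative assumptions.

```python
from itertools import combinations

# Transactions copied from the table in the question.
TRANSACTIONS = [
    {"A", "H", "K", "T", "X"},
    {"A", "H", "X", "T", "Z"},
    {"A", "B", "D", "R", "S"},
    {"B", "H", "S", "T", "X"},
    {"B", "H", "G", "M", "S"},
]


def count_by_candidate(transactions, candidates):
    """Approach 1: test every candidate against every transaction."""
    counts = {c: 0 for c in candidates}
    checks = 0
    for t in transactions:
        for c in candidates:
            checks += 1          # one subset test per (transaction, candidate)
            if c <= t:
                counts[c] += 1
    return counts, checks


def count_by_enumeration(transactions, candidates, k):
    """Approach 2: enumerate each transaction's k-subsets and look them up."""
    counts = {c: 0 for c in candidates}
    lookups = 0
    for t in transactions:
        for subset in combinations(sorted(t), k):
            lookups += 1         # one hash lookup per enumerated k-subset
            key = frozenset(subset)
            if key in counts:
                counts[key] += 1
    return counts, lookups


# Illustrative candidates: all pairs over six items (m = 15, k = 2).
candidates = [frozenset(p) for p in combinations("ABHSTX", 2)]
counts1, checks = count_by_candidate(TRANSACTIONS, candidates)
counts2, lookups = count_by_enumeration(TRANSACTIONS, candidates, 2)
print(counts1 == counts2, checks, lookups)
```

Both functions return the same support counts; only the amount of work differs. Here n = 5, m = 15, and C(5, 2) = 10, so enumeration does less work, but with few candidates or with larger b and k the comparison can go the other way.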
