Consider the decision trees shown in Figure 4.3. Assume they are generated from a data set that contains 16 binary attributes and 3 classes, C1, C2, and C3. Compute the total description length of each decision tree according to the minimum description length principle.

Figure 4.3. Decision trees for Exercise 9: (a) decision tree with 7 errors; (b) decision tree with 4 errors.
• The total description length of a tree is given by: Cost(tree, data) = Cost(tree) + Cost(data|tree).
• Each internal node of the tree is encoded by the ID of the splitting attribute. If there are m attributes, the cost of encoding each attribute is log2 m bits.
• Each leaf is encoded using the ID of the class it is associated with. If there are k classes, the cost of encoding a class is log2 k bits.
• Cost(tree) is the cost of encoding all the nodes in the tree. To simplify the computation, you can assume that the total cost of the tree is obtained by adding up the costs of encoding each internal node and each leaf node.
• Cost(data|tree) is encoded using the classification errors the tree commits on the training set. Each error is encoded by log2 n bits, where n is the total number of training instances.
Which decision tree is better, according to the MDL principle?
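The cost formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the figure is not reproduced here, so the internal-node and leaf counts below are hypothetical placeholders to be replaced with the counts read off Figure 4.3. The `whole_bits` flag, which rounds each code length up to a whole number of bits, is also an assumption and not part of the exercise statement. Since the exercise does not give n, the loop simply tries a few values.

```python
import math

def mdl_cost(n_internal, n_leaves, n_errors, m, k, n, whole_bits=False):
    """Cost(tree, data) = Cost(tree) + Cost(data|tree), in bits.

    m: number of attributes, k: number of classes,
    n: number of training instances.
    whole_bits=True rounds each code length up to an integer
    (an assumption; the exercise only says log2).
    """
    if whole_bits:
        bits = lambda x: math.ceil(math.log2(x))
    else:
        bits = math.log2
    cost_tree = n_internal * bits(m) + n_leaves * bits(k)
    cost_data = n_errors * bits(n)
    return cost_tree + cost_data

# Hypothetical node counts: replace with the counts from Figure 4.3.
tree_a = dict(n_internal=2, n_leaves=3, n_errors=7)
tree_b = dict(n_internal=4, n_leaves=5, n_errors=4)

for n in (8, 16, 64):
    a = mdl_cost(**tree_a, m=16, k=3, n=n, whole_bits=True)
    b = mdl_cost(**tree_b, m=16, k=3, n=n, whole_bits=True)
    print(f"n={n}: tree (a) = {a} bits, tree (b) = {b} bits")
```

The tree with the smaller total is preferred under MDL. Because tree (a) commits more errors, its cost grows faster with n, so the answer can depend on the size of the training set.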

Step by Step Solution

(a), (b) Because there are 16 attributes, the cost for each internal node in the decision tree is log2...
