Consider the decision trees shown in Figures (a) and (b). For each approach described below, you...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
Consider the decision trees shown in Figures (a) and (b). For each approach described below, you need to compute the generalization errors for both trees and decide which tree is better. The training and validation data sets are shown in Figures (c) and (d), respectively. + 0 B A A 0 C B 1 0 1 0 + + B 0 + (a) Decision Tree A (b) Decision Tree B #123 0 3 OOOD A 0 B00 C Class 0 0 + 11 0 0 + 12 0 1 0 - 13 4567 0 1 1 14 1 0 0 + 15 #12345 AO A 0 BO B 0 00 C Class 0 + 0 0 1 - 0 1 0 - 0 1 1 - 0 1 1 + 1 0 1 + 16 1 0 0 - 1 0 1 - 17 1 0 0 8 1 1 0 - 18 1 0 1 - 90 1 1 0 - 19 1 1 0 - 10 1 1 1 + 20 1 1 1 + (c) Training data (c) Validation data (a) Optimistic approach (assumes generalization error is given by the training error). (b) Reduced error pruning approach (generalization error is computed using the validation set shown in Figure (d)), which means generalization error = validation error. (c) Pessimistic error approach, where generalization error = training error + omega * complexity of the tree. Consider omega = 1.0 and complexity of the tree = number of leaves / total number of training points. (d) minimum description length (MDL) approach. The total description length of a tree is given by: Cost(tree, data)=Cost(tree) + Cost(data tree); Each internal node of the tree is encoded by the ID of the splitting attribute. If there are m attributes, the cost of encoding each attribute is log2m bits. Each leaf node is encoded using the ID of the class it is associated with. If there are k classes, the cost of encoding a class is log2 k bits. Cost(tree) is the cost of encoding all the nodes in the tree. To simplify the computation, you can assume that the total cost of the tree is obtained by adding up the costs of encoding each internal node and each leaf node. Cost(data tree) is encoded using the classification errors the tree commits on the training set. Each error is encoded by log2 n bits, where n is the total number of training examples. Consider the decision trees shown in Figures (a) and (b). For each approach described below, you need to compute the generalization errors for both trees and decide which tree is better. The training and validation data sets are shown in Figures (c) and (d), respectively. + 0 B A A 0 C B 1 0 1 0 + + B 0 + (a) Decision Tree A (b) Decision Tree B #123 0 3 OOOD A 0 B00 C Class 0 0 + 11 0 0 + 12 0 1 0 - 13 4567 0 1 1 14 1 0 0 + 15 #12345 AO A 0 BO B 0 00 C Class 0 + 0 0 1 - 0 1 0 - 0 1 1 - 0 1 1 + 1 0 1 + 16 1 0 0 - 1 0 1 - 17 1 0 0 8 1 1 0 - 18 1 0 1 - 90 1 1 0 - 19 1 1 0 - 10 1 1 1 + 20 1 1 1 + (c) Training data (c) Validation data (a) Optimistic approach (assumes generalization error is given by the training error). (b) Reduced error pruning approach (generalization error is computed using the validation set shown in Figure (d)), which means generalization error = validation error. (c) Pessimistic error approach, where generalization error = training error + omega * complexity of the tree. Consider omega = 1.0 and complexity of the tree = number of leaves / total number of training points. (d) minimum description length (MDL) approach. The total description length of a tree is given by: Cost(tree, data)=Cost(tree) + Cost(data tree); Each internal node of the tree is encoded by the ID of the splitting attribute. If there are m attributes, the cost of encoding each attribute is log2m bits. Each leaf node is encoded using the ID of the class it is associated with. If there are k classes, the cost of encoding a class is log2 k bits. Cost(tree) is the cost of encoding all the nodes in the tree. To simplify the computation, you can assume that the total cost of the tree is obtained by adding up the costs of encoding each internal node and each leaf node. Cost(data tree) is encoded using the classification errors the tree commits on the training set. Each error is encoded by log2 n bits, where n is the total number of training examples.
Expert Answer:
Related Book For
Introduction to Data Mining
ISBN: 978-0321321367
1st edition
Authors: Pang Ning Tan, Michael Steinbach, Vipin Kumar
Posted Date:
Students also viewed these algorithms questions
-
Consider the decision trees shown in Figure 4.3. Assume they are generated from a data set that contains 16 binary attributes and 3 classes, C1, C2, and C3. Compute the total description length of...
-
"internet radios" for streaming audio, and personal video recorders and players. Describe design and evaluation processes that could be used by a start-up company to improve the usability of such...
-
State whether or not each of the following events would result in a liability being recognised in the accounts at 30 June. 1. Taxes for the year ended 30 June, which are not payable until October. 2....
-
Given the information for Schism, Inc., in Problems 10 and 11, suppose you also know that the firm's net capital spending for 2016 was $705,000, and that the firm reduced its net working capital...
-
Evaluate each geometric sum. k=0 3 4 k
-
Petitioner Christy Brzonkala met respondents Antonio Morrison and James Crawford at a campus party at Virginia Polytechnic Institute (Virginia Tech), where they were all students. At the party, the...
-
Answer the following multiple-choice questions: a. Which of the following could lead to cash flow problems? 1. Tightening of credit by suppliers 2. Easing of credit by suppliers 3. Reduction of...
-
1. [10] Is Grtzsch graph M(C5) Hamiltonian? Is the complement of M(C5) Hamilto- nian? Justify your answer.
-
Microhard has issued a bond with the following characteristics: Par: $1,000 Time to maturity: 8 years Coupon rate: 11 percent Semiannual payments Calculate the price of this bond if the YTM is (Do...
-
6. Paid advertising and additional strategies o Describe your paid advertising plan o Describe how you will use any additional online marketing strategies to promote your business
-
Determine the present value of $120,000 to be received at the end of each of 4 years, using an interest rate of 5.5%, compounded annually, as follows: a. By successive computations, using the present...
-
Oliver and Tomoko incorporate Owl Two, Inc. ("Owl"). Oliver contributes a truck (FMV-$11,000, Basis - $15,000, and subject to an outstanding loan of $8,000 which Owl will assume) in exchange for...
-
How do social movements utilize strategies of resistance and collective action to challenge prevailing systems of oppression and advocate for social change on both local and global scales?"
-
Actual investment equals planned investment: O times unplanned investment minus inventory investment. plus unplanned investment. Ominus unplanned investment. plus unplanned investment plus inventory...
-
Discuss each of the economic ideas: People are rational, people respond to incentives, and optimal decisions are made at the margin. Please give an example of each.
-
Is it ethical to provide safety training in English to immigrant workers who speak little English, in order to reduce costs?
-
Distinguish between noise and outliers. Be sure to consider the following questions. (a) Is noise ever interesting or desirable? Outliers? (b) Can noise objects be outliers? (c) Are noise objects...
-
Traditional agglomerative hierarchical clustering routines merge two clusters at each step. Does it seem likely that such an approach accurately captures the (nested) cluster structure of a set of...
-
Consider the one-dimensional data set shown in Table 5.4. (a) Classify the data point x = 5.0 according to its 1-, 3-, 5-, and 9-nearest neighbors (using majority vote). (b) Repeat the previous...
-
How is the legal infrastructure of a country significant for business?
-
What is legal liability?
-
We have a parliament to pass laws, a government to administer laws, and a police department to enforce laws. Ironically, these potent instruments for the restriction of liberty are necessary for the...
Study smarter with the SolutionInn App