Question:

Consider a labeled data set containing 100 data instances, which is randomly partitioned into two sets A and B, each containing 50 instances. We use A as the training set to learn two decision trees, T10 with 10 leaf nodes and T100 with 100 leaf nodes. The accuracies of the two decision trees on data sets A and B are shown in Table 3.7.

(a) Based on the accuracies shown in Table 3.7, which classification model would you expect to have better performance on unseen instances?

(b) Now suppose you tested T10 and T100 on the entire data set (A + B) and found that the classification accuracy of T10 on (A + B) is 0.85, whereas the classification accuracy of T100 on (A + B) is 0.87. Based on this new information and your observations from Table 3.7, which classification model would you finally choose for classification?

Table 3.7. Comparing the test accuracy of decision trees T10 and T100.

Data Set    Accuracy of T10    Accuracy of T100
A           0.86               0.97
B           0.84               0.77
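The arithmetic linking Table 3.7 to the (A + B) figures in part (b) can be checked with a short Python sketch. It assumes, as the problem states, that A and B are disjoint and contain 50 instances each, so the accuracy on (A + B) is the size-weighted average of the two per-set accuracies. The names `ACCURACY`, `SIZES`, `combined_accuracy`, and `train_test_gap` are illustrative and do not come from the textbook.

```python
# Accuracies from Table 3.7. A is the training set; B was not used for training.
ACCURACY = {
    "T10": {"A": 0.86, "B": 0.84},
    "T100": {"A": 0.97, "B": 0.77},
}

# Set sizes as stated in the problem: 100 instances split evenly.
SIZES = {"A": 50, "B": 50}


def combined_accuracy(acc_by_set, sizes):
    """Size-weighted accuracy over the union of disjoint data sets."""
    total = sum(sizes.values())
    correct = sum(acc_by_set[name] * sizes[name] for name in sizes)
    return correct / total


def train_test_gap(acc_by_set):
    """Training accuracy (A) minus held-out accuracy (B)."""
    return acc_by_set["A"] - acc_by_set["B"]


for model, acc in ACCURACY.items():
    print(
        model,
        "accuracy on A+B:", round(combined_accuracy(acc, SIZES), 2),
        "gap between A and B:", round(train_test_gap(acc), 2),
    )
```

Under these assumptions the sketch reproduces the 0.85 and 0.87 quoted in part (b), which suggests that the (A + B) accuracies add no information beyond Table 3.7. Half of (A + B) is the training set, so the combined figure is partly a training accuracy, and only the accuracy on B estimates performance on unseen instances.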

