Consider a labeled data set containing 100 data instances, which are randomly partitioned into two sets A and B, each containing 50 instances. We use A as the training set to learn two decision trees: T₁₀ with 10 leaf nodes and T₁₀₀ with 100 leaf nodes. The accuracies of the two decision trees on data sets A and B are shown below:

            Accuracy
Data Set    T₁₀     T₁₀₀
A           0.86    0.97
B           0.84    0.77

(a) Based on the accuracies shown in the table above, which classification model would you expect to have better performance on unseen instances?

(b) Now you've tested T₁₀ and T₁₀₀ on the entire data set (A + B) and found that the classification accuracy of T₁₀ on (A + B) is 0.85, whereas the classification accuracy of T₁₀₀ on (A + B) is 0.87. Based on this new information and your observations from the table above, which classification model would you finally choose for classification?
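The arithmetic behind both parts can be checked with a short Python sketch. It assumes the table reads T₁₀: 0.86 on A and 0.84 on B, and T₁₀₀: 0.97 on A and 0.77 on B; the function names `pooled_accuracy` and `generalization_gap` are illustrative and not part of the question. Because A and B are the same size, the accuracy on (A + B) is simply the mean of the two per-set accuracies, which means half of that pooled figure comes from the training set A.

```python
# Accuracies as read from the table (assumed layout: rows A/B, columns T10/T100).
acc = {
    "T10": {"A": 0.86, "B": 0.84},
    "T100": {"A": 0.97, "B": 0.77},
}
n_a, n_b = 50, 50  # sizes of training set A and held-out set B


def pooled_accuracy(model):
    """Accuracy on A + B as the size-weighted mean of the per-set accuracies."""
    correct = acc[model]["A"] * n_a + acc[model]["B"] * n_b
    return correct / (n_a + n_b)


def generalization_gap(model):
    """Training accuracy (on A) minus held-out accuracy (on B)."""
    return acc[model]["A"] - acc[model]["B"]


for model in acc:
    print(model,
          "pooled:", round(pooled_accuracy(model), 2),
          "gap:", round(generalization_gap(model), 2))
```

Under these assumptions the pooled values reproduce the 0.85 and 0.87 quoted in part (b), so the (A + B) figures add no information beyond the table. The accuracy on B alone remains the only estimate here that is not influenced by the training data.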
