Given the weather conditions, we want to predict if a person is going to go for...
Question:
Transcribed Image Text:
Given the weather conditions, we want to predict if a person is going to go for a run or not. The data that we have collected are the following:

Sample  Forecast  Temperature  Outcome: Go for Run?
1       Sunny     Cool         Yes
2       Sunny     Hot          No
3       Overcast  Cool         Yes
4       Rain      Cool         Yes
5       Rain      Hot          No
6       Overcast  Hot          Yes

a) (15 points) Draw a depth = 2 tree. This means splitting the tree once on one variable and once on the other variable, using the Gini Index, which is defined as follows:

Gini = \sum_{k=1}^{K} p_{mk} (1 - p_{mk})

where K is the number of classes, and p_{mk} represents the proportion of the training observations in the mth region that are from the kth class. Show the calculations at each step. You can round/approximate.

b) (5 points) Pre-pruning - please draw the tree with maximum depth 1.

c) (5 points) Now using the full tree and the tree of depth 1, evaluate the following test set, please provide the accuracy in the following test set. Please explain your findings. If a leaf node of the tree is not 100% pure, describe how you select in this scenario. (Hint: Even if your trees from above are incorrect, please provide what you would expect to find with regards to this test set and the two decision trees).

Sample  Forecast  Temperature  Outcome: Go for Run?
7       Sunny     Cool         Yes
8       Sunny     Hot          Yes
9       Overcast  Cool         No
Expert Answer:
a) To build a decision tree using the Gini Index, we need to calculate the Gini Index for each possible split at each node and choose the split that min...
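The split-scoring step described above can be sketched in Python. This is a minimal illustration rather than the expert's own working: it assumes a multiway split (one child per category value) and scores each split by the size-weighted Gini of its children. The names `TRAIN`, `FEATURES`, `gini`, and `weighted_gini` are hypothetical helpers introduced only for this example.

```python
from collections import Counter

# Training samples 1-6 from the question: (forecast, temperature, outcome).
TRAIN = [
    ("Sunny", "Cool", "Yes"),
    ("Sunny", "Hot", "No"),
    ("Overcast", "Cool", "Yes"),
    ("Rain", "Cool", "Yes"),
    ("Rain", "Hot", "No"),
    ("Overcast", "Hot", "Yes"),
]
FEATURES = {"Forecast": 0, "Temperature": 1}  # feature name -> column index


def gini(labels):
    """Gini = sum over classes k of p_k * (1 - p_k) for one region."""
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def weighted_gini(rows, col):
    """Size-weighted Gini of the children after a multiway split on `col`.

    Assumption: one child per distinct category value (not a binary split).
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())


# Score both candidate root splits and keep the one with the lowest Gini.
root_scores = {name: weighted_gini(TRAIN, col) for name, col in FEATURES.items()}
best_root = min(root_scores, key=root_scores.get)

for name, score in root_scores.items():
    print(f"{name}: weighted Gini = {score:.3f}")
print("Best root split:", best_root)
```

Under these assumptions the root node has Gini 4/9 ≈ 0.44, splitting on Forecast gives a weighted Gini of 1/3 ≈ 0.33, and splitting on Temperature gives 2/9 ≈ 0.22, so Temperature would be chosen as the first split. Applying `weighted_gini` to the Hot rows alone shows that a second split on Forecast leaves every leaf pure.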