Question:

Figure 3: A regression tree predicting the fuel efficiency (in kilometers per litre, km/l) of cars manufactured in 2019. "Drive" denotes whether the vehicle is all-wheel drive or part-time four-wheel drive; "Eng. Size" is the size of the car's engine in litres; "Fuel Type" is the type of fuel used by the car, and can be either petrol or diesel.

Figure 3 shows a decision tree that has been learned from a sample of cars manufactured in 2019. The target in this case is the fuel efficiency of the car (measured in kilometers per litre of fuel consumed, km/l), so we are using a regression tree. The predictors included in the tree are the type of drive-chain ("Drive", either all-wheel drive or part-time four-wheel drive), the size of the car's engine ("Eng. Size", measured in litres) and the type of fuel used by the vehicle ("Fuel Type", petrol or diesel). Using this tree, please answer the following questions:

(i) What is the predicted fuel efficiency of a car which runs on petrol, is all-wheel drive and has an engine size of 6.4 l? (2 marks)

(ii) What combination of predictors leads to the worst fuel efficiency? (2 marks)

(iii) Using this model, what can we say about the effect of engine size on the fuel efficiency of a car? (2 marks)

Figure 4: Two possible decision trees for a binary target, each splitting on a different predictor. The leaves show the number of individuals classified in the two different target classes; e.g., 45/48 means 45 individuals are in class zero and 48 are in class one.

Imagine we are growing a decision tree to predict a binary target variable. Figure 4 shows two possible decision trees that could be formed by splitting on two different predictors. Which of the two trees would we prefer, and why? (2 marks)

When learning a decision tree from data, one of the most important aspects is knowing how many leaves the tree should have, i.e., how complex it should be. If we denote the number of leaves in a tree by L, please describe the standard algorithm that uses K-fold cross validation and pruning to choose an appropriate value of L from the set {1, ..., Lmax}. (5 marks)
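As a reading aid for the leaf notation in Figure 4, the sketch below turns class counts such as 45/48 into an impurity score. It assumes Gini impurity as the splitting criterion, which the question does not specify (entropy or misclassification rate are equally plausible choices). The function names and the two example splits are hypothetical; the counts in split_a and split_b are NOT the values from Figure 4, which is not reproduced here.

```python
def gini(counts):
    """Gini impurity of one leaf given its per-class counts, e.g. (45, 48)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_gini(leaves):
    """Average Gini impurity over the leaves of a split, weighted by leaf size."""
    n = sum(sum(leaf) for leaf in leaves)
    return sum(sum(leaf) / n * gini(leaf) for leaf in leaves)


# The 45/48 leaf quoted in the caption is almost evenly mixed, so its
# impurity sits just below the two-class maximum of 0.5.
mixed_leaf = gini((45, 48))

# Hypothetical counts for illustration only (not read from Figure 4):
# split_a leaves both leaves mixed, split_b separates the classes well.
split_a = weighted_gini([(45, 48), (55, 52)])
split_b = weighted_gini([(90, 10), (10, 90)])

print(f"leaf 45/48: {mixed_leaf:.4f}")
print(f"split_a: {split_a:.4f}, split_b: {split_b:.4f}")
```

Under this criterion, the split with the lower weighted impurity produces purer leaves, so it would be the preferred one.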

