Please read the following questions carefully and answer each question.
QA1. What is the key idea behind bagging? Can bagging deal with both high variance (overfitting) and high bias (underfitting)? 10 points
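For QA1, here is a minimal sketch of the bagging procedure (bootstrap sampling plus majority-vote aggregation), not a reference answer; the dataset, tree settings, and estimator count are illustrative assumptions.

```python
# Bagging sketch: draw bootstrap samples, fit one tree per sample, and
# aggregate predictions by majority vote. All settings are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
rng = np.random.default_rng(0)

trees = []
for _ in range(25):
    # Bootstrap: sample len(X) rows with replacement (~63% unique rows).
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Aggregate: majority vote across independently trained trees. Averaging
# many low-bias, high-variance trees cuts variance, but the vote cannot
# remove a bias that every tree shares (underfitting persists).
votes = np.stack([tree.predict(X) for tree in trees])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("training accuracy:", (ensemble_pred == y).mean())
```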
QA2. Why are bagging models computationally more efficient than boosting models with the same number of weak learners? 5 points
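One way to see the efficiency point in QA2: bagged learners are mutually independent, so they can be trained concurrently, while boosted learners form a chain in which each fit depends on its predecessor's errors. The sketch below is a hedged illustration with an AdaBoost-style reweighting loop standing in for boosting; dataset and settings are assumptions.

```python
# Contrast sketch for QA2: independent (parallelizable) bagged fits vs.
# a strictly sequential boosting loop. Settings are illustrative.
from concurrent.futures import ThreadPoolExecutor
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

def fit_bagged_tree(seed):
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=len(X))  # independent bootstrap sample
    return DecisionTreeClassifier().fit(X[idx], y[idx])

# Bagging: the 25 fits share nothing, so they can all run at once.
with ThreadPoolExecutor() as pool:
    bagged = list(pool.map(fit_bagged_tree, range(25)))

# Boosting (schematic): round i+1 cannot start until round i finishes,
# because it reweights the rows that round i got wrong.
weights = np.full(len(X), 1 / len(X))
boosted = []
for _ in range(25):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=weights)
    boosted.append(stump)
    wrong = stump.predict(X) != y
    weights[wrong] *= 2.0        # crude upweighting of misclassified rows
    weights /= weights.sum()
```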
QA3. James is thinking of creating an ensemble model to predict whether a given stock will go up or down in the next week. He has trained several decision tree models, but none of them performs better than a random model. The models are also very similar to one another. Do you think combining these tree models into an ensemble can boost the performance? Discuss your answer. 5 points
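A quick simulation can make the QA3 discussion concrete: if the base models are no better than chance and nearly identical, their majority vote inherits the same behavior. The correlation mechanism below (models copying one shared random "opinion") is an illustrative assumption.

```python
# Simulation for QA3: majority-voting models that are (a) no better than
# random and (b) highly correlated gives no lift. Setup is illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, n_models = 10_000, 15
y = rng.integers(0, 2, size=n)              # true up/down labels

shared = rng.integers(0, 2, size=n)         # one shared random "opinion"
preds = np.empty((n_models, n), dtype=int)
for m in range(n_models):
    # Each model copies the shared opinion 95% of the time, so the models
    # are near-duplicates, and the opinion is uncorrelated with y.
    flip = rng.random(n) < 0.05
    preds[m] = np.where(flip, rng.integers(0, 2, size=n), shared)

single_acc = (preds[0] == y).mean()
vote_acc = ((preds.mean(axis=0) >= 0.5).astype(int) == y).mean()
print(f"single model: {single_acc:.3f}, ensemble vote: {vote_acc:.3f}")
# Both hover around 0.5: voting helps only when members beat chance AND
# make reasonably independent errors.
```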
QA4. Consider the following table, which classifies objects into two classes, edible (+) and non-edible (-), based on characteristics such as color, size, and shape. What would be the information gain for splitting the dataset on the Size attribute? 15 points (a hedged computation sketch follows the table)
| Color  | Size  | Shape     | Class |
|--------|-------|-----------|-------|
| Yellow | Small | Round     |       |
| Yellow | Small | Round     |       |
| Green  | Small | Irregular |       |
| Green  | Large | Irregular |       |
| Yellow | Large | Round     |       |
| Yellow | Small | Round     |       |
| Yellow | Small | Round     |       |
| Yellow | Small | Round     |       |
| Green  | Small | Round     |       |
| Yellow | Large | Round     |       |
| Yellow | Large | Round     |       |
| Yellow | Large | Round     |       |
| Yellow | Large | Round     |       |
| Yellow | Large | Round     |       |
| Yellow | Small | Irregular |       |
| Yellow | Large | Irregular |       |
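For QA4, the quantity to compute is IG(S, Size) = H(S) - sum over v in {Small, Large} of (|S_v|/|S|) * H(S_v), where H is the Shannon entropy. The sketch below shows the mechanics: the Size column is transcribed from the table (8 Small, 8 Large), but the class labels are hypothetical placeholders because the +/- column is blank above, so the real labels must be substituted before reading off the answer.

```python
# Information-gain sketch for QA4. `labels` is a HYPOTHETICAL placeholder
# -- replace it with the table's actual +/- column to get the answer.
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum_c p_c * log2(p_c) over the class proportions p_c."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# Size values transcribed from the table, in row order (8 Small, 8 Large).
size = ["Small", "Small", "Small", "Large", "Large", "Small", "Small",
        "Small", "Small", "Large", "Large", "Large", "Large", "Large",
        "Small", "Large"]
# HYPOTHETICAL labels -- swap in the real +/- column from the table.
labels = ["+", "-", "+", "-", "+", "+", "+", "+",
          "-", "-", "+", "-", "-", "-", "+", "+"]

# IG(S, Size) = H(S) - sum_v (|S_v|/|S|) * H(S_v)
ig = entropy(labels)
for v in ("Small", "Large"):
    subset = [label for s, label in zip(size, labels) if s == v]
    ig -= (len(subset) / len(labels)) * entropy(subset)
print(f"IG(S, Size) = {ig:.4f}")
```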
QA5. Why is it important that the m parameter (the number of attributes considered at each split) be set optimally in random forest models? Discuss the implications of setting this parameter too small or too large. 5 points
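For QA5, m corresponds to max_features in scikit-learn's RandomForestClassifier. A small sweep like the hedged sketch below (synthetic data and an illustrative grid, not part of the question) exposes the trade-off the question is asking about.

```python
# Sketch for QA5: sweep the number of features considered at each split
# (m == max_features in scikit-learn). Dataset and grid are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=0)

for m in (1, 4, 10, 20):   # m = 20 means every split sees all features
    forest = RandomForestClassifier(n_estimators=200, max_features=m,
                                    random_state=0)
    score = cross_val_score(forest, X, y, cv=5).mean()
    print(f"m = {m:2d}: CV accuracy = {score:.3f}")
# Too small an m: splits are nearly random and individual trees underfit.
# Too large an m: every tree picks the same strong splits, the trees'
# votes correlate, and the ensemble loses its variance reduction.
```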
