Question:
1. Import the needed packages.
2. Load your dataset.
3. Separate the target from the predictors.
4. Select the numerical columns for X.
5. Preprocess y.
6. Preprocess the data by using an imputer to fill in missing values, and use a Decision Tree model to make predictions.
7. Use cross-validation to select parameters for a machine learning model. We want to decide the best value for the max_leaf_nodes parameter:
   a. Find the average (over 4 cross-validation folds) MAE of a machine learning model that uses:
      i. the data in X and y to create folds,
      ii. SimpleImputer() (with all parameters left as default) to replace missing values, and
      iii. DecisionTreeClassifier() (with random_state=0) to fit a decision tree model.
8. Test different parameter values:
   a. Now, use the code above to evaluate the model performance corresponding to four different values of max_leaf_nodes: 5, 50, 500, and 5000.
   b. Find the best parameter value using MIN
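The steps above could be sketched roughly as follows with pandas and scikit-learn. This is only an illustration under stated assumptions: the question does not name a dataset, so the synthetic DataFrame, its column names (feature_a, feature_b, color, label), and the helper get_score are hypothetical stand-ins, and "Preprocess y" is assumed to mean label-encoding a string target.

```python
# Step 1: import the needed packages.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Step 2: load the dataset. Hypothetical synthetic data is used here so the
# sketch runs on its own; in practice this would be a pd.read_csv(...) call.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.normal(size=n),
    "color": rng.choice(["red", "blue"], size=n),  # a non-numerical column
})
data["label"] = np.where(data["feature_a"] + data["feature_b"] > 0, "yes", "no")
# Knock out some values so the imputer has something to fill in.
data.loc[rng.choice(n, size=20, replace=False), "feature_a"] = np.nan

# Step 3: separate the target from the predictors.
y_raw = data["label"]
predictors = data.drop(columns=["label"])

# Step 4: keep only the numerical columns for X.
X = predictors.select_dtypes(include=["number"])

# Step 5: preprocess y (assumed here: encode string labels as integers).
y = LabelEncoder().fit_transform(y_raw)


def get_score(max_leaf_nodes):
    """Steps 6-7: average MAE over 4 cross-validation folds."""
    pipeline = Pipeline(steps=[
        ("imputer", SimpleImputer()),  # all parameters left as default
        ("model", DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes,
                                         random_state=0)),
    ])
    # scikit-learn reports negated MAE, so multiply by -1 to get MAE back.
    scores = -1 * cross_val_score(pipeline, X, y, cv=4,
                                  scoring="neg_mean_absolute_error")
    return scores.mean()


# Step 8a: evaluate the four candidate values of max_leaf_nodes.
results = {k: get_score(k) for k in [5, 50, 500, 5000]}

# Step 8b: the best value is the one with the smallest average MAE.
best = min(results, key=results.get)
print(results)
print("Best max_leaf_nodes:", best)
```

Putting the imputer and the tree in one Pipeline means the imputer is refit on the training portion of each fold, so no information from the held-out fold leaks into the fill values. With integer-encoded class labels, the MAE reported here measures how far the predicted label codes are from the true ones; for a two-class target it equals the misclassification rate.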
