Question:

The code for each Part of this lab should be self-contained; that is, each of Parts 1, 2, and 3 should contain all the necessary code and not rely on code from another Part of the lab in order to run.
All parts of the lab should be done using Python, sklearn, pandas, numpy, and matplotlib.

Part 1 - Creating and evaluating a random forest model

In this part of the lab, you should:

read in the data;
verify that all the data is numeric and that there are no missing values;
split the data into training and validation sets (don't worry about creating a final test set);
create a random forest model using the data;
evaluate the model on both the training and validation sets using MAE and % error.
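
The Part 1 steps might be sketched as follows. This is only an illustration under stated assumptions: the lab's data file and target column are not given here, so a synthetic dataset from make_regression stands in for the pd.read_csv(...) call, and "% error" is taken to mean MAE divided by the mean of the true target values, which is one common reading rather than a definition the lab confirms.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): the real lab would call pd.read_csv(...) here.
X_arr, y_arr = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
df = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(X_arr.shape[1])])
df["target"] = y_arr + 500.0  # shifted so the target mean is well away from zero

# Verify that every column is numeric and that there are no missing values.
all_numeric = all(pd.api.types.is_numeric_dtype(t) for t in df.dtypes)
n_missing = int(df.isna().sum().sum())
print("all numeric:", all_numeric, "| missing values:", n_missing)

# Split into training and validation sets (no final test set).
X = df.drop(columns="target")
y = df["target"]
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Create and fit a random forest with default hyper-parameters.
model = RandomForestRegressor(random_state=0)
model.fit(X_train, y_train)

# Evaluate with MAE and % error (assumed here to be MAE / mean of true values).
train_mae = mean_absolute_error(y_train, model.predict(X_train))
val_mae = mean_absolute_error(y_val, model.predict(X_val))
train_pct = train_mae / y_train.mean() * 100.0
val_pct = val_mae / y_val.mean() * 100.0
print(f"training   MAE: {train_mae:.2f} ({train_pct:.2f}% error)")
print(f"validation MAE: {val_mae:.2f} ({val_pct:.2f}% error)")
```

Fixing random_state in both the split and the model is a choice made here so that repeated runs give the same numbers; the lab does not require it.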

Part 2 - Exploring the n_estimators hyper-parameter

In this part of the lab you should:

use a for loop to create a random forest model for each value of n_estimators from 1 to 30;
evaluate each model on both the training and validation sets using MAE;
visualize the results by creating a plot of n_estimators vs MAE for both the training and validation sets.
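
A minimal, self-contained sketch of the Part 2 loop is shown below. As an assumption, synthetic data from make_regression replaces the lab's own file, and the "best" value is picked as the one with the lowest validation MAE, which is one reasonable criterion rather than the only defensible one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): replace with pd.read_csv(...) on the lab's file.
X_arr, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(X_arr.shape[1])])
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit one forest per n_estimators value from 1 to 30 and record both MAEs.
n_values = list(range(1, 31))
train_maes, val_maes = [], []
for n in n_values:
    model = RandomForestRegressor(n_estimators=n, random_state=0)
    model.fit(X_train, y_train)
    train_maes.append(mean_absolute_error(y_train, model.predict(X_train)))
    val_maes.append(mean_absolute_error(y_val, model.predict(X_val)))

# Candidate "best" value: the lowest validation MAE (an assumed criterion).
best_n = n_values[val_maes.index(min(val_maes))]
print(f"lowest validation MAE {min(val_maes):.2f} at n_estimators={best_n}")

# Plot n_estimators vs MAE for both sets.
fig, ax = plt.subplots()
ax.plot(n_values, train_maes, marker="o", label="training")
ax.plot(n_values, val_maes, marker="o", label="validation")
ax.set_xlabel("n_estimators")
ax.set_ylabel("MAE")
ax.set_title("n_estimators vs MAE")
ax.legend()
fig.savefig("n_estimators_vs_mae.png")
```

The validation curve is the one to judge by, since the training MAE of a random forest is usually much lower than its validation MAE and says little about generalization.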

After that you should answer the following questions:

Which value of n_estimators gives the best results?
Explain how you decided that this value for n_estimators gave the best results.
Why is the plot you created above not smooth?
Was the result here better than the result of Part 1? By what % was it better or worse?
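
For the "% better or worse" question, one possible convention is the relative change in validation MAE measured against the earlier result. The helper name percent_improvement and the example MAE values below are hypothetical, chosen only to illustrate the arithmetic.

```python
def percent_improvement(baseline_mae, new_mae):
    """Relative change in MAE versus the baseline, in percent.

    Positive means the new model is better (lower MAE); negative means worse.
    """
    return (baseline_mae - new_mae) / baseline_mae * 100.0


# Hypothetical MAE values, for illustration only.
better = percent_improvement(4.0, 3.0)  # MAE dropped from 4.0 to 3.0
worse = percent_improvement(4.0, 5.0)   # MAE rose from 4.0 to 5.0
print(f"{better:.1f}% better, {abs(worse):.1f}% worse")
```

Because MAE is an error, a lower value counts as the improvement, which is why the baseline comes first in the subtraction.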

Part 3 - Exploring the max_features hyper-parameter

In this part of the lab you should:

use a for loop to create a random forest model for each value of max_features from 1 to the total number of features in the data;
for each model, use the value for n_estimators as determined in Part 2;
evaluate each model on both the training and validation sets using MAE;
visualize the results by creating a plot of max_features vs MAE for both the training and validation sets.
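
The Part 3 loop can be sketched in the same way. Two assumptions are made: synthetic data again stands in for the lab's file, and best_n_estimators = 20 is a placeholder for whatever value Part 2 actually produced (it is hard-coded so that this Part runs on its own).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): replace with pd.read_csv(...) on the lab's file.
X_arr, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(X_arr.shape[1])])
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

best_n_estimators = 20  # placeholder (assumption): use the value found in Part 2

# Fit one forest per max_features value from 1 to the number of features.
n_features = X.shape[1]
m_values = list(range(1, n_features + 1))
train_maes, val_maes = [], []
for m in m_values:
    model = RandomForestRegressor(
        n_estimators=best_n_estimators, max_features=m, random_state=0
    )
    model.fit(X_train, y_train)
    train_maes.append(mean_absolute_error(y_train, model.predict(X_train)))
    val_maes.append(mean_absolute_error(y_val, model.predict(X_val)))

# Candidate "best" value: the lowest validation MAE (an assumed criterion).
best_m = m_values[val_maes.index(min(val_maes))]
print(f"lowest validation MAE {min(val_maes):.2f} at max_features={best_m}")

# Plot max_features vs MAE for both sets.
fig, ax = plt.subplots()
ax.plot(m_values, train_maes, marker="o", label="training")
ax.plot(m_values, val_maes, marker="o", label="validation")
ax.set_xlabel("max_features")
ax.set_ylabel("MAE")
ax.set_title("max_features vs MAE")
ax.legend()
fig.savefig("max_features_vs_mae.png")
```

Passing an integer to max_features limits how many features each split may consider; when it equals the total number of features, every split sees all of them.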

After that you should answer the following questions:

Which value of max_features gives the best results?
Explain how you decided that this value for max_features gave the best results.
Was the result here better than the result of Part 2? By what % was it better or worse?
