Question: Need help in R

# Call the ISLR library and check the head of College (a built-in data frame
# with ISLR, use data() to check this.) Then reassign College to a dataframe
# called df

code here
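One possible way to fill in this step, as a minimal sketch. It assumes the ISLR package is already installed; the helper name `available` is only for illustration and is not part of the exercise.

```r
# Load ISLR, which ships the College data frame
library(ISLR)

# List the datasets bundled with ISLR and confirm College is among them
available <- data(package = "ISLR")$results[, "Item"]
"College" %in% available

# Peek at the first rows, then copy the data into df
head(College)
df <- College
```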

# EDA
# Let's explore the data!
# Create a scatterplot of Grad.Rate versus Room.Board, colored by the
# Private column.

code here
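A sketch of the scatterplot, assuming ggplot2 is the plotting library (the exercise does not name one, so this is a guess). Grad.Rate goes on the y-axis and Room.Board on the x-axis.

```r
library(ISLR)
library(ggplot2)  # assumption: the exercise does not say which plotting library to use

df <- College

# Grad.Rate versus Room.Board, with points colored by the Private factor
p_scatter <- ggplot(df, aes(x = Room.Board, y = Grad.Rate)) +
  geom_point(aes(color = Private))

print(p_scatter)
```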

# Create a histogram of full time undergrad students, color by Private.
code here

# Create a histogram of Grad.Rate colored by Private. You should see something odd here.
code here
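Both histograms could be drawn along these lines, again assuming ggplot2. The bin count of 50 is an arbitrary choice, and F.Undergrad is taken to be the "full time undergrad students" column.

```r
library(ISLR)
library(ggplot2)  # assumption: plotting library not specified by the exercise

df <- College

# Full-time undergraduates, bars filled by Private
p_fu <- ggplot(df, aes(x = F.Undergrad)) +
  geom_histogram(aes(fill = Private), color = "black", bins = 50)

# Graduation rate, bars filled by Private; look at the far right of the x-axis
p_gr <- ggplot(df, aes(x = Grad.Rate)) +
  geom_histogram(aes(fill = Private), color = "black", bins = 50)

print(p_fu)
print(p_gr)
```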

# What college had a Graduation Rate of above 100% ?
code here

# Change that college's grad rate to 100%
code here
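A minimal sketch for these two steps. It relies on the college names being stored as row names of the data frame; `over_100` is a name made up here.

```r
library(ISLR)

df <- College

# College names are the row names, so filter the rows and read the names off
over_100 <- df[df$Grad.Rate > 100, ]
rownames(over_100)

# Cap any graduation rate above 100 at exactly 100
df[df$Grad.Rate > 100, "Grad.Rate"] <- 100
```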

# Train Test Split
# Split your data into training and testing sets 70/30. Use the caTools
# library to do this.

code here
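The split might look like the following. The seed value and the name `split_mask` are arbitrary choices made here for reproducibility and readability.

```r
library(ISLR)
library(caTools)

df <- College

set.seed(101)  # arbitrary seed so the split is repeatable

# sample.split() returns a logical vector; TRUE marks the ~70% training rows
split_mask <- sample.split(df$Private, SplitRatio = 0.70)

train <- df[split_mask, ]
test  <- df[!split_mask, ]
```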

# Decision Tree
# Use the rpart library to build a decision tree to predict whether or not a
# school is Private. Remember to only build your tree off the training data.

code here

# Use predict() to predict the Private label on the test data.
code here

# Check the head of the predicted values. You should notice that you actually
# have two columns with the probabilities.
code here

# Turn these two columns into one column to match the original Yes/No Label
# for a Private column.
code here

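# Lots of ways to do this
# predict() returns a matrix here, so convert it before using $
tree.preds <- as.data.frame(tree.preds)
joiner <- function(x){
    if (x>=0.5){
        return('Yes')
    }else{
        return("No")
    }
}
tree.preds$Private <- sapply(tree.preds$Yes,joiner)
head(tree.preds)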

# Now use table() to create a confusion matrix of your tree model.
code here
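Putting the decision tree steps together, a sketch under a few assumptions: the seed is arbitrary, `method = "class"` is passed explicitly, and a vectorized `ifelse()` stands in for the `joiner` helper. The names `tree` and `conf_mat` are illustrative only.

```r
library(ISLR)
library(caTools)
library(rpart)

df <- College
df[df$Grad.Rate > 100, "Grad.Rate"] <- 100

set.seed(101)  # arbitrary seed
split_mask <- sample.split(df$Private, SplitRatio = 0.70)
train <- df[split_mask, ]
test  <- df[!split_mask, ]

# Fit a classification tree on the training data only
tree <- rpart(Private ~ ., method = "class", data = train)

# For a classification tree, predict() returns a matrix of class
# probabilities with one column per level ("No" and "Yes")
tree.preds <- as.data.frame(predict(tree, test))
head(tree.preds)

# Collapse the two probability columns into a single Yes/No label
tree.preds$Private <- ifelse(tree.preds$Yes >= 0.5, "Yes", "No")

# Confusion matrix: predicted labels against the true test labels
conf_mat <- table(predicted = tree.preds$Private, actual = test$Private)
conf_mat
```

Calling `predict(tree, test, type = "class")` would return the labels directly and skip the thresholding step.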

# Use the rpart.plot library and the prp() function to plot out your tree
# model.

code here
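Plotting the tree could be as short as the sketch below. It refits the tree so the snippet runs on its own; the seed and variable names are arbitrary.

```r
library(ISLR)
library(caTools)
library(rpart)
library(rpart.plot)

df <- College
set.seed(101)  # arbitrary seed
split_mask <- sample.split(df$Private, SplitRatio = 0.70)
train <- df[split_mask, ]

tree <- rpart(Private ~ ., method = "class", data = train)

# Draw the fitted tree with default styling
prp(tree)
```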

# Random Forest
# Now let's build out a random forest model!
# Call the randomForest package library
library(randomForest)

# Now use randomForest() to build out a model to predict Private class.
# Add importance=TRUE as a parameter in the model. (Use help(randomForest)
# to find out what this does.)
code here

# What was your model's confusion matrix on its own training set?
# Use model$confusion.
code here

# Grab the feature importance with model$importance. Refer to the reading
# for more info on what Gini means.
code here
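A sketch of the random forest steps, with an arbitrary seed and the illustrative name `rf.model`. Note that, per the randomForest documentation, the `confusion` component is computed from out-of-bag predictions rather than from refitting on the training rows.

```r
library(ISLR)
library(caTools)
library(randomForest)

df <- College
df[df$Grad.Rate > 100, "Grad.Rate"] <- 100

set.seed(101)  # arbitrary seed
split_mask <- sample.split(df$Private, SplitRatio = 0.70)
train <- df[split_mask, ]

# importance = TRUE asks the model to also assess predictor importance
rf.model <- randomForest(Private ~ ., data = train, importance = TRUE)

# Confusion matrix (out-of-bag), with a class.error column
rf.model$confusion

# Per-feature importance, including the MeanDecreaseGini column
rf.model$importance
```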

# Predictions
# Now use your random forest model to predict on your test set!
code here
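Predicting on the held-out rows might look like this. The model is refit so the snippet runs on its own; the seed and the names `rf.preds` and `rf_conf` are made up for illustration.

```r
library(ISLR)
library(caTools)
library(randomForest)

df <- College
df[df$Grad.Rate > 100, "Grad.Rate"] <- 100

set.seed(101)  # arbitrary seed
split_mask <- sample.split(df$Private, SplitRatio = 0.70)
train <- df[split_mask, ]
test  <- df[!split_mask, ]

rf.model <- randomForest(Private ~ ., data = train, importance = TRUE)

# For a classification forest, predict() returns the class labels by default
rf.preds <- predict(rf.model, test)

# Confusion matrix on the test set, comparable to the single-tree table
rf_conf <- table(predicted = rf.preds, actual = test$Private)
rf_conf
```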

# It should have performed better than just a single tree. How much better
# depends on whether you are measuring recall, precision, or accuracy as
# the most important measure of the model.

#Ref: www.pieriandata.com
