Question: R programming. Use the titanic_train data frame from the titanic library as the starting point for this project.

library(titanic)    # loads titanic_train data frame
library(caret)
library(tidyverse)
library(rpart)

# 3 significant digits
options(digits = 3)

# clean the data - `titanic_train` is loaded with the titanic package
titanic_clean <- titanic_train %>%
  mutate(Survived = factor(Survived),
         Embarked = factor(Embarked),
         Age = ifelse(is.na(Age), median(Age, na.rm = TRUE), Age), # NA age to median age
         FamilySize = SibSp + Parch + 1) %>%    # count family members
  select(Survived, Sex, Pclass, Age, Fare, SibSp, Parch, FamilySize, Embarked)

Split titanic_clean into test and training sets. After running the setup code, titanic_clean should have 891 rows and 9 variables.

Set the seed to 42, then use the caret package to create a 20% data partition based on the Survived column. Assign the 20% partition to test_set and the remaining 80% partition to train_set.

How many observations are in the training set? _________

How many observations are in the test set? ___________

What proportion of individuals in the training set survived? ________
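One possible way to work through the three questions is sketched below. It repeats the setup code so that it runs on its own, and it assumes the titanic, caret, and tidyverse packages are installed. The name test_index is an illustrative choice and does not come from the question. createDataPartition() samples within each level of Survived, so the two sets keep roughly the same survival rate as the full data.

```r
library(titanic)    # provides titanic_train
library(caret)      # provides createDataPartition()
library(tidyverse)  # provides %>%, mutate(), select()

options(digits = 3)

# Same cleaning step as in the setup code
titanic_clean <- titanic_train %>%
  mutate(Survived = factor(Survived),
         Embarked = factor(Embarked),
         Age = ifelse(is.na(Age), median(Age, na.rm = TRUE), Age),
         FamilySize = SibSp + Parch + 1) %>%
  select(Survived, Sex, Pclass, Age, Fare, SibSp, Parch, FamilySize, Embarked)

# Draw a 20% partition stratified on Survived.
# test_index is a hypothetical helper name.
set.seed(42)
test_index <- createDataPartition(titanic_clean$Survived,
                                  times = 1, p = 0.2, list = FALSE)

test_set  <- titanic_clean[test_index, ]   # the 20% partition
train_set <- titanic_clean[-test_index, ]  # the remaining 80%

nrow(train_set)                 # observations in the training set: 712
nrow(test_set)                  # observations in the test set: 179
mean(train_set$Survived == 1)   # proportion of survivors in training set: 0.383
```

Which rows land in each set depends on the seed and on the sampling method of the R version in use, but the sizes of the two sets should not, because the partition is drawn as a fixed fraction of each Survived class.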
