Question: Tree-based prediction on the abalone dataset

install.packages("AppliedPredictiveModeling")
install.packages("caret")
install.packages("rpart")
install.packages("rpart.plot")
install.packages("Metrics")

library(rpart)
library(rpart.plot)
library(caret)
library(AppliedPredictiveModeling)
library(Metrics)

data(abalone)
set.seed(10)
folds
Tree-based prediction

Name of the dataset: 'abalone'. Package: "AppliedPredictiveModeling".

Source: Data comes from an original (non-machine-learning) study: Warwick J Nash, Tracy L Sellers, Simon R Talbot, Andrew J Cawthorn and Wes B Ford (1994) "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait", Sea Fisheries Division, Technical Report No. 48 (ISSN 1034-3288). Check http://archive.ics.uci.edu/ml/datasets/Abalone for the details of this dataset.

Regression problem

Goal: Predict the age of abalone from physical measurements. The age of abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings through a microscope, a boring and time-consuming task. Other measurements, which are easier to obtain, are used to predict the age. Further information, such as weather patterns and location (hence food availability), may be required to solve the problem.

Questions

1. Read the data from the package, and check the names of the variables and the dimensions of the dataset. (1pt)
2. Split the dataset into two parts, where the training part contains 75% of the data and the test part contains the remaining 25%. Fit a regression tree to the training data with Rings as the response variable and all the other variables as predictors. (1) Plot the tree and show which variables are used to construct the tree. (2) Apply the tree to the test set and calculate the root mean squared error (RMSE), that is, the square root of the mean squared error. (4pt)
3. Check the plot of the cross-validation error against the size of the tree and select the "best" number of terminal nodes (based on your own judgement). Prune the tree, plot the pruned tree and recalculate the RMSE. (4pt)
4. Suppose you are interested in comparing the performance of linear regression and the regression tree. Please use the following partitioned 10 folds to calculate the cross-validation error for these two methods (the cross-validation error is defined as the average of the RMSE across the folds). Which method gives the smaller RMSE? (4pt)
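
One possible way to work through the four questions is sketched below in R. It is an illustration under stated assumptions, not the course's reference solution. The seed used for the 75/25 split, the use of the minimum cross-validated error to pick the pruning level, and the call to createFolds() are all assumptions. In particular, the definition of folds is cut off in the question, so the fold construction here only stands in for it and may not reproduce the intended partition. The helper rmse_of is a name introduced for this sketch.

```r
# Sketch only: split seed, pruning rule and fold construction are assumptions.
library(rpart)
library(rpart.plot)
library(caret)
library(AppliedPredictiveModeling)

# Hypothetical helper: root mean squared error
rmse_of <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Question 1: load the data, then inspect variable names and dimensions
data(abalone)
print(names(abalone))
print(dim(abalone))

# Question 2: 75/25 split, fit a regression tree, plot it, score on the test set
set.seed(10)  # assumed seed for the split
n <- nrow(abalone)
train_idx <- sample(n, size = floor(0.75 * n))
train <- abalone[train_idx, ]
test <- abalone[-train_idx, ]

tree_fit <- rpart(Rings ~ ., data = train, method = "anova")
rpart.plot(tree_fit)

# Variables that appear in at least one split (leaves are labelled "<leaf>")
used_vars <- setdiff(unique(as.character(tree_fit$frame$var)), "<leaf>")
print(used_vars)

rmse_full <- rmse_of(test$Rings, predict(tree_fit, newdata = test))
print(rmse_full)

# Question 3: cross-validation error against tree size, then prune
plotcp(tree_fit)
cp_table <- tree_fit$cptable  # terminal nodes = nsplit + 1 in each row
# Assumed rule: take the cp with the smallest cross-validated error;
# the question leaves this choice to the reader's judgement.
best_cp <- cp_table[which.min(cp_table[, "xerror"]), "CP"]
pruned_fit <- prune(tree_fit, cp = best_cp)
rpart.plot(pruned_fit)

rmse_pruned <- rmse_of(test$Rings, predict(pruned_fit, newdata = test))
print(rmse_pruned)

# Question 4: 10-fold cross-validation for linear regression and the tree
set.seed(10)
folds <- createFolds(abalone$Rings, k = 10)  # assumption: stands in for the truncated definition
cv_rmse <- sapply(folds, function(held_out) {
  fit_data <- abalone[-held_out, ]
  val_data <- abalone[held_out, ]
  lm_fit <- lm(Rings ~ ., data = fit_data)
  rt_fit <- rpart(Rings ~ ., data = fit_data, method = "anova")
  c(lm = rmse_of(val_data$Rings, predict(lm_fit, newdata = val_data)),
    tree = rmse_of(val_data$Rings, predict(rt_fit, newdata = val_data)))
})
print(rowMeans(cv_rmse))  # average RMSE per method across the 10 folds
```

The last line prints one average RMSE per method; the smaller of the two answers Question 4. With rpart's default complexity parameter the initial tree may already be small, in which case the minimum-error rule can leave it unchanged. Growing a larger tree first (for example with a smaller cp in rpart.control) and then pruning is a common alternative.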
