Please use R Programming for this question.

Data for Question 1: breast_cancer_updated data: https://drive.google.com/file/d/1m-zZI1nGd5qBFd5FymhgzxLLhFshwXBW/view?usp=sharing

Question 1: For this problem, you will load the breast_cancer_updated.csv data and perform a straightforward training and evaluation of the Decision Tree algorithm.

a. As a preprocessing step, remove the IDNumber column and exclude rows with NA in the dataset.
b. Apply the Decision Tree algorithm (use rpart) to the data to predict breast cancer and report the accuracy using 10-fold cross validation.
c. Visualize the decision tree.
d. Generate the confusion matrix and comment on it. How does the accuracy from the confusion matrix here compare with the accuracy in b)? Are they the same or different?
e. Generate rules for the decision tree using IF-THEN statements.

Question 2: In this problem you will generate decision trees with a set of parameters. You will be using the storms (Storm tracking data) data, which is part of the dplyr library. This data is a subset of the NOAA Atlantic hurricane database best track data, https://www.nhc.noaa.gov/data/#hurdat. The data includes the positions and attributes of 198 tropical storms, measured every six hours during the lifetime of a storm. As a preprocessing step, view the data and make sure the target variable, which is a string, is converted to a factor.

a. Build a decision tree using the following hyperparameters: maxdepth = 2, minsplit = 5 and minbucket = 3. Make sure you don't pretune your tree using the cp parameter. Be careful to use the right method of training so that you are not automatically tuning the cp parameter, but are controlling the aforementioned parameters specifically. Use cross validation to report your accuracy score. These parameters will result in a relatively small tree.
b. To see how this performed with respect to classes, and whether that differs on train versus test, create a partition, train on the training set and create a confusion matrix for both train and test partitions. Compare the confusion matrices and report which classes it has a problem classifying. Do you think that both are performing similarly, and what does that imply about overfitting for the model?
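
One possible way to approach Question 2 is sketched below. It is only an illustration under stated assumptions, not a reference solution. The question does not name the target column or the predictors, so the sketch assumes `status` is the target and uses `wind`, `pressure`, `lat` and `long` as predictors. The 70/30 split and the seed are arbitrary choices too. Since `caret::train(method = "rpart")` tunes `cp` by default, the sketch calls `rpart()` directly inside a manual 10-fold loop, which leaves `cp` at its default and fixes only `maxdepth`, `minsplit` and `minbucket`.

```r
library(rpart)

# storms ships with dplyr; only the data set is needed here
storms_df <- as.data.frame(dplyr::storms)
storms_df$status <- factor(storms_df$status)  # target as a factor

# Assumption: target and predictors are an illustrative choice
model_formula <- status ~ wind + pressure + lat + long

# Fix the three hyperparameters; cp stays at its default and is not tuned
ctrl <- rpart.control(maxdepth = 2, minsplit = 5, minbucket = 3)

# Part a: manual 10-fold cross validation
set.seed(42)
k <- 10
fold_id <- sample(rep(seq_len(k), length.out = nrow(storms_df)))
cv_acc <- vapply(seq_len(k), function(i) {
  train_fold <- storms_df[fold_id != i, ]
  test_fold  <- storms_df[fold_id == i, ]
  fit <- rpart(model_formula, data = train_fold, method = "class", control = ctrl)
  pred <- predict(fit, test_fold, type = "class")
  mean(as.character(pred) == as.character(test_fold$status))
}, numeric(1))
cat("Mean 10-fold CV accuracy:", mean(cv_acc), "\n")

# Part b: single train/test partition (70/30 is an assumption)
train_idx <- sample(nrow(storms_df), size = floor(0.7 * nrow(storms_df)))
train_set <- storms_df[train_idx, ]
test_set  <- storms_df[-train_idx, ]
fit <- rpart(model_formula, data = train_set, method = "class", control = ctrl)

# Confusion matrix with predicted classes in rows and actual classes in columns
confusion <- function(model, data) {
  pred <- factor(as.character(predict(model, data, type = "class")),
                 levels = levels(storms_df$status))
  table(predicted = pred, actual = data$status)
}
conf_train <- confusion(fit, train_set)
conf_test  <- confusion(fit, test_set)
print(conf_train)
print(conf_test)
cat("Train accuracy:", sum(diag(conf_train)) / sum(conf_train), "\n")
cat("Test accuracy: ", sum(diag(conf_test)) / sum(conf_test), "\n")
```

A tree of depth 2 has at most four leaves, so any class beyond the four it can predict shows up as an all-zero row in both matrices. If train and test accuracy come out close, that would suggest the small tree is not overfitting, although it may still be underfitting some classes.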
