Question:

You will use the model created in HW-B, part (d) to answer the questions in (a) and (b) below.

(a) [0.6 pts.] Print a summary of the model and, based on that, answer the following questions:
(i) Which of the parameters are statistically significant at an α of at least 0.05?
(ii) Which of the parameters is the most important parameter for prediction?
(iii) Which of the parameters is the least important parameter for prediction?

(b) [0.6 pts.] Use three digits of precision for this question (Hint: see the round() function). Count how many of the fitted values matched the PTS in the test dataset at a 95% confidence level by creating prediction intervals. To be considered a match, the response value of each observation in the test dataset should be in the prediction interval created by predict(). Note that we have the ground truth in our test dataset, so we can count the matches as one measure of accuracy.

Coerce the object returned from the predict() function to be a data frame. To this data frame, add the response variable (PTS) from the held-out test dataset so that the data set has four columns: fit, lwr, upr, and PTS. Your data frame will look like the following (this is just an example; the contents of the data frame below may be different from yours; digits shown as ? are illegible in the source).

     fit      lwr      upr      PTS
5    7.843    2.6?2    9.891    7
15   1?.891   1?.234   15.991   9

Print out the data frame. Write R code that will count whether the response (PTS) falls within the lwr and upr limits of the prediction interval. For the output above, the first response (7) falls within the prediction interval, while the second one (9) does not. Your code should do this computation and output an answer as follows:

Number of predictions that are in the prediction interval: XX

(c) [0.6 pts.] Consider the data frame you created in (b). Using the "fit" (prediction) and "PTS" columns, write R code to find out the RSS and RSE of your predictions. Recall that RSS and RSE are defined as shown below, where $y_i$ is the observed response, $\hat{y}_i$ is the fitted value, $n$ is the number of observations, and $p$ is the number of predictors:

$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

$RSE = \sqrt{\frac{RSS}{n - p - 1}}$

(d) [0.6 pts.] Plot the residuals of your predictions. What can you say about the distribution of the residuals?

(e) [0.6 pts.] Plot a histogram of the residuals of your prediction. Does the histogram follow a Gaussian distribution?

You are provided the Titanic dataset (titanic-new.csv). This dataset contains 1,309 observations across 15 dimensions. The response variable is "survived" (0 = did not survive, 1 = survived). The rest of the dimensions are as follows:

pclass            Passenger class (1 = 1st class, 2 = 2nd class, 3 = 3rd class)
name              Name of passenger
sex               Gender of passenger (M, F)
age               Age of passenger
sibsp             Number of siblings or spouses aboard
parch             Number of parents/children aboard
ticket            Ticket number
fare              Passenger fare
cabin             Cabin number
embarked          Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
boat              Lifeboat number (if survived)
body              Body number (if did not survive and body was recovered)
home.dest         Passenger origin and destination
has_cabin_number  Whether passenger had a cabin (1 = Yes, 0 = No)

Divide the dataset into 80% / 20%: 80% for training and 20% for testing. Use set.seed(1122) before dividing the dataset. You will be creating a decision tree to predict who survives and who perishes.

(a) [0.5 pts.] From the table above, list out the predictors you will use for modeling.

(b) [0.5 pts.] In the entire dataset, how many people survived and how many perished? (Hint: use table().)

(c) [0.5 pts.] Is the dataset class-balanced (i.e., has an equal class distribution in the dataset)? If the dataset is class-balanced, state so; if it is not, what problems do you foresee when you make predictions?

(d) [0.5 pts.] Create an rpart decision tree model to predict "survived" using the predictors chosen in (a).

(e) [0.5 pts.] What are the top 3 important variables from your model?
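The Titanic steps above (class counts, the 80/20 split, fitting an rpart tree, and ranking variables) could be sketched roughly as follows. Because titanic-new.csv is not included here, the sketch simulates a small stand-in data frame; the simulated values, the row count, and the choice of predictors (pclass, sex, age, sibsp, fare) are assumptions for illustration only, not the expected answer to part (a).

```r
library(rpart)  # recommended package that ships with standard R installations

set.seed(1122)  # seed named in the question; set before any random draw

# Hypothetical stand-in for titanic-new.csv (simulated, NOT the real data).
n <- 500
df <- data.frame(
  pclass = factor(sample(1:3, n, replace = TRUE)),
  sex    = factor(sample(c("M", "F"), n, replace = TRUE)),
  age    = round(runif(n, 1, 80)),
  sibsp  = sample(0:3, n, replace = TRUE),
  fare   = round(runif(n, 5, 200), 2)
)
# Simulated response: women and first-class passengers survive more often.
prob <- plogis(-1 + 2 * (df$sex == "F") + 1 * (df$pclass == "1"))
df$survived <- factor(rbinom(n, 1, prob))

# Parts (b) and (c): class counts and class proportions.
counts <- table(df$survived)
print(counts)
print(round(prop.table(counts), 3))

# 80% / 20% split into training and test sets.
train_idx <- sample(nrow(df), size = floor(0.8 * nrow(df)))
train <- df[train_idx, ]
test  <- df[-train_idx, ]

# Part (d): classification tree on the assumed predictors.
model <- rpart(survived ~ pclass + sex + age + sibsp + fare,
               data = train, method = "class")

# Part (e): variable importance as stored by rpart, largest first.
importance <- sort(model$variable.importance, decreasing = TRUE)
top3 <- head(names(importance), 3)
print(top3)
```

With the real file, the simulated data frame would be replaced by a read.csv() call on titanic-new.csv, and the formula would list whichever predictors were chosen in part (a). Identifier-like columns such as name or ticket are usually poor candidates because they are nearly unique per passenger.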
Plot the decision tree using the following code:

> rpart.plot(model, extra=11, fallen.leaves=T, type=4, main="Titanic Survival Model")
> print(model)

Using the plot and the output of "print(model)"
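Returning to the linear regression question, parts (b) and (c) can be illustrated with a minimal sketch. The HW-B model and its data are not shown in this question, so the sketch below fits a stand-in lm() on simulated data; the predictor names MIN and FGA, the coefficients, and the train/test sizes are hypothetical. It also assumes that n in the RSE formula is the number of test observations and p is the number of predictors in the model.

```r
set.seed(1)

# Hypothetical stand-in data (the HW-B dataset is not available here).
n <- 120
d <- data.frame(MIN = runif(n, 5, 40), FGA = runif(n, 1, 20))
d$PTS <- 1 + 0.2 * d$MIN + 0.8 * d$FGA + rnorm(n, sd = 2)

train <- d[1:100, ]
test  <- d[101:120, ]

# Stand-in for the HW-B part (d) model.
model <- lm(PTS ~ MIN + FGA, data = train)

# Part (b): 95% prediction intervals; predict() returns a matrix with
# columns fit, lwr, upr, which is coerced to a data frame.
pred <- as.data.frame(predict(model, newdata = test,
                              interval = "prediction", level = 0.95))
pred$PTS <- test$PTS          # add the held-out response as a fourth column
print(round(pred, 3))         # three digits of precision

# Count the responses that fall inside [lwr, upr].
matches <- sum(pred$PTS >= pred$lwr & pred$PTS <= pred$upr)
cat("Number of predictions that are in the prediction interval:", matches, "\n")

# Part (c): RSS and RSE from the fit and PTS columns.
rss    <- sum((pred$PTS - pred$fit)^2)
n_test <- nrow(pred)                 # assumption: n = number of test rows
p      <- length(coef(model)) - 1    # predictors, excluding the intercept
rse    <- sqrt(rss / (n_test - p - 1))
cat("RSS:", round(rss, 3), " RSE:", round(rse, 3), "\n")

# Residuals used in parts (d) and (e).
resid_test <- pred$PTS - pred$fit
```

For parts (d) and (e), plot(resid_test) and hist(resid_test) would draw the residual scatter and histogram from the same vector.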

