Help with regression and validation in R

Here is my dataset on Google Drive: https://drive.google.com/open?id=1m3GbC0mj_8t2yGYZ8umg9_dyJYzesqhk

a. Check the outliers for the Longnose variable using a boxplot. Remove the outliers from the data set and name the result stream2.
   Hint: You can find the outliers of a variable x with boxplot.stats(x)$out, then remove them with x[!x %in% outliers].

b. Check the normality of the dependent variable using a histogram and a density plot (e.g. plot(density(target))).

c. Create a correlation matrix. Explain the correlation matrix.

d. Explore the data set by creating a scatterplot matrix using the pairs.panels() function. Explain the scatterplot matrix.

Do not split the dataset into training and test sets for questions e, f and g.

e. Regress the dependent variable Longnose on all other variables except Stream (the names of the streams) and name the model model_stream. Get the summary results and explain them.

f. Check the model assumptions:
   I. Check normality of errors: create a histogram of the residuals from the linear model. The distribution of these residuals should be approximately normal. You can get the residuals of a model with the residuals() function.
   II. Check the linearity between residuals and predicted values. The residuals should be unbiased and homoscedastic.
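For orientation, parts a through f could be approached roughly as in the sketch below. This is not a worked answer for the linked file: the dataset is not reproduced here, so the code simulates a small stand-in data frame named stream. Only the columns Stream and Longnose come from the question; Acreage, Maxdepth and Temp, and all simulated values, are hypothetical placeholders. Note that the hint's x[!x %in% outliers] filters a vector, so the sketch applies the same condition to the rows of the data frame instead.

```r
# Stand-in data (assumption): the real file is not available here, so a small
# synthetic data frame is simulated. Only Stream and Longnose are named in the
# question; the other column names and all values are hypothetical.
set.seed(1)
n <- 60
stream <- data.frame(
  Stream   = paste0("S", seq_len(n)),
  Acreage  = runif(n, 500, 20000),
  Maxdepth = runif(n, 20, 150),
  Temp     = runif(n, 10, 25)
)
stream$Longnose <- 5 + 0.002 * stream$Acreage + 0.3 * stream$Maxdepth +
  rnorm(n, sd = 10)
stream$Longnose[1:2] <- c(400, 450)  # two artificial extreme values

# a. Boxplot, then drop the rows whose Longnose value is flagged as an outlier
boxplot(stream$Longnose, main = "Longnose")
outliers <- boxplot.stats(stream$Longnose)$out
stream2  <- stream[!stream$Longnose %in% outliers, ]

# b. Normality of the dependent variable: histogram and density plot
hist(stream2$Longnose, main = "Histogram of Longnose")
plot(density(stream2$Longnose), main = "Density of Longnose")

# c. Correlation matrix of the numeric columns (Stream is a label, so it is excluded)
num_vars <- stream2[, names(stream2) != "Stream"]
cor_mat  <- cor(num_vars)
print(round(cor_mat, 2))

# d. Scatterplot matrix; pairs.panels() is in the psych package,
#    with base pairs() as a fallback if psych is not installed
if (requireNamespace("psych", quietly = TRUE)) {
  psych::pairs.panels(num_vars)
} else {
  pairs(num_vars)
}

# e. Regress Longnose on every other variable except Stream
model_stream <- lm(Longnose ~ . - Stream, data = stream2)
print(summary(model_stream))

# f-I. Histogram of residuals (should look roughly normal)
res <- residuals(model_stream)
hist(res, main = "Residuals of model_stream")

# f-II. Residuals vs. predicted values: look for a patternless band around zero
plot(fitted(model_stream), res, xlab = "Predicted", ylab = "Residuals")
abline(h = 0, lty = 2)
```

With the real data, the simulation block would be replaced by reading the downloaded file (for example with read.csv(), assuming it is a CSV), and the interpretation of the correlation matrix, the summary output and the residual plots would depend on the actual values.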
