Question: Help with regression and validation in R
Here is my dataset on Google Drive: https://drive.google.com/open?id=1m3GbC0mj_8t2yGYZ8umg9_dyJYzesqhk
a. Check the outliers for the Longnose variable using a boxplot. Remove the outliers from the data set and name the result stream2. Hint: You can find the outliers of a variable x with boxplot.stats(x)$out, then remove them with x[!x %in% outliers].
b. Check the normality of the dependent variable using a histogram and a density plot (e.g. plot(density(target))).
c. Create a correlation matrix. Explain the correlation matrix.
d. Explore the data set by creating a scatterplot matrix using the pairs.panels() function. Explain the scatterplot matrix.
Do not split the dataset into training and test sets for questions e, f and g.
e. Regress the dependent variable Longnose on all other variables except the variable Stream (these are the names of the streams) and name the model model_stream. Get the summary results and explain them.
f. Check the model assumptions:
I. Check normality of errors: You can create a histogram of the residuals from a linear model. The distribution of these residuals should be approximately normal. You can get the residuals of a model with the residuals function.
II. Check the linearity between residuals and predicted values. The residuals should be unbiased and homoscedastic.
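Parts a through f can be sketched roughly as below. This is only an outline, not a verified solution: the Google Drive file is not read here, so the code builds a simulated stand-in data frame called stream. Only the column names Stream and Longnose come from the question; Acreage, Temp, and all the values are hypothetical placeholders, so swap in the real data (for example with read.csv) before interpreting anything. Note that the hint x[!x %in% outliers] filters a vector, so the sketch applies the same test to the rows of the data frame instead.

```r
# Simulated stand-in for the real dataset. Acreage and Temp are
# HYPOTHETICAL columns; only Stream and Longnose are named in the question.
set.seed(42)
n <- 60
stream <- data.frame(
  Stream  = paste0("S", seq_len(n)),
  Acreage = runif(n, 500, 30000),
  Temp    = rnorm(n, 19, 3)
)
stream$Longnose <- 5 + 0.002 * stream$Acreage + 1.5 * stream$Temp + rnorm(n, 0, 10)
stream$Longnose[1:2] <- c(400, 450)  # two planted extreme values for illustration

# a. Boxplot, then drop the rows whose Longnose value is flagged as an outlier
boxplot(stream$Longnose, main = "Longnose")
outliers <- boxplot.stats(stream$Longnose)$out
stream2 <- stream[!stream$Longnose %in% outliers, ]

# b. Normality of the dependent variable: histogram and density plot
hist(stream2$Longnose, main = "Histogram of Longnose")
plot(density(stream2$Longnose), main = "Density of Longnose")

# c. Correlation matrix of the numeric columns (Stream is a name, so it is excluded)
num_vars <- stream2[, sapply(stream2, is.numeric)]
cor_mat <- cor(num_vars)
print(round(cor_mat, 2))

# d. Scatterplot matrix; pairs.panels() comes from the psych package,
#    so it is only called when that package is installed
if (requireNamespace("psych", quietly = TRUE)) {
  psych::pairs.panels(num_vars)
}

# e. Regress Longnose on every other variable except Stream
model_stream <- lm(Longnose ~ . - Stream, data = stream2)
print(summary(model_stream))

# f.I. Normality of errors: histogram of the residuals
res <- residuals(model_stream)
hist(res, main = "Histogram of residuals")

# f.II. Residuals vs. predicted values: look for a patternless band around 0
plot(fitted(model_stream), res,
     xlab = "Predicted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

With the real data, the residuals-vs-predicted plot is the one to read for part f.II: a curve suggests non-linearity, and a funnel shape suggests the residuals are not homoscedastic.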
