In this assignment, use the HBAT_200 data file to build a regression that predicts Recommend (willingness to recommend HBAT) from ratings of HBAT's TechSup, Advertising, Sales, Warranty, Billing, and Delivery, plus dummy-coded variables representing Industry and Distribution. Fit the model using both the mixed stepwise and the best subsets approaches.

HBAT_200.xls data file (can be copied and pasted into R):

https://docs.google.com/spreadsheets/d/1E0JuWnv7dsU2HxGCdFnp4LpoVVjVO0fM/edit?usp=sharing&ouid=107429429411018927337&rtpof=true&sd=true

Modify the following code (copy the text below into a new script file in RStudio, then edit it) to replace its variables with the variables named in the assignment description above.
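One way to make the variable swap less error-prone is to keep the predictor names in a single vector and build the formula from it, so the same formula can be reused wherever the script spells one out. This is only a sketch: it assumes the rating columns are named exactly as in the assignment text, and the dummy column names (Industry_1, Distribution_1) are hypothetical placeholders to be replaced with whatever names(HBAT_200) reports after dummy_cols() has run.

```r
# Rating IVs as named in the assignment text (assumed to match the column names)
rating_ivs <- c("TechSup", "Advertising", "Sales", "Warranty", "Billing", "Delivery")

# Hypothetical dummy names; check names(HBAT_200) for the real ones
dummy_ivs <- c("Industry_1", "Distribution_1")

# reformulate() builds the formula Recommend ~ TechSup + ... + Distribution_1
rec_formula <- reformulate(c(rating_ivs, dummy_ivs), response = "Recommend")
print(rec_formula)

# The formula object can then be passed to the modelling calls,
# e.g. lm(rec_formula, data = HBAT_200c)
```

Keeping the names in one place also makes it harder for the lm(), step(), and regsubsets() calls to drift apart when a variable is added or dropped.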

#Multiple Regression

#Install packages if needed (only required once per machine)
install.packages(c("ppcor", "caret", "lm.beta", "leaps", "dplyr", "car",
                   "carData", "fastDummies", "ggplot2", "ggplotgui", "MVN"))

#Load packages
library(ppcor)
library(caret)
library(lm.beta)
library(leaps)
library(dplyr)
library(car)
library(carData)
library(fastDummies)
library(ggplot2)
library(ggplotgui)
library(MVN)

#

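#Import data----
HBAT_200 <- data.frame(readRDS("Module 5/HBAT_200.RDS"))
#dummy-code every factor/character column (new columns are named <variable>_<level>)
HBAT_200 <- dummy_cols(HBAT_200)
#list the column names, including the new dummies, to use in the code below
names(HBAT_200)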

#subset data for graphing
HBAT_200b <- subset.data.frame(
  HBAT_200,
  select = c("Satisfaction", "Etail", "CompResolve", "Products", "Pricing",
             "NewProd", "PriceFlex", "Customer_btwn_1_5yr", "Customer_Over5yr",
             "Size_Large"))

#

#ggplotgui for easy histograms and scatterplots

ggplot_shiny(HBAT_200b) #histograms of the DV and IVs and scatterplots of DV against each IV

#Check for multivariate outliers
mvnObj <- mvn(
  data = HBAT_200b[ , 2:7], #THESE NUMBERS ARE THE COLUMNS FOR THE IVs IN HBAT_200b, ADJUST AS NEEDED
  mvnTest = "royston",
  univariateTest = "Lillie",
  multivariatePlot = "qq",
  multivariateOutlierMethod = "adj",
  showOutliers = TRUE,
  showNewData = TRUE)

#from the graph produced by the above, are there outliers? If so, the next line will tell you which obs they are

mvnObj[["multivariateOutliers"]][["Observation"]]

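#put the row numbers between the parentheses in the following to delete the outliers and save to a new data frame--no quotes
#(if there are no outliers, use HBAT_200c <- HBAT_200b instead: with empty parentheses, HBAT_200b[-c(),] returns zero rows)
HBAT_200c <- HBAT_200b[-c(),]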

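#MULTIPLE REGRESSION ALL IVs
regAll <- lm(Satisfaction ~ Etail + CompResolve + Products + Pricing + NewProd + PriceFlex + Customer_btwn_1_5yr + Customer_Over5yr + Size_Large, data = HBAT_200c)
summary(regAll) #overall F-test, R-squared, and coefficient tests
vif(regAll) #variance inflation factors, to check for multicollinearity
plot(regAll) #diagnostic plots: residuals vs fitted, Q-Q, scale-location, residuals vs leverage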

#MIXED STEP-WISE REGRESSION

#define intercept-only model

intercept_only <- lm(Satisfaction ~ 1, data = HBAT_200c)

#define model with all predictors

all <- lm(Satisfaction ~ Etail + CompResolve + Products + Pricing + NewProd+ PriceFlex + Customer_btwn_1_5yr + Customer_Over5yr + Size_Large, data=HBAT_200c)

#perform mixed stepwise regression

both <- step(intercept_only, direction='both', scope=formula(all), trace=0)

#view results of mixed stepwise regression

both$anova

#view final model

both$coefficients

#BEST SUBSETS REGRESSION
Best_HBATreg <- regsubsets(
  Satisfaction ~ Etail + CompResolve + Products + Pricing + NewProd + PriceFlex + Customer_btwn_1_5yr + Customer_Over5yr + Size_Large,
  data = HBAT_200c,
  nbest = 1, # 1 best model for each number of predictors
  nvmax = NULL, # NULL for no limit on number of variables
  force.in = NULL, force.out = NULL,
  method = "exhaustive")

summary(Best_HBATreg)

summary_best_subset <- summary(Best_HBATreg)

as.data.frame(summary_best_subset$outmat)

summary_best_subset$adjr2

summary_best_subset$cp

which.max(summary_best_subset$adjr2)

which.min(summary_best_subset$cp)

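# choose the model balancing the desire for the largest adj r-sq, the lowest Cp, and the fewest IVs (parsimony)
# the 7 below is the number of IVs in the model chosen for this example; change it to match your choice
summary_best_subset$which[7,]
# refit the chosen model with lm(), listing the IVs marked TRUE above
finalHBATreg <- lm(Satisfaction ~ Etail + Products + Pricing + NewProd + Customer_btwn_1_5yr + Customer_Over5yr + Size_Large, data = HBAT_200c)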

summary(finalHBATreg)

lm.beta(finalHBATreg)

vif(finalHBATreg)

Question 1

Looking at the scatterplots of the DV against each IV, do you have any concerns about violations of the assumptions of regression? If so, what, and what would you recommend to address them?

_____________________________________________________________________________________________________________________

Question 2

Examining the multivariate distribution of the IVs, how many outliers are there? Delete them before running your models (see the instructions in the code).
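A note on the deletion step: in R, negative indexing with an empty vector selects no rows rather than all of them, so a data frame built that way ends up empty when there are no outliers to remove. The sketch below shows the behaviour on a toy data frame and one possible guard; drop_rows is a hypothetical helper name, not part of the assignment code.

```r
# Toy data frame standing in for HBAT_200b
toy <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 8.0, 9.8))

# Hypothetical helper: drop the given rows, or keep everything if none are given
drop_rows <- function(df, rows) {
  if (length(rows) == 0) df else df[-rows, , drop = FALSE]
}

n_naive <- nrow(toy[-c(), ])              # empty negative index: no rows survive
n_guard <- nrow(drop_rows(toy, c()))      # guard keeps every row
n_drop  <- nrow(drop_rows(toy, c(2, 5)))  # rows 2 and 5 removed

print(c(naive = n_naive, guarded = n_guard, dropped = n_drop))
```

Checking nrow() of the new data frame right after the deletion is a quick way to confirm that only the intended observations were removed.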

_____________________________________________________________________________________________________________________

Question 3

Run the model with all the IVs entered. Is the model significant overall? What does that mean? How much of the variability in Recommend is explained by the model? Which IVs are significant at the 95% confidence level? Do the VIFs indicate issues with multicollinearity? Which independent variable has the most impact on Recommend? Any concerns looking at the graphs of residuals vs predicted values? Any large residuals or influential observations to be concerned with?
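The quantities this question asks about can all be pulled out of a fitted lm object. The sketch below uses R's built-in mtcars data as a stand-in for HBAT_200c, so the variable names are illustrative only. It computes the VIFs and standardized coefficients by hand in base R, as an approximation of what vif() and lm.beta() report for models with numeric predictors.

```r
# Built-in mtcars stands in for the HBAT data
ivs <- c("wt", "hp", "disp")
fit <- lm(reformulate(ivs, response = "mpg"), data = mtcars)
s <- summary(fit)

# Overall F-test: p-value from the F statistic and its degrees of freedom
f <- s$fstatistic
overall_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Share of variability explained
r2 <- s$r.squared
adj_r2 <- s$adj.r.squared

# IVs significant at the 5% level (95% confidence)
coefs <- s$coefficients
sig_ivs <- setdiff(rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05], "(Intercept)")

# VIF by hand: 1 / (1 - R^2) from regressing each IV on the other IVs
vifs <- sapply(ivs, function(v) {
  aux <- lm(reformulate(setdiff(ivs, v), response = v), data = mtcars)
  1 / (1 - summary(aux)$r.squared)
})

# Standardized coefficients: slope * sd(IV) / sd(DV); largest magnitude = most impact
std_beta <- coef(fit)[ivs] * sapply(mtcars[ivs], sd) / sd(mtcars$mpg)

print(list(overall_p = overall_p, r2 = r2, adj_r2 = adj_r2,
           sig_ivs = sig_ivs, vifs = vifs, std_beta = std_beta))
```

A common rule of thumb treats VIFs above about 5 or 10 as a sign of problematic multicollinearity, though the cutoff used in a given course may differ.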

_____________________________________________________________________________________________________________________

Question 4

What is the final model using the mixed stepwise approach? How much of the variability in Recommend is explained by the model? Which IVs are kept in the model?
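To read the answers off a stepwise fit, it helps to know that step() returns an ordinary lm object for the final model, and that by default it adds and drops terms by AIC rather than by p-values. The sketch below again uses mtcars as a stand-in, so the variables are illustrative only.

```r
# Intercept-only and full models on the built-in mtcars data
null_fit <- lm(mpg ~ 1, data = mtcars)
full_fit <- lm(mpg ~ wt + hp + disp + qsec + drat, data = mtcars)

# Start from the intercept-only model; terms may be added or dropped at each step
step_fit <- step(null_fit, direction = "both", scope = formula(full_fit), trace = 0)

# IVs kept in the final model, and the variability it explains
kept_ivs <- attr(terms(step_fit), "term.labels")
step_r2 <- summary(step_fit)$r.squared
step_adj_r2 <- summary(step_fit)$adj.r.squared

print(kept_ivs)
print(c(r2 = step_r2, adj_r2 = step_adj_r2))
```

Calling summary() on the object returned by step() gives the same coefficient table and R-squared output as for any other lm fit.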

Question 5

What is the final model using the best subset approach? How much of the variability in Recommend is explained by the model? Which IVs are kept in the model?
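To see what the best subsets output is summarizing, the search can be sketched by brute force in base R: fit every subset of the candidate IVs, then compare adjusted R-squared and Mallows' Cp. This is an illustration on mtcars, not the algorithm regsubsets() uses internally (leaps relies on a more efficient search), and it is only practical for a small number of IVs.

```r
ivs <- c("wt", "hp", "disp", "qsec", "drat")
n <- nrow(mtcars)

# Every non-empty subset of the candidate IVs
subsets <- unlist(lapply(seq_along(ivs), function(k) combn(ivs, k, simplify = FALSE)),
                  recursive = FALSE)

# Error variance of the full model, needed for Mallows' Cp
full_fit <- lm(reformulate(ivs, response = "mpg"), data = mtcars)
sigma2_full <- summary(full_fit)$sigma^2

results <- do.call(rbind, lapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
  p <- length(vars) + 1  # number of parameters, including the intercept
  data.frame(
    model = paste(vars, collapse = " + "),
    size = length(vars),
    adj_r2 = summary(fit)$adj.r.squared,
    cp = sum(resid(fit)^2) / sigma2_full - n + 2 * p
  )
}))

# Best model of each size (the analogue of nbest = 1)
best_by_size <- do.call(rbind, lapply(split(results, results$size), function(d) {
  d[which.max(d$adj_r2), ]
}))
print(best_by_size)
```

Within one model size, the subset with the highest adjusted R-squared is also the one with the smallest residual sum of squares, so the choice between sizes is where adjusted R-squared, Cp, and parsimony have to be weighed against each other.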

_____________________________________________________________________________________________________________________

Question 6

How do the two models, mixed stepwise and best subsets, compare to each other? If they differ, do you prefer one over the other, and why?

_____________________________________________________________________________________________________________________
