Question: In this assignment, use the HBAT_200 data file to build a regression that predicts Recommend (willingness to recommend HBAT) from ratings of HBAT's TechSup, Advertising, Sales, Warranty, Billing, and Delivery, plus dummy-coded variables representing Industry and Distribution, using the mixed stepwise and best subsets approaches.

HBAT_200.xls data file (copy and pasteable to .R format):

https://docs.google.com/spreadsheets/d/1E0JuWnv7dsU2HxGCdFnp4LpoVVjVO0fM/edit?usp=sharing&ouid=107429429411018927337&rtpof=true&sd=true

Modify the following code (copy the text below into a new script file in RStudio, then edit it) to replace the variables with the ones named in the assignment description above.

#Multiple Regression

#Install packages if needed

install.packages("ppcor")

install.packages("caret")

install.packages("lm.beta")

install.packages("leaps")

install.packages("dplyr")

install.packages("car")

install.packages("carData")

install.packages("fastDummies")

install.packages("ggplot2")

install.packages("ggplotgui")

install.packages("MVN")

#load packages if needed

library(ppcor)

library(caret)

library(lm.beta)

library(leaps)

library(dplyr)

library(car)

library(carData)

library(fastDummies)

library(ggplot2)

library(ggplotgui)

library(MVN)

#Import data----

HBAT_200 <- data.frame(readRDS("Module 5/HBAT_200.RDS"))

HBAT_200 <- dummy_cols(HBAT_200)

names(HBAT_200)

#subset data for graphing

HBAT_200b <- subset.data.frame(

HBAT_200, select=c("Satisfaction", "Etail","CompResolve","Products","Pricing","NewProd","PriceFlex","Customer_btwn_1_5yr","Customer_Over5yr", "Size_Large"))

#ggplotgui for easy histograms and scatterplots

ggplot_shiny(HBAT_200b) #histograms of the DV and IVs and scatterplots of DV against each IV

#Check for multivariate outliers

mvnObj <- mvn(

data= HBAT_200b[ ,2:7], #THESE NUMBERS ARE THE COLUMNS FOR THE IVs IN HBAT_200b, ADJUST AS NEEDED

mvnTest="royston",

univariateTest = "Lillie",

multivariatePlot = "qq",

multivariateOutlierMethod = "adj",

showOutliers = TRUE,

showNewData = TRUE)

#from the graph produced by the above, are there outliers? If so, the next line will tell you which obs they are

mvnObj[["multivariateOutliers"]][["Observation"]]

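#put the row numbers between the parentheses in the following to delete the outliers and save to a new data frame--no quotes

outlier_rows <- c()

#only drop rows when at least one is listed; HBAT_200b[-c(),] with an empty c() would return zero rows

HBAT_200c <- if (length(outlier_rows) > 0) HBAT_200b[-outlier_rows, ] else HBAT_200b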

#MULTIPLE REGRESSION ALL IVs

regAll <- lm(Satisfaction ~ Etail + CompResolve + Products + Pricing + NewProd+ PriceFlex + Customer_btwn_1_5yr + Customer_Over5yr +Size_Large, HBAT_200c)

summary(regAll)

vif(regAll)

plot(regAll)

#MIXED STEP-WISE REGRESSION

#define intercept-only model

intercept_only <- lm(Satisfaction ~ 1, data = HBAT_200c)

#define model with all predictors

all <- lm(Satisfaction ~ Etail + CompResolve + Products + Pricing + NewProd+ PriceFlex + Customer_btwn_1_5yr + Customer_Over5yr + Size_Large, data=HBAT_200c)

#perform mixed stepwise regression

both <- step(intercept_only, direction='both', scope=formula(all), trace=0)

#view results of mixed stepwise regression

both$anova

#view final model

both$coefficients

#BEST SUBSETS REGRESSION

Best_HBATreg <- regsubsets(Satisfaction ~ Etail + CompResolve + Products + Pricing + NewProd+ PriceFlex + Customer_btwn_1_5yr + Customer_Over5yr + Size_Large,

data =HBAT_200c,

nbest = 1, # 1 best model for each number of predictors

nvmax = NULL, # NULL for no limit on number of variables

force.in = NULL, force.out = NULL,

method = "exhaustive")

summary(Best_HBATreg)

summary_best_subset <- summary(Best_HBATreg)

as.data.frame(summary_best_subset$outmat)

summary_best_subset$adjr2

summary_best_subset$cp

which.max(summary_best_subset$adjr2)

which.min(summary_best_subset$cp)

# choose the model balancing the desire for the largest adj r-sq, the lowest CP, and the fewest IVs (parsimony)

summary_best_subset$which[7,]

finalHBATreg <- lm(Satisfaction ~ Etail + Products + Pricing + NewProd + Customer_btwn_1_5yr+ Customer_Over5yr + Size_Large, HBAT_200c)

summary(finalHBATreg)

lm.beta(finalHBATreg)

vif(finalHBATreg)
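
Because the template repeats the model formula in several places, one way to make the variable swap less error-prone is to define the DV and IV names once and build the formula from them. The sketch below runs on simulated data, not HBAT_200. The dummy names `Industry_1` and `Distribution_1` are hypothetical; the real names are whatever `names(HBAT_200)` shows after `dummy_cols()`. For each categorical variable, one category should be left out of the IV list as the reference level, as the template does for its own dummies.

```r
# Simulated stand-in for HBAT_200 (assumption: real column names may differ)
set.seed(42)
n <- 50
toy <- data.frame(
  Recommend = rnorm(n), TechSup = rnorm(n), Advertising = rnorm(n),
  Sales = rnorm(n), Warranty = rnorm(n), Billing = rnorm(n), Delivery = rnorm(n),
  Industry_1 = rbinom(n, 1, 0.5),       # hypothetical dummy column name
  Distribution_1 = rbinom(n, 1, 0.5)    # hypothetical dummy column name
)

# Name the DV and IVs once, then build the formula from the names
dv  <- "Recommend"
ivs <- setdiff(names(toy), dv)
model_formula <- reformulate(ivs, response = dv)

# The same formula object can then be passed to lm(), step(), and regsubsets()
reg_all <- lm(model_formula, data = toy)
print(model_formula)
```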

Question 1

Looking at the scatterplots of the DV against each IV, do you have any concerns about violations of the assumptions of regression? If so, what, and what would you recommend to address them?

_____________________________________________________________________________________________________________________

Question 2

Examining the multivariate distribution of the IVs, how many outliers are there? Delete them before running your models (see the instructions in the code).
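
As a cross-check on the outlier list, squared Mahalanobis distances can be computed in base R. This is a minimal sketch on simulated data with one planted outlier. It uses a plain chi-square cutoff (the 99.9th percentile is an assumed threshold), which is a simpler technique than the adjusted-quantile method requested in the template's `mvn()` call, so the two may not flag exactly the same rows.

```r
# Simulated IVs standing in for the HBAT_200b columns
set.seed(1)
ivs <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
ivs[5, ] <- c(8, -8, 8)   # plant one obvious multivariate outlier in row 5

# Squared Mahalanobis distance of each row from the centroid
d2 <- mahalanobis(ivs, center = colMeans(ivs), cov = cov(ivs))

# Assumed cutoff: 99.9th percentile of chi-square with df = number of IVs
cutoff <- qchisq(0.999, df = ncol(ivs))
outlier_rows <- which(d2 > cutoff)
print(outlier_rows)

# Guard the deletion: indexing with an empty negative vector returns zero rows
ivs_clean <- if (length(outlier_rows) > 0) ivs[-outlier_rows, ] else ivs
```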

_____________________________________________________________________________________________________________________

Question 3

Run the model with all the IVs entered. Is the model significant overall? What does that mean? How much of the variability in Recommend is explained by the model? Which IVs are significant at the 95% level? Do the VIFs indicate issues with multicollinearity? Which independent variable has the most impact on Recommend? Any concerns looking at the graphs of residuals vs predicted values? Any large residuals or influential observations to be concerned with?
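
Most of the numbers this question asks for can be pulled directly from the fitted model object. The sketch below uses the built-in `mtcars` data as a stand-in (not HBAT_200) and computes VIFs and standardized coefficients by hand in base R. For a model with only numeric or 0/1 predictors, these should match what `vif()` and `lm.beta()` report.

```r
# Stand-in data and variables (assumption: mtcars replaces HBAT_200 here)
dv  <- "mpg"
ivs <- c("wt", "hp", "disp")
fit <- lm(reformulate(ivs, response = dv), data = mtcars)
s   <- summary(fit)

# Overall significance: p-value of the model F-test
f <- s$fstatistic
p_overall <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Share of variability in the DV explained by the model
r2     <- s$r.squared
adj_r2 <- s$adj.r.squared

# IVs significant at the 5% level (intercept excluded)
p_vals  <- s$coefficients[-1, "Pr(>|t|)"]
sig_ivs <- names(p_vals)[p_vals < 0.05]

# VIF by hand: regress each IV on the others, then 1 / (1 - R^2)
vif_manual <- sapply(ivs, function(v) {
  aux <- lm(reformulate(setdiff(ivs, v), response = v), data = mtcars)
  1 / (1 - summary(aux)$r.squared)
})

# Standardized coefficients: the largest absolute value marks the IV
# with the most impact per standard deviation
std_beta <- coef(fit)[ivs] * sapply(mtcars[ivs], sd) / sd(mtcars[[dv]])
```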

_____________________________________________________________________________________________________________________

Question 4

What is the final model using the mixed stepwise approach? How much of the variability in Recommend is explained by the model? Which IVs are kept in the model?
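
The sketch below shows how the kept IVs and the explained variability can be read off a `step()` result. It uses the built-in `mtcars` data as a stand-in for HBAT_200. Note that `step()` adds and drops terms by AIC, not by p-values, so a retained IV is not guaranteed to be significant at the 5% level.

```r
# Stand-in models on mtcars (assumption: not the HBAT_200 variables)
null_fit <- lm(mpg ~ 1, data = mtcars)
full_fit <- lm(mpg ~ wt + hp + disp + qsec + drat, data = mtcars)

# Mixed stepwise search: start from the intercept-only model and allow
# terms to enter or leave, up to the full model's formula
both <- step(null_fit, direction = "both", scope = formula(full_fit), trace = 0)

# IVs kept in the final model
kept <- attr(terms(both), "term.labels")
print(kept)

# Variability explained by the final model
final_r2     <- summary(both)$r.squared
final_adj_r2 <- summary(both)$adj.r.squared
```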

Question 5

What is the final model using the best subset approach? How much of the variability in Recommend is explained by the model? Which IVs are kept in the model?
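
To see what an exhaustive best subsets search does, the following base-R sketch fits every subset of a small IV list and records adjusted R-squared and Mallows' Cp for each. It is a hand-rolled illustration on the built-in `mtcars` data, not a replacement for the `regsubsets()` call in the template, and it is practical only for a small number of IVs because the number of subsets doubles with each added IV.

```r
# Stand-in data and variables (assumption: mtcars replaces HBAT_200 here)
dv  <- "mpg"
ivs <- c("wt", "hp", "disp", "qsec", "drat")
n   <- nrow(mtcars)

# Error variance estimate from the full model, needed for Mallows' Cp
full_fit <- lm(reformulate(ivs, response = dv), data = mtcars)
mse_full <- sum(resid(full_fit)^2) / df.residual(full_fit)

# Every non-empty subset of the IVs
subsets <- unlist(
  lapply(seq_along(ivs), function(k) combn(ivs, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit each subset and record its size, adjusted R-squared, and Cp
results <- do.call(rbind, lapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = dv), data = mtcars)
  p   <- length(vars) + 1   # number of coefficients, including the intercept
  data.frame(
    model = paste(vars, collapse = " + "),
    size  = length(vars),
    adjr2 = summary(fit)$adj.r.squared,
    cp    = sum(resid(fit)^2) / mse_full - n + 2 * p
  )
}))

# Best model for each number of IVs (the analogue of nbest = 1)
best_by_size <- do.call(rbind, lapply(split(results, results$size), function(d) {
  d[which.max(d$adjr2), ]
}))
print(best_by_size)
```

Within a fixed size, the subset with the highest adjusted R-squared is also the one with the lowest Cp, so the final choice comes down to comparing the rows of `best_by_size` across sizes.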

_____________________________________________________________________________________________________________________

Question 6

How do the two models, mixed stepwise and best subsets, compare to each other? If they differ, do you prefer one over the other, and why?
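
A side-by-side table of fit and parsimony measures can help with this comparison. The sketch below uses two arbitrary models on the built-in `mtcars` data as stand-ins for the stepwise and best subsets results. The partial F-test at the end applies only when one model's IVs are a subset of the other's.

```r
# Two stand-in candidate models (assumption: not the actual HBAT results)
m_small <- lm(mpg ~ wt + hp, data = mtcars)
m_large <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Fit and parsimony measures side by side
comparison <- data.frame(
  model  = c("small", "large"),
  n_ivs  = c(length(coef(m_small)) - 1, length(coef(m_large)) - 1),
  adj_r2 = c(summary(m_small)$adj.r.squared, summary(m_large)$adj.r.squared),
  aic    = c(AIC(m_small), AIC(m_large)),
  bic    = c(BIC(m_small), BIC(m_large))
)
print(comparison)

# Partial F-test: do the extra IVs in the larger model add explanatory power?
nested_test <- anova(m_small, m_large)
print(nested_test)
```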

_____________________________________________________________________________________________________________________
