Questions 3a, 4d, 5a, and 5b are wrong. Please fix them. Thanks.

# CG Q0a # Read the data file bikeshare.csv into R and name the object bikes.

####### As usual, don't forget the strings = T argument.

bikes <- read.csv("bikeshare.csv", strings = T)

# CG Q1 # Run str() on the bikes data frame to see that

###### there are quantitative variables and to confirm that the

###### strings = T argument reads in character data as a factor.

str(bikes)

# Question 2 # a simple logistic regression

# CG Q2a # Using glm(), build a logistic regression model that has

####### high count (greater than 7,000) as the response and

####### the temperature variable as predictor.

####### Name your fitted model logisticFit.

####### Hint: To model the binary response, use cnt>7000 as the response.

logisticFit <- glm(cnt > 7000 ~ temp, family = "binomial", data = bikes)

# CG Q2b # Use the summary() function on logisticFit

###### to access the results of the regression.

summary(logisticFit)

# CG Q2c # Print the coefficient for temperature. Use coef(logisticFit)

####### followed by the name of that coefficient in quotes

####### inside square brackets as you did in your linear regression HW.

coef(logisticFit)["temp"]

# CG Q2d # Now print the multiplicative effect of temperature

####### on the odds of ride count being higher than 7,000 by

####### wrapping the line of code from Q2c in the exp() function.

exp(coef(logisticFit)["temp"])

# CG Q2e # Using the result from Q2d, determine how the odds of the

####### ride count being higher than 7,000 change with a 1 degree

####### increase in temperature.

####### A: up 1 degree, B: up 13 degrees, C: up by 1%, D: up 13%, E: down 12%

####### Use paste("letter") to indicate your answer. For example,

####### if you think A is the correct answer, type paste("A").

paste("D")
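
The multiple-choice answer in Q2e follows from converting the exponentiated coefficient into a percent change in the odds, (exp(b) - 1) * 100. A minimal sketch of that arithmetic, using a hypothetical coefficient `b` chosen only for illustration (it is not taken from the bikeshare fit):

```r
# Hypothetical log-odds coefficient, chosen only for illustration.
b <- log(1.25)

odds_ratio <- exp(b)                  # multiplicative effect on the odds
pct_change <- (odds_ratio - 1) * 100  # percent change in the odds per 1-unit increase

odds_ratio
pct_change
```

An odds ratio above 1 means the odds go up with each one-unit increase in the predictor; a value below 1 means they go down.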

# CG Q2f # Find the R-squared for the regression.

####### In your calculation, use logisticFit$deviance and

####### logisticFit$null.deviance and not the numbers printed

####### in the summary output for these.

1 - (logisticFit$deviance / logisticFit$null.deviance)
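
The R-squared in Q2f is a deviance-based (pseudo) R-squared: one minus the ratio of the fitted model's deviance to the deviance of an intercept-only model. The sketch below illustrates this on simulated data; the variables `x`, `y`, `fit`, and `null` are illustrative stand-ins, not part of the bikeshare assignment.

```r
# Simulated data standing in for bikeshare.csv (all names here are illustrative).
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.2 * x))

fit  <- glm(y ~ x, family = "binomial")
null <- glm(y ~ 1, family = "binomial")  # intercept-only model

# Deviance-based R-squared, computed from the stored components of the fit.
r2 <- 1 - fit$deviance / fit$null.deviance

# The null deviance stored on the fit should match the intercept-only model's deviance.
same_null <- isTRUE(all.equal(fit$null.deviance, null$deviance, tolerance = 1e-6))

r2
same_null
```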

# Question 3 # Predict probability of success

# CG Q3a # You will predict the probability of more than 7,000 rides

####### on a 25 degrees Celsius day.

####### Create a data frame defining this value and name it newdata.

newdata <- data.frame(temp = 25)

# CG Q3b # Use the predict() function to predict the probability of

####### demand for more than 7,000 rides using newdata from Q3a.

predict(logisticFit, newdata, type = "response")
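
A sketch of what `predict(..., type = "response")` does for a one-predictor logistic model, on simulated data (the data frame `d` and the column `high` are hypothetical stand-ins for the bikeshare data). It shows that the response-scale prediction is the inverse logit of the linear predictor, and that columns in the new data frame that the model does not use are ignored, which is why the new data only needs the predictors in the model formula.

```r
# Simulated stand-in for the bikeshare data (illustrative names and values).
set.seed(2)
d <- data.frame(temp = runif(300, 0, 35))
d$high <- rbinom(300, 1, plogis(-6 + 0.2 * d$temp))

fit <- glm(high ~ temp, family = "binomial", data = d)

# New data needs only the predictor(s) that appear in the model formula.
nd <- data.frame(temp = 25)
p_response <- predict(fit, nd, type = "response")

# Same probability, computed by hand from the linear predictor (log-odds).
p_manual <- plogis(predict(fit, nd, type = "link"))

# Extra columns that are not in the formula do not change the prediction.
nd_extra <- data.frame(temp = 25, hum = 60)
p_extra <- predict(fit, nd_extra, type = "response")

p_response
```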

# Question 4 # a logistic regression with multiple predictors

# CG Q4a # Build a logistic regression model that has high ride count

###### as the response modeled by the weather variables

###### weathersit, temp, hum, and windspeed.

###### Use an additive model (don't model interactions or anything fancy).

####### Name your fitted model logistic2.

logistic2 <- glm(cnt > 7000 ~ weathersit + temp + hum + windspeed, data = bikes, family = "binomial")

# CG Q4b # Use the summary() function on logistic2

###### to access the results of the regression.

summary(logistic2)

# CG Q4c # Print the coefficient for windspeed. Use coef(logistic2)

####### followed by the name of that coefficient in quotes

####### inside square brackets as you did in your linear regression HW.

coef(logistic2)["windspeed"]

# CG Q4d # Now print the multiplicative effect of temperature

####### on the odds of ride count being higher than 7,000 by

####### wrapping the line of code from Q2c in the exp() function.

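# The prompt mentions temperature and Q2c, but Q4c and Q4e are about windspeed,
# so wrap the Q4c line rather than hard-coding a number.
exp(coef(logistic2)["windspeed"])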

# CG Q4e # Using the result from Q4d, determine how the odds of the

####### ride count being higher than 7,000 change with a 1 mph

####### increase in windspeed (holding other predictors constant).

####### A: down 0.1 mph, B: up 0.9 mph, C: up by 1%, D: up 90%, E: down 10%

####### Use paste("letter") to indicate your answer. For example,

####### if you think A is the correct answer, type paste("A").

paste("E")

# CG Q4f # Find the R-squared for the regression.

####### In your calculation, use logistic2$deviance and

####### logistic2$null.deviance and not the numbers printed

####### in the summary output for these.

1 - (logistic2$deviance / logistic2$null.deviance)

# Question 5 # Predict probability of success

# CG Q5a # You will predict the probability of more than 7,000 rides

####### on a clear, 25 degrees Celsius day (77 degrees Fahrenheit)

####### with 50% humidity and windspeed 5.

####### Create a data frame defining these values and name it newdata2.

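# weathersit must match one of levels(bikes$weathersit) exactly, and hum must be on
# the same scale as bikes$hum. This assumes hum is stored as a percentage (0-100), so
# 50% humidity is 50; use 0.5 instead if summary(bikes$hum) shows values between 0 and 1.
newdata2 <- data.frame(weathersit = "clear", temp = 25, hum = 50, windspeed = 5)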

# CG Q5b # Use the predict() function to predict the probability of

####### demand for more than 7,000 rides using newdata2 from Q5a.

predict(logistic2, newdata = newdata2, type = "response")
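
When a model includes a factor predictor, the new data must use a level spelled exactly as in the training data, and numeric predictors must be on the same scale the model was fit on. A minimal sketch on simulated data follows; the level names "clear", "mist", and "rain" and the data frame `d` are assumptions for illustration and may not match the levels in bikeshare.csv.

```r
# Simulated stand-in data; the level names are hypothetical.
set.seed(3)
d <- data.frame(
  weathersit = factor(sample(c("clear", "mist", "rain"), 300, replace = TRUE)),
  temp = runif(300, 0, 35)
)
d$high <- rbinom(300, 1, plogis(-5 + 0.2 * d$temp - 1 * (d$weathersit == "rain")))

fit <- glm(high ~ weathersit + temp, family = "binomial", data = d)

# Inspect the exact spellings the model knows about.
levels(d$weathersit)

# A level that exists in the training data predicts normally.
ok <- data.frame(weathersit = "clear", temp = 25)
p_ok <- predict(fit, newdata = ok, type = "response")

# A level that does not exist (note the capital C) makes predict() stop with an error.
bad <- data.frame(weathersit = "Clear", temp = 25)
res_bad <- tryCatch(
  predict(fit, newdata = bad, type = "response"),
  error = function(e) "error"
)

p_ok
res_bad
```

Running `levels()` and `summary()` on the real data frame before building the new data is a quick way to confirm both the level spelling and the units of each numeric column.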
