Questions 3a, 4d, 5a, and 5b are wrong. Please fix them. Thanks.
# CG Q0a # Read the data file bikeshare.csv into R and name the object bikes.
####### As usual, don't forget the strings = T argument.
bikes <- read.csv("bikeshare.csv", strings = T)
# CG Q1 # Run str() on the bikes data frame to see that
###### there are quantitative variables and to confirm that the
###### strings = T argument reads in character data as a factor.
str(bikes)
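The `strings = T` shorthand works because R partially matches it to the `stringsAsFactors` argument of `read.table()`. A minimal sketch of that behavior, using a small made-up CSV string (the `city` and `rides` columns are hypothetical and not taken from bikeshare.csv):

```r
# Hypothetical three-row CSV, supplied inline so that no file is needed
csv_text <- "city,rides\nBoston,10\nDenver,12\nBoston,9"

# Shorthand used in the assignment: partially matched to stringsAsFactors
d_short <- read.csv(text = csv_text, strings = T)

# The same call with the argument name written out in full
d_full <- read.csv(text = csv_text, stringsAsFactors = TRUE)

str(d_full)                # city should be listed as a Factor, rides as int
identical(d_short, d_full) # both spellings should give the same data frame
```

Writing `stringsAsFactors = TRUE` in full is arguably easier to read, but both forms should produce the same result.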
# Question 2 # a simple logistic regression
# CG Q2a # Using glm(), build a logistic regression model that has
####### high count (greater than 7,000) as the response and
####### the temperature variable as predictor.
####### Name your fitted model logisticFit.
####### Hint: To model the binary response, use cnt>7000 as the response.
logisticFit <- glm(cnt > 7000 ~ temp, family = "binomial", data = bikes)
# CG Q2b # Use the summary() function on logisticFit
###### to access the results of the regression.
summary(logisticFit)
# CG Q2c # Print the coefficient for temperature. Use coef(logisticFit)
####### followed by the name of that coefficient in quotes
####### inside square brackets as you did in your linear regression HW.
coef(logisticFit)["temp"]
# CG Q2d # Now print the multiplicative effect of temperature
####### on the odds of ride count being higher than 7,000 by
####### wrapping the line of code from Q2c in the exp() function.
exp(coef(logisticFit)["temp"])
# CG Q2e # Using the result from Q2d, determine how the odds of the
####### ride count being higher than 7,000 change with a 1 degree
####### increase in temperature.
####### A: up 1 degree, B: up 13 degrees, C: up by 1%, D: up 13%, E: down 12%
####### Use paste("letter") to indicate your answer. For example,
####### if you think A is the correct answer, type paste("A").
paste("D")
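The step from a logistic coefficient to a statement like "up 13%" can be sketched as follows. The helper `pct_change_in_odds()` and the two coefficient values are hypothetical, chosen only for illustration; they are not estimates from bikeshare.csv.

```r
# Hypothetical helper: percentage change in the odds for a 1-unit increase
# in a predictor whose logistic regression coefficient is b
pct_change_in_odds <- function(b) {
  odds_ratio <- exp(b)    # multiplicative effect on the odds
  (odds_ratio - 1) * 100  # the same effect expressed as a percentage
}

# Illustrative coefficients only (not fitted values)
up <- pct_change_in_odds(0.12)     # positive coefficient: odds go up
down <- pct_change_in_odds(-0.105) # negative coefficient: odds go down

round(c(up = up, down = down), 1)
```

An odds ratio above 1 means the odds increase and one below 1 means they decrease. The change applies to the odds, not to the probability itself.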
# CG Q2f # Find the R-squared for the regression.
####### In your calculation, use logisticFit$deviance and
####### logisticFit$null.deviance and not the numbers printed
####### in the summary output for these.
1 - (logisticFit$deviance / logisticFit$null.deviance)
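To see what this deviance-based R-squared measures, here is a minimal sketch on simulated data. The data frame `sim` and the objects `fit` and `null_fit` are assumptions made for illustration; they are not part of the assignment.

```r
set.seed(1)

# Simulated data: one predictor with a real effect on a binary outcome
sim <- data.frame(x = rnorm(200))
sim$y <- rbinom(200, 1, plogis(-1 + 1.5 * sim$x))

fit <- glm(y ~ x, family = "binomial", data = sim)      # model of interest
null_fit <- glm(y ~ 1, family = "binomial", data = sim) # intercept-only model

# Proportion of the null deviance explained by the predictor
r2 <- 1 - fit$deviance / fit$null.deviance
r2

# The stored null deviance should match the deviance of the intercept-only fit
isTRUE(all.equal(fit$null.deviance, null_fit$deviance, tolerance = 1e-6))
```

Taking the deviances from the fitted object rather than from the printed summary avoids the rounding in the summary output.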
# Question 3 # Predict probability of success
# CG Q3a # You will predict the probability of more than 7,000 rides
####### on a 25 degrees Celsius day.
####### Create a data frame defining this value and name it newdata.
newdata <- data.frame(temp = 25)
# CG Q3b # Use the predict() function to predict the probability of
####### demand for more than 7,000 rides using newdata from Q3a.
predict(logisticFit, newdata, type = "response")
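What `predict(..., type = "response")` returns can be checked by hand: it is the inverse logit of the linear predictor. The sketch below uses simulated data, so `sim`, `high`, and the coefficients used to generate them are assumptions for illustration only. It also shows that columns in the new data that are not in the model do not change the prediction.

```r
set.seed(2)

# Simulated stand-in for the bike data: temperature and a binary outcome
sim <- data.frame(temp = runif(300, 0, 35))
sim$high <- rbinom(300, 1, plogis(-6 + 0.25 * sim$temp))

fit <- glm(high ~ temp, family = "binomial", data = sim)

# Prediction through predict()
new_day <- data.frame(temp = 25)
p_predict <- predict(fit, new_day, type = "response")

# The same prediction computed by hand from the coefficients
b <- coef(fit)
p_manual <- plogis(b["(Intercept)"] + b["temp"] * 25)

# Extra columns that are not in the model formula are ignored
p_extra <- predict(fit, data.frame(temp = 25, hum = 60), type = "response")

c(predict = unname(p_predict), manual = unname(p_manual), extra = unname(p_extra))
```

All three numbers should agree. A data frame with extra columns can still fail a check on the data frame itself even though the predicted probability is unchanged.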
# Question 4 # a logistic regression with multiple predictors
# CG Q4a # Build a logistic regression model that has high ride count
###### as the response modeled by the weather variables
###### weathersit, temp, hum, and windspeed.
###### Use an additive model (don't model interactions or anything fancy).
####### Name your fitted model logistic2.
logistic2 <- glm(cnt > 7000 ~ weathersit + temp + hum + windspeed, data = bikes, family = "binomial")
# CG Q4b # Use the summary() function on logistic2
###### to access the results of the regression.
summary(logistic2)
# CG Q4c # Print the coefficient for windspeed. Use coef(logistic2)
####### followed by the name of that coefficient in quotes
####### inside square brackets as you did in your linear regression HW.
coef(logistic2)["windspeed"]
# CG Q4d # Now print the multiplicative effect of windspeed
####### on the odds of ride count being higher than 7,000 by
####### wrapping the line of code from Q4c in the exp() function.
exp(coef(logistic2)["windspeed"])
# CG Q4e # Using the result from Q4d, determine how the odds of the
####### ride count being higher than 7,000 change with a 1 mph
####### increase in windspeed (holding other predictors constant).
####### A: down 0.1 mph, B: up 0.9 mph, C: up by 1%, D: up 90%, E: down 10%
####### Use paste("letter") to indicate your answer. For example,
####### if you think A is the correct answer, type paste("A").
paste("E")
# CG Q4f # Find the R-squared for the regression.
####### In your calculation, use logistic2$deviance and
####### logistic2$null.deviance and not the numbers printed
####### in the summary output for these.
1 - (logistic2$deviance / logistic2$null.deviance)
# Question 5 # Predict probability of success
# CG Q5a # You will predict the probability of more than 7,000 rides
####### on a clear, 25 degrees Celsius day (77 degrees Fahrenheit)
####### with 50% humidity and windspeed 5.
####### Create a data frame defining these values and name it newdata2.
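# Assumption: hum is stored in percent (0-100), like temp in degrees Celsius; confirm with summary(bikes$hum).
# The weathersit value must match one of levels(bikes$weathersit) exactly.
newdata2 <- data.frame(weathersit = "clear", temp = 25, hum = 50, windspeed = 5)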
# CG Q5b # Use the predict() function to predict the probability of
####### demand for more than 7,000 rides using newdata2 from Q5a.
predict(logistic2, newdata = newdata2, type = "response")
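With a factor predictor, the new data must use a level spelled exactly as in the training data, and every numeric value must be on the same scale as the column it stands for. The sketch below illustrates both checks on simulated data. The level names, value ranges, and coefficients are all assumptions made for illustration and may differ from bikeshare.csv.

```r
set.seed(3)
n <- 400

# Simulated stand-in for the weather variables (hypothetical levels and ranges)
sim <- data.frame(
  weathersit = factor(sample(c("clear", "cloudy", "rain"), n, replace = TRUE)),
  temp = runif(n, 0, 35),
  hum = runif(n, 20, 100),
  windspeed = runif(n, 0, 25)
)
lp <- -5 + 0.2 * sim$temp - 0.02 * sim$hum - 0.05 * sim$windspeed -
  1 * (sim$weathersit == "rain")
sim$high <- rbinom(n, 1, plogis(lp))

fit <- glm(high ~ weathersit + temp + hum + windspeed,
           family = "binomial", data = sim)

levels(sim$weathersit) # spellings that the new data must match exactly
range(sim$hum)         # shows whether humidity is on a 0-1 or a 0-100 scale

# New observation using a valid level and values on the training scales
new_day <- data.frame(weathersit = "clear", temp = 25, hum = 50, windspeed = 5)
p <- predict(fit, newdata = new_day, type = "response")
p
```

Running `levels()` and `range()` on the real `bikes` columns before building `newdata2` is a quick way to confirm both the level name and the humidity scale.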
