Question 1
1/1 point (graded)
The following code was used in the video to plot RSS with $\beta_0 = 25$.
beta1 = seq(0, 1, len=nrow(galton_heights))
results <- data.frame(beta1 = beta1,
                      rss = sapply(beta1, rss, beta0 = 25))
results %>% ggplot(aes(beta1, rss)) + geom_line() +
  geom_line(aes(beta1, rss), col=2)
In a model for sons' heights vs fathers' heights, what is the least squares estimate (LSE) for $\beta_1$ if we assume $\hat{\beta}_0$ is 36?
Hint: modify the code above to do your analysis.
0.65
0.5
0.2
12
Answer
Correct. You can tell from a plot of RSS vs $\beta_1$ that the RSS is minimized at approximately $\beta_1 = 0.5$.
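As a hedged sketch of the hint, the grid search can be rerun with the intercept fixed at 36. The definitions of `galton_heights` and `rss()` are not shown in this section, so the versions below are assumptions: first-born sons taken from `HistData::GaltonFamilies`, and a plain residual sum of squares helper.

```r
library(dplyr)
library(ggplot2)
library(HistData)
data("GaltonFamilies")

# Assumed definition: one father/son pair per family (first-born sons)
galton_heights <- GaltonFamilies %>%
  filter(childNum == 1 & gender == "male") %>%
  select(father, childHeight) %>%
  rename(son = childHeight)

# Assumed helper: RSS for the line son = beta0 + beta1 * father
rss <- function(beta1, beta0, data = galton_heights) {
  resid <- data$son - (beta0 + beta1 * data$father)
  sum(resid^2)
}

# Same grid as the quiz code, but with the intercept fixed at 36
beta1 <- seq(0, 1, len = nrow(galton_heights))
results <- data.frame(beta1 = beta1,
                      rss = sapply(beta1, rss, beta0 = 36))

# Plot RSS against beta1, then read off the grid value with the smallest RSS
results %>% ggplot(aes(beta1, rss)) + geom_line()
results$beta1[which.min(results$rss)]
```

The last line returns the grid point closest to the minimizer, so its precision is limited by the grid spacing.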
Question 2
1/1 point (graded)
The least squares estimates for the parameters $\beta_0, \beta_1, \dots, \beta_n$ ______ the residual sum of squares.
Select an option:
maximize
minimize
equal
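The relationship between `lm()` estimates and the RSS can be checked numerically. This is a minimal sketch on simulated data (the data and the helper `rss()` are illustrative assumptions, not part of the course material): moving either coefficient away from the fitted value raises the RSS.

```r
set.seed(1)
# Simulated data, for illustration only
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)

# RSS of the line y = b0 + b1 * x
rss <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

fit <- lm(y ~ x)
b <- coef(fit)

rss_at_lse <- rss(b[1], b[2])
# Nudge the intercept up and the slope down, one at a time
rss_shifted <- c(rss(b[1] + 0.1, b[2]), rss(b[1], b[2] - 0.1))
all(rss_shifted > rss_at_lse)  # TRUE
```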
Question 3
1 point possible (graded)
Load the Lahman library and filter the Teams data frame to the years 1961-2001. Run a linear model in R predicting the number of runs per game based on both the number of bases on balls per game and the number of home runs per game.
What is the coefficient for bases on balls?
0.39
1.56
1.74
0.027
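One way to set up this model is sketched below. The per-game column names (`R_per_game`, `BB_per_game`, `HR_per_game`) are assumptions introduced here for illustration; `R`, `BB`, `HR`, `G` and `yearID` are columns of `Lahman::Teams`.

```r
library(dplyr)
library(Lahman)

fit <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  # Convert season totals to per-game rates
  mutate(R_per_game = R / G,
         BB_per_game = BB / G,
         HR_per_game = HR / G) %>%
  lm(R_per_game ~ BB_per_game + HR_per_game, data = .)

# The BB_per_game entry is the coefficient for bases on balls
coef(fit)
```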
Question 4
1 point possible (graded)
We run a Monte Carlo simulation where we repeatedly take samples of N = 100 from the Galton heights data and compute the regression slope coefficients for each sample:
B <- 1000
N <- 100
lse <- replicate(B, {
  sample_n(galton_heights, N, replace = TRUE) %>%
    lm(son ~ father, data = .) %>% .$coef
})
lse <- data.frame(beta_0 = lse[1,], beta_1 = lse[2,])
What does the central limit theorem tell us about the variables beta_0 and beta_1?
Select ALL that apply.
They are approximately normally distributed.
The expected value of each is the true value of $\beta_0$ and $\beta_1$ (assuming the Galton heights data is a complete population).
The central limit theorem does not apply in this situation.
It allows us to test the hypothesis that $\beta_0 = 0$ and $\beta_1 = 0$.
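To explore these statements empirically, the simulation can be run and its output inspected. This is a sketch under assumptions: `galton_heights` is not defined in this section, so the first-born-sons definition below is assumed, and the seed is arbitrary.

```r
library(dplyr)
library(HistData)
data("GaltonFamilies")

# Assumed definition: one father/son pair per family (first-born sons)
galton_heights <- GaltonFamilies %>%
  filter(childNum == 1 & gender == "male") %>%
  select(father, childHeight) %>%
  rename(son = childHeight)

set.seed(1)  # arbitrary seed, only so this sketch is reproducible
B <- 1000
N <- 100
lse <- replicate(B, {
  sample_n(galton_heights, N, replace = TRUE) %>%
    lm(son ~ father, data = .) %>% .$coef
})
lse <- data.frame(beta_0 = lse[1, ], beta_1 = lse[2, ])

# Compare the simulation averages with the fit on the full data set
colMeans(lse)
coef(lm(son ~ father, data = galton_heights))

# Normal Q-Q plot of the simulated slopes
qqnorm(lse$beta_1)
qqline(lse$beta_1)
```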
Question 5
1/1 point (graded)
Which R code(s) below would properly plot the predictions and confidence intervals for our linear model of sons' heights?
NOTE: The function as.tibble() has been replaced by as_tibble() in a recent dplyr update.
Select ALL that apply.
galton_heights %>% ggplot(aes(father, son)) +
  geom_point() +
  geom_smooth()

galton_heights %>% ggplot(aes(father, son)) +
  geom_point() +
  geom_smooth(method = "lm")

model <- lm(son ~ father, data = galton_heights)
predictions <- predict(model, interval = c("confidence"), level = 0.95)
data <- as.tibble(predictions) %>% bind_cols(father = galton_heights$father)
ggplot(data, aes(x = father, y = fit)) +
  geom_line(color = "blue", size = 1) +
  geom_ribbon(aes(ymin=lwr, ymax=upr), alpha=0.2) +
  geom_point(data = galton_heights, aes(x = father, y = son))

model <- lm(son ~ father, data = galton_heights)
predictions <- predict(model)
data <- as.tibble(predictions) %>% bind_cols(father = galton_heights$father)
ggplot(data, aes(x = father, y = fit)) +
  geom_line(color = "blue", size = 1) +
  geom_point(data = galton_heights, aes(x = father, y = son))
Answer
Correct. This is one way to plot predictions and confidence intervals for a linear model of sons' heights vs. fathers' heights. This is one of two correct answers.
Correct. This code uses the predict command to generate predictions and 95% confidence intervals for the linear model of sons' heights vs. fathers' heights. This is one of two correct answers.
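Following the note about `as.tibble()`, the `predict()`-based option can be written with `as_tibble()`. The sketch below also swaps `size` for `linewidth` in `geom_line()`, which assumes ggplot2 3.4.0 or later, and it assumes a first-born-sons definition of `galton_heights`, since the section does not define it.

```r
library(dplyr)
library(ggplot2)
library(HistData)
data("GaltonFamilies")

# Assumed definition: one father/son pair per family (first-born sons)
galton_heights <- GaltonFamilies %>%
  filter(childNum == 1 & gender == "male") %>%
  select(father, childHeight) %>%
  rename(son = childHeight)

model <- lm(son ~ father, data = galton_heights)
# Matrix with columns fit, lwr, upr: fitted values and 95% confidence bounds
predictions <- predict(model, interval = "confidence", level = 0.95)
data <- as_tibble(predictions) %>%
  bind_cols(father = galton_heights$father)

ggplot(data, aes(x = father, y = fit)) +
  geom_line(color = "blue", linewidth = 1) +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2) +
  geom_point(data = galton_heights, aes(x = father, y = son))
```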
In Questions 7 and 8, you'll look again at female heights from GaltonFamilies.
Define female_heights, a set of mother and daughter heights sampled from GaltonFamilies, as follows:
set.seed(1989) #if you are using R 3.5 or earlier
set.seed(1989, sample.kind="Rounding") #if you are using R 3.6 or later
library(HistData)
data("GaltonFamilies")
options(digits = 3)    # report 3 significant digits

female_heights <- GaltonFamilies %>%
  filter(gender == "female") %>%
  group_by(family) %>%
  sample_n(1) %>%
  ungroup() %>%
  select(mother, childHeight) %>%
  rename(daughter = childHeight)
Question 7
0.0/2.0 points (graded)
Fit a linear regression model predicting the mothers' heights using daughters' heights.
What is the slope of the model?
What is the intercept of the model?
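A minimal sketch of this fit, assuming R 3.6 or later (hence the `sample.kind` argument) and that dplyr is available for the pipeline that builds `female_heights`:

```r
library(dplyr)
library(HistData)
data("GaltonFamilies")

set.seed(1989, sample.kind = "Rounding")  # R 3.6 or later
female_heights <- GaltonFamilies %>%
  filter(gender == "female") %>%
  group_by(family) %>%
  sample_n(1) %>%
  ungroup() %>%
  select(mother, childHeight) %>%
  rename(daughter = childHeight)

# Mothers' heights are the outcome, daughters' heights the predictor
fit <- lm(mother ~ daughter, data = female_heights)
coef(fit)  # "(Intercept)" is the intercept, "daughter" is the slope
```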
Question 8
0.0/2.0 points (graded)
Predict mothers' heights using the model.
What is the predicted height of the first mother in the dataset?
What is the actual height of the first mother in the dataset?
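Both values can be read from the fitted model and the data. The sketch below rebuilds `female_heights` so that it runs on its own, again assuming R 3.6 or later and dplyr.

```r
library(dplyr)
library(HistData)
data("GaltonFamilies")

set.seed(1989, sample.kind = "Rounding")  # R 3.6 or later
female_heights <- GaltonFamilies %>%
  filter(gender == "female") %>%
  group_by(family) %>%
  sample_n(1) %>%
  ungroup() %>%
  select(mother, childHeight) %>%
  rename(daughter = childHeight)

fit <- lm(mother ~ daughter, data = female_heights)

# Fitted value for the first row, next to the observed value
predict(fit)[1]
female_heights$mother[1]
```

Calling `predict()` without `newdata` returns fitted values for the rows used in the fit, in their original order.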
We have shown how BB and singles have similar predictive power for scoring runs. Another way to compare the usefulness of these baseball metrics is by assessing how stable they are across the years. Because we have to pick players based on their previous performances, we will prefer metrics that are more stable. In these exercises, we will compare the stability of singles and BBs.
Before we get started, we want to generate two tables: one for 2002 and another for the average of the 1999-2001 seasons. We want to define per plate appearance statistics, keeping only players with 100 or more plate appearances. Here is how we create the 2002 table:
library(Lahman)
bat_02 <- Batting %>% filter(yearID == 2002) %>%
  mutate(pa = AB + BB, singles = (H - X2B - X3B - HR)/pa, bb = BB/pa) %>%
  filter(pa >= 100) %>%
  select(playerID, singles, bb)
Question 9
0.0/2.0 points (graded)
Now compute a similar table but with rates computed over 1999-2001. Keep only rows from 1999-2001 where players have 100 or more plate appearances, calculate each player's single rate and BB rate per season, then calculate the average single rate (mean_singles) and average BB rate (mean_bb) per player over those three seasons.
How many players had a single rate mean_singles of greater than 0.2 per plate appearance over 1999-2001?
How many players had a BB rate mean_bb of greater than 0.2 per plate appearance over 1999-2001?
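The 1999-2001 table can be built by mirroring the 2002 code and then averaging per player. This is a sketch of one reading of the instructions: the name `bat_99_01` is an assumption, and the 100 plate appearance filter is applied to each `Batting` row, as in the 2002 table.

```r
library(dplyr)
library(Lahman)

bat_99_01 <- Batting %>%
  filter(yearID %in% 1999:2001) %>%
  mutate(pa = AB + BB,
         singles = (H - X2B - X3B - HR) / pa,
         bb = BB / pa) %>%
  filter(pa >= 100) %>%
  # Average the per-season rates for each player
  group_by(playerID) %>%
  summarize(mean_singles = mean(singles), mean_bb = mean(bb))

# Number of players above 0.2 for each rate
sum(bat_99_01$mean_singles > 0.2)
sum(bat_99_01$mean_bb > 0.2)
```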
Question 10
0.0/2.0 points (graded)
Use inner_join() to combine the bat_02 table with the table of 1999-2001 rate averages you created in the previous question.
What is the correlation between 2002 singles rates and 1999-2001 average singles rates?
What is the correlation between 2002 BB rates and 1999-2001 average BB rates?
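The join and the two correlations might look like the sketch below. It rebuilds both tables so that it runs on its own; `bat_99_01` and `dat` are assumed names, and the join key is `playerID`.

```r
library(dplyr)
library(Lahman)

# 2002 table, as defined earlier in the section
bat_02 <- Batting %>% filter(yearID == 2002) %>%
  mutate(pa = AB + BB, singles = (H - X2B - X3B - HR) / pa, bb = BB / pa) %>%
  filter(pa >= 100) %>%
  select(playerID, singles, bb)

# 1999-2001 averages (assumed construction, mirroring the 2002 table)
bat_99_01 <- Batting %>% filter(yearID %in% 1999:2001) %>%
  mutate(pa = AB + BB, singles = (H - X2B - X3B - HR) / pa, bb = BB / pa) %>%
  filter(pa >= 100) %>%
  group_by(playerID) %>%
  summarize(mean_singles = mean(singles), mean_bb = mean(bb))

# Keep only players present in both tables
dat <- inner_join(bat_02, bat_99_01, by = "playerID")

cor(dat$singles, dat$mean_singles)
cor(dat$bb, dat$mean_bb)
```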
Question 11
0.0/1.0 point (graded)
Make scatterplots of mean_singles versus singles and mean_bb versus bb.
Are either of these distributions bivariate normal?
Neither distribution is bivariate normal.
singles and mean_singles are bivariate normal, but bb and mean_bb are not.
bb and mean_bb are bivariate normal, but singles and mean_singles are not.
Both distributions are bivariate normal.
Question 12
0.0/2.0 points (graded)
Fit a linear model to predict 2002 singles given 1999-2001 mean_singles.
What is the coefficient of mean_singles, the slope of the fit?
Fit a linear model to predict 2002 bb given 1999-2001 mean_bb.
What is the coefficient of mean_bb, the slope of the fit?
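Both fits can be sketched with `lm()` on the joined table. As before, `bat_99_01`, `dat`, `fit_singles` and `fit_bb` are assumed names, and the tables are rebuilt here so the example runs on its own.

```r
library(dplyr)
library(Lahman)

# 2002 table, as defined earlier in the section
bat_02 <- Batting %>% filter(yearID == 2002) %>%
  mutate(pa = AB + BB, singles = (H - X2B - X3B - HR) / pa, bb = BB / pa) %>%
  filter(pa >= 100) %>%
  select(playerID, singles, bb)

# 1999-2001 averages (assumed construction, mirroring the 2002 table)
bat_99_01 <- Batting %>% filter(yearID %in% 1999:2001) %>%
  mutate(pa = AB + BB, singles = (H - X2B - X3B - HR) / pa, bb = BB / pa) %>%
  filter(pa >= 100) %>%
  group_by(playerID) %>%
  summarize(mean_singles = mean(singles), mean_bb = mean(bb))

dat <- inner_join(bat_02, bat_99_01, by = "playerID")

# The second coefficient of each fit is the slope
fit_singles <- lm(singles ~ mean_singles, data = dat)
fit_bb <- lm(bb ~ mean_bb, data = dat)
coef(fit_singles)["mean_singles"]
coef(fit_bb)["mean_bb"]
```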
