
Question 1

1/1 point (graded)

The following code was used in the video to plot RSS with $\beta_0 = 25$.

    beta1 = seq(0, 1, len=nrow(galton_heights))
    results <- data.frame(beta1 = beta1,
                          rss = sapply(beta1, rss, beta0 = 25))
    results %>% ggplot(aes(beta1, rss)) +
        geom_line() +
        geom_line(aes(beta1, rss), col=2)

In a model for sons' heights vs fathers' heights, what is the least squares estimate (LSE) for $\beta_1$ if we assume $\beta_0$ is 36?

Hint: modify the code above to do your analysis.

0.65

0.5

0.2

correct

Answer

Correct. You can tell from a plot of RSS vs $\beta_1$ that the RSS is smallest at a slope estimate of 0.5.
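The grid search behind this question can be sketched as follows. This is only an illustration under stated assumptions: it uses simulated father and son heights instead of galton_heights, so the data frame sim_heights, its generating parameters, and the slope it finds are hypothetical and are not the quiz answer.

```r
# Simulated stand-in for galton_heights (assumption: not the course data)
set.seed(1)
n <- 200
father <- rnorm(n, mean = 69, sd = 2.5)
son <- 36 + 0.5 * father + rnorm(n, sd = 2)
sim_heights <- data.frame(father = father, son = son)

# Residual sum of squares for a given intercept and slope
rss <- function(beta0, beta1, data = sim_heights) {
  resid <- data$son - (beta0 + beta1 * data$father)
  sum(resid^2)
}

# Fix the intercept at 36 and scan a grid of candidate slopes between 0 and 1
beta1 <- seq(0, 1, len = 101)
results <- data.frame(beta1 = beta1,
                      rss = sapply(beta1, rss, beta0 = 36))

# The grid value of beta1 with the smallest RSS
best_beta1 <- results$beta1[which.min(results$rss)]
best_beta1
```

Plotting results$rss against results$beta1 gives the same kind of curve as in the video, and the lowest point of that curve is the value reported by which.min().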


Question 2

1/1 point (graded)

The least squares estimates for the parameters $\beta_0, \beta_1, \dots, \beta_n$ ______ the residual sum of squares.

Select an option

maximize

minimize

equal

correct
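One way to check how the least squares estimates relate to the residual sum of squares is to compare the RSS at the fitted coefficients with the RSS at slightly perturbed values. The sketch below does this on simulated data; the variables x and y and the perturbation size 0.1 are arbitrary choices made for illustration.

```r
# Simulated data (assumption: arbitrary true intercept 1 and slope 2)
set.seed(2)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)

fit <- lm(y ~ x)
b <- coef(fit)

# RSS as a function of the two coefficients
rss <- function(beta0, beta1) sum((y - beta0 - beta1 * x)^2)

# RSS at the least squares estimates
rss_lse <- rss(b[1], b[2])

# RSS after nudging each coefficient up or down by 0.1
rss_nearby <- c(rss(b[1] + 0.1, b[2]), rss(b[1] - 0.1, b[2]),
                rss(b[1], b[2] + 0.1), rss(b[1], b[2] - 0.1))

# Every perturbed RSS is larger than the RSS at the estimates
all(rss_nearby > rss_lse)
```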


Question 3

1 point possible (graded)

Load the Lahman library and filter the Teams data frame to the years 1961-2001. Run a linear model in R predicting the number of runs per game based on both the number of bases on balls per game and the number of home runs per game.

What is the coefficient for bases on balls?

0.39

1.56

1.74

0.027
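A possible setup for this model is sketched below. It assumes the Lahman and dplyr packages are installed; the per-game column names (R_per_game, BB_per_game, HR_per_game) and the object name teams_61_01 are made up for this example and do not come from the course.

```r
library(Lahman)
library(dplyr)

# Keep 1961-2001 and convert season totals to per-game rates
teams_61_01 <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(R_per_game = R / G,
         BB_per_game = BB / G,
         HR_per_game = HR / G)

# Runs per game modeled on both walks per game and home runs per game
fit <- lm(R_per_game ~ BB_per_game + HR_per_game, data = teams_61_01)

# The BB_per_game entry is the coefficient for bases on balls
coef(fit)
```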


Question 4

1 point possible (graded)

We run a Monte Carlo simulation where we repeatedly take samples of N = 100 from the Galton heights data and compute the regression slope coefficients for each sample:

    B <- 1000
    N <- 100
    lse <- replicate(B, {
        sample_n(galton_heights, N, replace = TRUE) %>%
            lm(son ~ father, data = .) %>%
            .$coef
    })
    lse <- data.frame(beta_0 = lse[1,], beta_1 = lse[2,])

What does the central limit theorem tell us about the variables beta_0 and beta_1?

Select ALL that apply.

They are approximately normally distributed.

The expected value of each is the true value of $\beta_0$ and $\beta_1$ (assuming the Galton heights data is a complete population).

The central limit theorem does not apply in this situation.

It allows us to test the hypothesis that $\beta_0 = 0$ and $\beta_1 = 0$.
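The behavior of the Monte Carlo estimates can be explored with a small base R sketch. It replaces galton_heights with a simulated "population" (the data frame pop and its generating parameters are assumptions), resamples it in the same way as the code in the question, and compares the average of the resampled coefficients with the fit on the full population.

```r
# Simulated "population" standing in for galton_heights (assumption)
set.seed(3)
pop <- data.frame(father = rnorm(500, mean = 69, sd = 2.5))
pop$son <- 35 + 0.5 * pop$father + rnorm(500, sd = 2)

B <- 1000
N <- 100
lse <- replicate(B, {
  idx <- sample(nrow(pop), N, replace = TRUE)   # sample N rows with replacement
  coef(lm(son ~ father, data = pop[idx, ]))     # intercept and slope for this sample
})
lse <- data.frame(beta_0 = lse[1, ], beta_1 = lse[2, ])

# Coefficients from the full "population" versus the Monte Carlo averages
pop_fit <- coef(lm(son ~ father, data = pop))
rbind(monte_carlo_mean = colMeans(lse), population = pop_fit)

# Spread of the estimates across samples
sapply(lse, sd)
```

A histogram or normal QQ-plot of lse$beta_0 and lse$beta_1 (for example with hist() or qqnorm()) is a quick way to judge how close the estimates are to a normal distribution.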


Question 5

1/1 point (graded)

Which R code(s) below would properly plot the predictions and confidence intervals for our linear model of sons' heights?

NOTE: The function as.tibble() has been replaced by as_tibble() in a recent dplyr update.

Select ALL that apply.

    galton_heights %>% ggplot(aes(father, son)) +
        geom_point() +
        geom_smooth()

    galton_heights %>% ggplot(aes(father, son)) +
        geom_point() +
        geom_smooth(method = "lm")

    model <- lm(son ~ father, data = galton_heights)
    predictions <- predict(model, interval = c("confidence"), level = 0.95)
    data <- as.tibble(predictions) %>% bind_cols(father = galton_heights$father)
    ggplot(data, aes(x = father, y = fit)) +
        geom_line(color = "blue", size = 1) +
        geom_ribbon(aes(ymin=lwr, ymax=upr), alpha=0.2) +
        geom_point(data = galton_heights, aes(x = father, y = son))

    model <- lm(son ~ father, data = galton_heights)
    predictions <- predict(model)
    data <- as.tibble(predictions) %>% bind_cols(father = galton_heights$father)
    ggplot(data, aes(x = father, y = fit)) +
        geom_line(color = "blue", size = 1) +
        geom_point(data = galton_heights, aes(x = father, y = son))

correct

Answer

Correct:

Correct. This is one way to plot predictions and confidence intervals for a linear model of sons' heights vs. fathers' heights. This is one of two correct answers.

Correct. This code uses the predict command to generate predictions and 95% confidence intervals for the linear model of sons' heights vs. fathers' heights. This is one of two correct answers.
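The difference between calling predict with and without the interval argument can be seen in a short base R sketch. It uses simulated heights rather than galton_heights, so the vectors father and son here are hypothetical.

```r
# Simulated heights (assumption: not the Galton data)
set.seed(4)
father <- rnorm(50, mean = 69, sd = 2.5)
son <- 35 + 0.5 * father + rnorm(50, sd = 2)
model <- lm(son ~ father)

# With interval = "confidence", predict() returns a matrix whose columns
# hold the fitted value and the lower and upper confidence bounds
with_ci <- predict(model, interval = "confidence", level = 0.95)
colnames(with_ci)

# Without the interval argument, predict() returns a plain numeric vector,
# so there are no lwr/upr columns to draw a ribbon from
without_ci <- predict(model)
is.matrix(without_ci)
```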


In Questions 7 and 8, you'll look again at female heights from GaltonFamilies.

Define female_heights, a set of mother and daughter heights sampled from GaltonFamilies, as follows:

    set.seed(1989) #if you are using R 3.5 or earlier
    set.seed(1989, sample.kind="Rounding") #if you are using R 3.6 or later
    library(HistData)
    data("GaltonFamilies")
    options(digits = 3)    # report 3 significant digits

    female_heights <- GaltonFamilies %>%
        filter(gender == "female") %>%
        group_by(family) %>%
        sample_n(1) %>%
        ungroup() %>%
        select(mother, childHeight) %>%
        rename(daughter = childHeight)

Question 7

0.0/2.0 points (graded)

Fit a linear regression model predicting the mothers' heights using daughters' heights.

What is the slope of the model?

What is the intercept of the model?


Question 8

0.0/2.0 points (graded)

Predict mothers' heights using the model.

What is the predicted height of the first mother in the dataset?


What is the actual height of the first mother in the dataset?
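Questions 7 and 8 could be approached along the following lines. This sketch assumes the HistData and dplyr packages are installed and R 3.6 or later (hence the sample.kind argument, which produces a warning about the rounding sampler); the object name fit is an arbitrary choice.

```r
library(HistData)
library(dplyr)
data("GaltonFamilies")

# Rebuild female_heights as defined in the setup (R 3.6 or later)
set.seed(1989, sample.kind = "Rounding")
female_heights <- GaltonFamilies %>%
  filter(gender == "female") %>%
  group_by(family) %>%
  sample_n(1) %>%
  ungroup() %>%
  select(mother, childHeight) %>%
  rename(daughter = childHeight)

# Mothers' heights as the outcome, daughters' heights as the predictor
fit <- lm(mother ~ daughter, data = female_heights)
coef(fit)                   # intercept and slope

# Predicted and actual height of the first mother in the dataset
predict(fit)[1]
female_heights$mother[1]
```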


We have shown how BB and singles have similar predictive power for scoring runs. Another way to compare the usefulness of these baseball metrics is by assessing how stable they are across the years. Because we have to pick players based on their previous performances, we will prefer metrics that are more stable. In these exercises, we will compare the stability of singles and BBs.

Before we get started, we want to generate two tables: one for 2002 and another for the average of the 1999-2001 seasons. We want to define per plate appearance statistics, keeping only players with 100 or more plate appearances. Here is how we create the 2002 table:

    library(Lahman)
    bat_02 <- Batting %>% filter(yearID == 2002) %>%
        mutate(pa = AB + BB, singles = (H - X2B - X3B - HR)/pa, bb = BB/pa) %>%
        filter(pa >= 100) %>%
        select(playerID, singles, bb)

Question 9

0.0/2.0 points (graded)

Now compute a similar table but with rates computed over 1999-2001. Keep only rows from 1999-2001 where players have 100 or more plate appearances, calculate each player's single rate and BB rate per season, then calculate the average single rate (mean_singles) and average BB rate (mean_bb) per player over those three seasons.
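The steps described in this paragraph might be written as below. This is a sketch that assumes the Lahman and dplyr packages are installed; the table name bat_99_01 is an assumption, while mean_singles and mean_bb are the names the question asks for.

```r
library(Lahman)
library(dplyr)

bat_99_01 <- Batting %>%
  filter(yearID %in% 1999:2001) %>%
  # per-season rates, defined the same way as in the 2002 table
  mutate(pa = AB + BB,
         singles = (H - X2B - X3B - HR) / pa,
         bb = BB / pa) %>%
  filter(pa >= 100) %>%
  # average each player's rates over the rows that remain
  group_by(playerID) %>%
  summarize(mean_singles = mean(singles), mean_bb = mean(bb))

# Number of players above 0.2 for each average rate
sum(bat_99_01$mean_singles > 0.2)
sum(bat_99_01$mean_bb > 0.2)
```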

How many players had a single rate mean_singles of greater than 0.2 per plate appearance over 1999-2001?

How many players had a BB rate mean_bb of greater than 0.2 per plate appearance over 1999-2001?


Question 10

0.0/2.0 points (graded)

Use inner_join() to combine the bat_02 table with the table of 1999-2001 rate averages you created in the previous question.

What is the correlation between 2002 singles rates and 1999-2001 average singles rates?


What is the correlation between 2002 BB rates and 1999-2001 average BB rates?
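The join-then-correlate pattern can be illustrated on two toy tables. The player IDs and rates below are made up for the example (they are not Lahman data), and the table names toy_02 and toy_avg are hypothetical; the point is that inner_join() keeps only players present in both tables, so each remaining row pairs a 2002 rate with a 1999-2001 average.

```r
library(dplyr)

# Hypothetical toy tables with made-up player IDs and rates
toy_02 <- data.frame(playerID = c("a01", "b02", "c03", "d04"),
                     singles = c(0.15, 0.18, 0.12, 0.20))
toy_avg <- data.frame(playerID = c("a01", "b02", "c03", "e05"),
                      mean_singles = c(0.16, 0.17, 0.13, 0.19))

# Only players found in both tables survive the join
joined <- inner_join(toy_02, toy_avg, by = "playerID")
joined

# Correlation between the paired columns
cor(joined$singles, joined$mean_singles)
```

With the real tables, the same two calls apply, once for the singles columns and once for the BB columns.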


Question 11

0.0/1.0 point (graded)

Make scatterplots of mean_singles versus singles and mean_bb versus bb.

Are either of these distributions bivariate normal?

Neither distribution is bivariate normal.

singles and mean_singles are bivariate normal, but bb and mean_bb are not.

bb and mean_bb are bivariate normal, but singles and mean_singles are not.

Both distributions are bivariate normal.


Question 12

0.0/2.0 points (graded)

Fit a linear model to predict 2002 singles given 1999-2001 mean_singles.

What is the coefficient of mean_singles, the slope of the fit?

Fit a linear model to predict 2002 bb given 1999-2001 mean_bb.

What is the coefficient of mean_bb, the slope of the fit?
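For a simple linear regression, the fitted slope equals the correlation multiplied by the ratio of standard deviations, $\hat{\beta}_1 = r \, s_y / s_x$, which ties these slopes to the correlations from Question 10. The sketch below checks that identity on simulated rates; the vectors mean_rate and rate_next and their generating parameters are hypothetical stand-ins for the real columns.

```r
# Simulated past-average and next-season rates (assumption: made-up parameters)
set.seed(5)
mean_rate <- rnorm(200, mean = 0.15, sd = 0.03)
rate_next <- 0.05 + 0.6 * mean_rate + rnorm(200, sd = 0.02)

# Slope from lm()
fit <- lm(rate_next ~ mean_rate)
slope_lm <- unname(coef(fit)[2])

# Slope from the correlation and the two standard deviations
slope_formula <- cor(mean_rate, rate_next) * sd(rate_next) / sd(mean_rate)

# The two agree up to floating point error
all.equal(slope_lm, slope_formula)
```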
