Question: This question is number 35E in Chapter 3 of Sharon L. Lohr's textbook Sampling: Design and Analysis.
The datafile baseball.csv contains statistics on 797 baseball players from the rosters of all major league teams in November, 2004. In this exercise, treat the file baseball.csv as the population and draw samples from it.
Here is some SAS code that will read in the data with the variable names.
filename baseball 'H:\baseball.csv';
data baseball;
  infile baseball delimiter=',';
  input team $ leagueID $ player $ salary POS $ G GS InnOuts PO A E DP PB GB
        AB R H SecB ThiB HR RBI SB CS BB SO IBB HBP SH SF GIDP;
  logsal = log(salary);
  pitcher = (POS='P');
run;
a. Take a stratified random sample of 150 players from the file, using proportional allocation with the different teams as strata. Describe how you selected the sample.
b. Calculate logsal = ln(salary). Find the mean of the variable logsal, using your stratified sample, and give a 95% CI.
c. Estimate the proportion of players in the data set who are pitchers, and give a 95% CI.
d. Examine the sample variances in each stratum. Do you think optimal allocation would be worthwhile for this problem?
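The mechanics behind parts (a) through (c) can be sketched in Python. This is only an illustration under stated assumptions, not the textbook's solution: the function names are hypothetical, the allocation uses largest-remainder rounding so the stratum sample sizes add up to exactly n, and the interval uses the normal critical value 1.96 rather than a t quantile. The demo at the bottom runs on a small made-up population, not on baseball.csv.

```python
import math
import random
import statistics


def proportional_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to N_h.

    Largest-remainder rounding keeps the n_h summing to exactly n.
    """
    total = sum(stratum_sizes.values())
    quotas = {h: n * size / total for h, size in stratum_sizes.items()}
    alloc = {h: math.floor(q) for h, q in quotas.items()}
    leftover = n - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda h: quotas[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


def draw_stratified_sample(population, alloc, seed=0):
    """Draw an SRS without replacement of size alloc[h] within each stratum.

    population maps a stratum label to the list of units in that stratum.
    """
    rng = random.Random(seed)
    return {h: rng.sample(units, alloc[h]) for h, units in population.items()}


def stratified_mean_ci(samples, stratum_sizes, z=1.96):
    """Stratified estimate of a population mean with an approximate CI.

    samples maps a stratum label to the sampled values (each n_h >= 2).
    Passing 0/1 indicators gives an estimated proportion instead.
    """
    total = sum(stratum_sizes.values())
    mean = 0.0
    var = 0.0
    for h, values in samples.items():
        n_h = len(values)
        pop_h = stratum_sizes[h]
        weight = pop_h / total
        fpc = 1 - n_h / pop_h  # finite population correction
        mean += weight * statistics.fmean(values)
        var += weight**2 * fpc * statistics.variance(values) / n_h
    se = math.sqrt(var)
    return mean, (mean - z * se, mean + z * se)


if __name__ == "__main__":
    # Made-up population of three "teams" standing in for the real file.
    gen = random.Random(1)
    sizes = {"AAA": 28, "BBB": 26, "CCC": 25}
    population = {
        team: [gen.gauss(13.5, 1.0) for _ in range(size)]
        for team, size in sizes.items()
    }
    alloc = proportional_allocation(sizes, 15)
    samples = draw_stratified_sample(population, alloc, seed=2)
    print(alloc)
    print(stratified_mean_ci(samples, sizes))
```

For part (c), the same estimator applies to the 0/1 variable pitcher, since a proportion is the mean of an indicator. For part (d), the within-stratum sample variances that enter the variance formula (statistics.variance on each team's sample) are the quantities to compare across teams.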
