Question:

Using R:

This lab concerns data about salaries in San Francisco. The dataset, and corresponding information, can be found at https://www.kaggle.com/kaggle/sf-salaries.

1. Import the data. Hints: If you are using read.csv(), you will want to use the option header = TRUE. Missing data in this dataset is labeled "Not Provided" or "Not provided", or is blank. You will want to replace these values with NAs. Use the argument na.strings in read.csv() or na in read_csv().

2. Our variable of interest for this lab is going to be Total Pay. Plot a histogram of Total Pay with an overlaying density. (Include y = ..density.. in your aesthetic, so that the histogram heights are densities rather than counts.) Comment briefly on the shape, center, and spread.

3. Suppose we're interested in making inference about the typical salary (Total Pay) of all San Francisco city employees, and this is our representative sample. Is the mean a good statistic to use here to describe the typical value of salary? Why or why not?

4. Recall that one of the conditions of a one-sample t-test for a mean is that the population is normally distributed. Based on your graph in (2), does this assumption seem reasonable? Why or why not?

5. Compute a 95% t-confidence interval for the mean Total Pay.
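
A minimal sketch of the mechanics behind questions 1, 2, and 5, not a worked solution to the lab. It runs on a tiny made-up inline sample so that it is self-contained; the column name TotalPay and the file name Salaries.csv are assumptions about the Kaggle download, and the numbers are invented for illustration. The plot uses after_stat(density), which recent ggplot2 versions offer as the replacement spelling for ..density.., and is skipped if ggplot2 is not installed.

```r
# Hypothetical inline sample standing in for the Kaggle file; values are made up.
csv_text <- "EmployeeName,JobTitle,TotalPay,Status
A,Clerk,52000.50,Not Provided
B,Engineer,131000.00,
C,Nurse,Not provided,FT
D,Officer,98000.25,FT
E,Gardener,61000.00,PT"

# Question 1: import, mapping all three missing-value labels to NA.
# For the real data, replace `text = csv_text` with the path to the
# downloaded file, e.g. "Salaries.csv" (assumed file name).
salaries <- read.csv(
  text = csv_text,
  header = TRUE,
  na.strings = c("Not Provided", "Not provided", ""),
  stringsAsFactors = FALSE
)

# Question 2: histogram on the density scale with a density curve on top.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(salaries, aes(x = TotalPay)) +
    geom_histogram(aes(y = after_stat(density)), bins = 30, na.rm = TRUE) +
    geom_density(na.rm = TRUE)
  print(p)
}

# Question 5: 95% t-interval for the mean, computed two ways.
pay <- salaries$TotalPay[!is.na(salaries$TotalPay)]  # drop missing values
n   <- length(pay)
se  <- sd(pay) / sqrt(n)                             # standard error of the mean

# By hand: mean +/- t quantile (n - 1 degrees of freedom) * standard error.
ci_manual <- mean(pay) + c(-1, 1) * qt(0.975, df = n - 1) * se

# Same interval from the built-in one-sample t-test.
ci_ttest <- t.test(pay, conf.level = 0.95)$conf.int

print(ci_manual)
print(ci_ttest)
```

The two intervals should agree, since t.test() applies the same formula internally. Whether a t-interval is appropriate for the real data is exactly what questions 3 and 4 ask you to judge from the histogram.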
