Question: Your data set is a random list of Data Science salaries from 2020 to 2022. For each record you are given an unknown label X, the job title, and the salary (in ten thousands), which will be your Y.

Summary Statistics

  • Mean of X and Y (5 points each)
  • Median of X and Y (5 points each)
  • Q1 of X and Y (5 points each)
  • Q3 of X and Y (5 points each)
  • IQR of X and Y (5 points each)
  • Outliers of X and Y (5 points each)
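As a rough sketch of how the summary statistics listed above could be checked, the Python snippet below uses only the standard library. It assumes the "exclusive" quartile convention (the default of `statistics.quantiles`) and the common 1.5 × IQR rule for outliers; neither convention is stated in the question, and other quartile methods can give slightly different Q1 and Q3. The helper name `summarize` is hypothetical.

```python
import statistics

# Data transcribed from the table in the question.
x = [538, 269, 543, 243, 336, 61, 370, 91, 248, 39, 76, 339, 111]
y = [141, 65, 99, 165, 167, 131, 123, 77, 96, 138, 100, 109, 113]


def summarize(values):
    """Return mean, median, Q1, Q3, IQR, and 1.5*IQR outliers for a list."""
    # Assumption: "exclusive" quartiles; the question does not name a method.
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    # Assumption: a value is an outlier if it lies beyond 1.5 * IQR
    # from the nearest quartile.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }


summary_x = summarize(x)
summary_y = summarize(y)
for name, summary in (("X", summary_x), ("Y", summary_y)):
    print(name, summary)
```

If your course defines quartiles differently (for example, including the median in both halves), change the `method` argument or compute the halves by hand and compare.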

Simple Linear Regression

  • Find the equation for Linear Regression on this dataset - (5 points)
  • Does X make sense? Why? Why not? - (10 points)
  • Please give a possible label for X - (5 points)
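One possible way to approach the regression items above is an ordinary least-squares fit of Y on X. The sketch below computes the slope, intercept, and Pearson correlation from the usual sums of squares; the function name `fit_line` is hypothetical, and the data are transcribed from the table in the question.

```python
import math

# Data transcribed from the table in the question.
x = [538, 269, 543, 243, 336, 61, 370, 91, 248, 39, 76, 339, 111]
y = [141, 65, 99, 165, 167, 131, 123, 77, 96, 138, 100, 109, 113]


def fit_line(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs, plus Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = fit_line(x, y)
print(f"Y = {intercept:.2f} + {slope:.4f} * X")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

The correlation coefficient is a useful input to the "Does X make sense?" item: an r close to zero would suggest that X carries little linear information about salary, whatever X turns out to be.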

Z-Tables

In your work for Linear Regression, you have calculated the mean and the standard deviation of the salaries for Data Scientists. Using that information, please calculate the following:

  • The percentage of data scientists whose income is below 99.5 - 5 points for showing the Z score, and 5 points for showing the percentage
  • The percentage of data scientists whose income is above 137.5 - 5 points for showing the Z score, and 5 points for showing the percentage

X    Job Title                     Salary (in ten thousands)
538  Data Scientist                141
269  Data Engineer                 65
543  Data Engineer                 99
243  Data Scientist                165
336  Data Analyst                  167
61   Data Engineer                 131
370  Data Scientist                123
91   Data Science Consultant       77
248  Data Engineer                 96
39   Machine Learning Engineer     138
76   BI Data Analyst               100
339  Data Analyst                  109
111  Director of Data Engineering  113
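
For the Z-Tables part, the two percentages can be sketched as follows. This snippet makes three assumptions that the question does not spell out: that "Data Scientists" means all 13 salaries in the table (not only rows titled "Data Scientist"), that the sample standard deviation (n - 1 denominator) is intended, and that salaries are approximately normally distributed. It uses `statistics.NormalDist` in place of a printed Z-table.

```python
from statistics import NormalDist, mean, stdev

# Assumption: all 13 salaries (Y) from the table are used.
salaries = [141, 65, 99, 165, 167, 131, 123, 77, 96, 138, 100, 109, 113]

mu = mean(salaries)
sigma = stdev(salaries)  # Assumption: sample standard deviation (n - 1)


def z_score(value):
    """Standardize a salary against the sample mean and deviation."""
    return (value - mu) / sigma


z_below = z_score(99.5)
z_above = z_score(137.5)

# The standard normal CDF plays the role of the Z-table lookup.
standard_normal = NormalDist()
pct_below = standard_normal.cdf(z_below) * 100
pct_above = (1 - standard_normal.cdf(z_above)) * 100

print(f"z(99.5)  = {z_below:.3f}, percent below = {pct_below:.2f}%")
print(f"z(137.5) = {z_above:.3f}, percent above = {pct_above:.2f}%")
```

A printed Z-table rounds z to two decimals, so a hand lookup may differ from this output in the last digit or two. If your course uses the population standard deviation instead, swap `stdev` for `statistics.pstdev`.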
