Question: (25 points) Problem 2 A method to investigate the sensitivity of the sample mean and sample median to extreme outliers and changes in the dataset

(25 points) Problem 2

A method to investigate the sensitivity of the sample mean and sample median to extreme outliers and changes in the dataset is to replace one or more elements in a given dataset by a number y and investigate the effect when y changes. To illustrate this, consider the following dataset:

4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9

Part A: Compute the sample mean and sample median. Do not use the canned mean and median python functions. Write your own code to compute these quantities. You may use the python length and sort functions, but that is it.

In [2]:

# Your code here.
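One possible sketch for Part A, using only `len` and `sorted` as the prompt allows. The function names `my_mean` and `my_median` are hypothetical, chosen only for illustration; the running total is accumulated in a loop on the assumption that the built-in `sum` is also off limits here.

```python
dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]


def my_mean(values):
    """Arithmetic mean, accumulating the total by hand (no sum())."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def my_median(values):
    """Median: middle element of the sorted data, or the average
    of the two middle elements when the length is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


mean_a = my_mean(dataset)
median_a = my_median(dataset)
print("mean:", mean_a)
print("median:", median_a)
```

For this dataset the total is 29.8, so the mean is 29.8 / 7 (about 4.257), and the middle value of the sorted data is 4.3.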

Part B: Now, recompute the mean and the median using the Python NumPy functions. Compare your answers to what you computed in Part A. Do your answers match? (Hint: They should!)

In [3]:

# Your code here.
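A short sketch for Part B, assuming NumPy is installed and imported as `np`. `np.mean` and `np.median` are the standard NumPy calls; the variable names are illustrative only.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]

np_mean = np.mean(dataset)
np_median = np.median(dataset)

print("numpy mean:", np_mean)
print("numpy median:", np_median)
```

Both values should agree with the hand-written versions from Part A, up to floating-point rounding in the mean.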


Part C: Now consider the following data set.

4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9, y

Is there a value for y that would make the mean of the data equal to 7? If so, calculate the value of y that makes the mean equal to 7. If not, clearly explain why not.

Is there a value for y that would make the median of the data equal to 7? If so, calculate the values of y that make the median equal to 7. If not, clearly explain why not.


In [4]:

# Your code here.


In [5]:

# Your code here.
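A hedged sketch of how Part C could be checked numerically. For the mean, solving (sum + y) / 8 = 7 gives y = 56 minus the sum of the seven known values. For the median, with eight values the median is the average of the 4th and 5th sorted values, and at least three of the four middle candidates come from the original data, so y can only move it within a narrow band. The sweep range of -100 to 100 is an arbitrary choice for illustration.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]
target = 7.0

# Mean: (sum(dataset) + y) / (n + 1) = target  =>  y = target*(n + 1) - sum(dataset)
n_with_y = len(dataset) + 1
y_for_mean = target * n_with_y - sum(dataset)
mean_check = np.mean(dataset + [y_for_mean])
print("y for mean of 7:", y_for_mean)
print("resulting mean:", mean_check)

# Median: sweep y over a wide range and record the smallest and largest median seen.
candidates = np.linspace(-100.0, 100.0, 2001)
medians = [float(np.median(dataset + [float(c)])) for c in candidates]
median_low, median_high = min(medians), max(medians)
print("median stays between", median_low, "and", median_high)
```

The mean reaches 7 at y of about 26.2. The median stays between about 4.2 (y at or below 4.1) and about 4.65 (y at or above 5.0), so no value of y makes the median equal to 7.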

Part D: Compute the sample variance and the sample standard deviation for the original dataset given in part A using the formulas given in class. You may not use the built-in python variance, standard deviation, or sum functions. Using the length and square root functions is fine.

In [6]:

# Your code here.
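A minimal sketch for Part D. The "formulas given in class" are not shown in the question, so this assumes the usual sample variance with an n - 1 denominator. The helper name `sample_variance` is hypothetical, and the totals are accumulated in loops because `sum` is not allowed.

```python
import math

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]


def sample_variance(values):
    """Sample variance with an (n - 1) denominator, without sum()."""
    n = len(values)
    total = 0.0
    for v in values:
        total += v
    mean = total / n

    squared_dev = 0.0
    for v in values:
        squared_dev += (v - mean) ** 2
    return squared_dev / (n - 1)


variance_d = sample_variance(dataset)
std_d = math.sqrt(variance_d)
print("sample variance:", variance_d)
print("sample std dev:", std_d)
```

Under that assumption the sample variance comes out near 1.4629 and the sample standard deviation near 1.2095.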

Part E: Execute the following code. Does it match what you computed in part D? Why or why not? If not, how can you correct the code below?

In [7]:

import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]
print("The sample variance is: ", np.var(dataset))
print("the std dev is: ", np.std(dataset))
The sample variance is: 1.2538775510204085
the std dev is: 1.1197667395580244


In [8]:

# Your code here.
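One way to approach Part E, sketched under the assumption that Part D used the n - 1 denominator. By default `np.var` and `np.std` divide by n (`ddof=0`), which gives the population quantities; passing `ddof=1` switches to the sample versions.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]

# Default ddof=0: divides by n (population variance).
population_var = np.var(dataset)

# ddof=1: divides by n - 1 (sample variance and sample standard deviation).
sample_var = np.var(dataset, ddof=1)
sample_std = np.std(dataset, ddof=1)

print("population variance (ddof=0):", population_var)
print("sample variance (ddof=1):", sample_var)
print("sample std dev (ddof=1):", sample_std)
```

The two variances differ by the factor n / (n - 1) = 7 / 6, which accounts for the mismatch with a hand-computed sample variance.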

Part F: Again consider the data set from Part C:

4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9, y

Compute the sample median for the following cases (you may use whatever built-in python functions you'd like):

  • y = 5
  • y = 50
  • y = 4.36
  • y → ∞
  • y → −∞

In [9]:

# Your code here.
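A sketch for Part F using `np.median`. The two limiting cases are approximated here by `float("inf")` and `float("-inf")`, which is an illustrative shortcut rather than a formal limit: the infinite value sorts to one end of the data and never enters the middle pair.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]

# Each case appends one value of y to the original seven data points.
cases = {
    "y = 5": 5.0,
    "y = 50": 50.0,
    "y = 4.36": 4.36,
    "y -> inf": float("inf"),
    "y -> -inf": float("-inf"),
}

medians_f = {label: float(np.median(dataset + [y])) for label, y in cases.items()}

for label, med in medians_f.items():
    print(label, "median:", med)
```

The medians come out near 4.65, 4.65, 4.33, 4.65, and 4.2 respectively: once y moves above 5.0 or below 4.1, pushing it further out no longer changes the median.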

Part G: Think about the previous parts, above, and describe in words or mathematical notation the answer to the following question:

  • By varying y, what is the set of all the possible values that the sample mean could take on?
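For Part G, the relationship between y and the mean can be written out directly. The notation $\bar{x}(y)$ for the mean of the eight values and $m$ for a target mean is introduced here for illustration; the constant 29.8 is the sum of the seven known values.

```latex
\bar{x}(y) = \frac{4.3 + 5.2 + 5.0 + 3.8 + 4.1 + 5.5 + 1.9 + y}{8}
           = \frac{29.8 + y}{8},
\qquad
\bar{x}(y) = m \iff y = 8m - 29.8 .
```

Because every real target $m$ has a corresponding y, the sample mean can take any value in $(-\infty, \infty)$, in contrast to the median, which is confined to a bounded interval.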


Part H: Describe in words or mathematical notation, what happens to the sample standard deviation when y is varied in the following ways:

  • y → ∞
  • y → x̄
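A numerical sketch for Part H. It assumes the second case means y approaching x̄, the mean of the original seven values, and that the sample standard deviation uses the n - 1 denominator (`ddof=1`). Adding a point y to seven points with mean x̄ increases the sum of squared deviations by (7/8)(y - x̄)², so the spread is smallest at y = x̄ and grows without bound as y grows.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]
x_bar = float(np.mean(dataset))  # mean of the original seven values


def std_with(y):
    """Sample standard deviation (ddof=1) of the data with y appended."""
    return float(np.std(dataset + [y], ddof=1))


# Case 1: y grows large, and the standard deviation grows with it.
growing = [std_with(y) for y in (10.0, 1e3, 1e6)]
print("std for y = 10, 1e3, 1e6:", growing)

# Case 2: y approaches x_bar, and the standard deviation approaches its minimum.
std_at_mean = std_with(x_bar)
std_near_mean = std_with(x_bar + 0.01)
print("std at y = x_bar:", std_at_mean)
print("std at y = x_bar + 0.01:", std_near_mean)
```

At y = x̄ the result is about 1.1198, which equals the population standard deviation of the original seven values, since the sum of squared deviations is unchanged and the denominator becomes 7.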


