Question: (25 points) Problem 2
A method to investigate the sensitivity of the sample mean and sample median to extreme outliers and changes in the dataset is to replace one or more elements in a given dataset by a number y and investigate the effect when y changes. To illustrate this, consider the following dataset:
4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9
Part A: Compute the sample mean and sample median. Do not use the canned mean and median python functions. Write your own code to compute these quantities. You may use the python length and sort functions, but that is it.
In [2]:
# Your code here.
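One possible sketch for Part A, using only a manual loop plus the allowed length and sort functions. The helper names `my_mean` and `my_median` are illustrative and not part of the assignment.

```python
dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]


def my_mean(values):
    """Sample mean: accumulate the total by hand, then divide by the count."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def my_median(values):
    """Sample median: middle value of the sorted data, or the average
    of the two middle values when the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print("mean:  ", my_mean(dataset))
print("median:", my_median(dataset))
```

The sorted data are 1.9, 3.8, 4.1, 4.3, 5.0, 5.2, 5.5, so the median is the fourth value, 4.3, and the mean is 29.8 / 7, roughly 4.257.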
Part B: Now, recompute the mean and the median using the python numpy functions. Compare your answers to what you computed in Part A. Do your answers match? (Hint: They should!)
In [3]:
# Your code here.
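A minimal sketch for Part B, assuming numpy is imported as `np`. The variable names are illustrative.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]

np_mean = np.mean(dataset)
np_median = np.median(dataset)

print("numpy mean:  ", np_mean)
print("numpy median:", np_median)
```

Both values should agree with the hand-written versions from Part A up to floating-point rounding.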
Part C: Now consider the following data set.
4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9, y
Is there a value for y that would make the mean of the data equal to 7? If so, calculate the value of y that makes the mean equal to 7. If not, clearly explain why not.
Is there a value for y that would make the median of the data equal to 7? If so, calculate the values of y that make the median equal to 7. If not, clearly explain why not.
In [4]:
# Your code here.
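For the mean question in Part C, the eight values must sum to 7 × 8 = 56, so y is whatever is left after subtracting the seven known values. A short sketch of that arithmetic (variable names are illustrative):

```python
dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]
target_mean = 7.0

# The augmented dataset has one extra element, y.
n_with_y = len(dataset) + 1

# Sum of the known values, accumulated by hand.
total = 0.0
for v in dataset:
    total += v

# Solve (total + y) / n_with_y = target_mean for y.
y = target_mean * n_with_y - total
check = (total + y) / n_with_y

print("y =", y)
print("mean with y:", check)
```

This gives y = 56 − 29.8 = 26.2, up to floating-point rounding.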
In [5]:
# Your code here.
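For the median question in Part C, the median of eight values is the average of the 4th and 5th sorted values. Whatever y is, those two positions are filled by values between 4.1 and 5.0, so the median stays between 4.2 and 4.65 and can never reach 7. The sketch below checks this numerically over a grid of y values; the grid range and the helper name `median_with_y` are arbitrary choices for illustration.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]


def median_with_y(y):
    """Median of the original data with one extra value y appended."""
    return float(np.median(dataset + [y]))


# Scan y over a wide grid (an arbitrary range chosen for illustration).
candidates = np.linspace(-100, 100, 2001)
medians = [median_with_y(y) for y in candidates]

lowest = min(medians)
highest = max(medians)
print("smallest median seen:", lowest)
print("largest median seen: ", highest)
```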
Part D: Compute the sample variance and the sample standard deviation for the original dataset given in part A using the formulas given in class. You may not use the built-in python variance, standard deviation, or sum functions. Using the length and square root functions is fine.
In [6]:
# Your code here.
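A possible sketch for Part D. It assumes the "formulas given in class" are the usual sample formulas with an n − 1 denominator; that is an assumption, since the source does not state them. No built-in sum, variance, or standard deviation functions are used.

```python
import math

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]
n = len(dataset)

# Mean, accumulated by hand.
total = 0.0
for v in dataset:
    total += v
mean = total / n

# Sum of squared deviations from the mean.
squared_dev = 0.0
for v in dataset:
    squared_dev += (v - mean) ** 2

# Assumed sample formulas: divide by n - 1, not n.
sample_var = squared_dev / (n - 1)
sample_std = math.sqrt(sample_var)

print("sample variance:", sample_var)
print("sample std dev: ", sample_std)
```

Under that assumption the sample variance is about 1.4629 and the sample standard deviation about 1.2095.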
Part E: Execute the following code. Does it match what you computed in part D? Why or why not? If not, how can you correct the code below?
In [7]:
dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]
print("The sample variance is: ", np.var(dataset))
print("the std dev is: ", np.std(dataset))
The sample variance is:  1.2538775510204085
the std dev is:  1.1197667395580244
In [8]:
# Your code here.
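For Part E, `np.var` and `np.std` divide by n by default (`ddof=0`), which gives the population quantities rather than the sample ones. One way to get the n − 1 versions is to pass `ddof=1`, as in this sketch:

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]

# Default ddof=0 divides by n (population variance).
pop_var = np.var(dataset)

# ddof=1 divides by n - 1 (sample variance and sample std dev).
sample_var = np.var(dataset, ddof=1)
sample_std = np.std(dataset, ddof=1)

print("population variance:", pop_var)
print("The sample variance is: ", sample_var)
print("the std dev is: ", sample_std)
```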
Part F: Again consider the data set from Part C:
4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9, y
Compute the sample median for the following cases (you may use whatever built-in python functions you'd like):
- y = 5
- y = 50
- y = 4.36
- y → ∞
- y → -∞
In [9]:
# Your code here.
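A sketch for Part F. The limiting cases are approximated with very large finite stand-ins (±1e12), which is an illustrative choice; the helper name `median_with_y` is also hypothetical.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]


def median_with_y(y):
    """Median of the original data with one extra value y appended."""
    return float(np.median(dataset + [y]))


# 1e12 and -1e12 stand in for y -> infinity and y -> -infinity.
cases = [5, 50, 4.36, 1e12, -1e12]
results = {y: median_with_y(y) for y in cases}

for y, med in results.items():
    print("y =", y, "-> median =", med)
```

Any y at or above 5.0 gives a median of (4.3 + 5.0) / 2 = 4.65, any y at or below 4.1 gives (4.1 + 4.3) / 2 = 4.2, and y = 4.36 gives (4.3 + 4.36) / 2 = 4.33.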
Part G: Think about the previous parts, above, and describe in words or mathematical notation the answer to the following question:
- By varying y, what is the set of all the possible values that the sample mean could take on?
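One way to reason about Part G is to write the mean of the eight values as a function of y, using the fact that the seven known values sum to 29.8. The symbol m for a target mean is introduced here for illustration.

```latex
\bar{x}(y) = \frac{4.3 + 5.2 + 5.0 + 3.8 + 4.1 + 5.5 + 1.9 + y}{8}
           = \frac{29.8 + y}{8},
\qquad
\bar{x}(y) = m \iff y = 8m - 29.8 .
```

Because this is a linear function of y with nonzero slope, every real number m is reached by exactly one y, so the sample mean can take any value in (−∞, ∞).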
Part H: Describe in words or mathematical notation, what happens to the sample standard deviation when y is varied in the following ways:
- y → ∞
- y → x̄
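A numerical sketch for Part H. The source is garbled at this point, so the two cases are assumed to be y growing without bound and y approaching x̄, the mean of the original seven values. The sample standard deviation is assumed to use the n − 1 denominator, and the helper name `sample_std_with_y` is illustrative.

```python
import numpy as np

dataset = [4.3, 5.2, 5.0, 3.8, 4.1, 5.5, 1.9]


def sample_std_with_y(y):
    """Sample standard deviation (n - 1 denominator) of the data plus y."""
    return float(np.std(dataset + [y], ddof=1))


# Case 1: y grows large; the standard deviation grows with it.
for y in [10, 100, 1000, 1e6]:
    print("y =", y, "-> s =", sample_std_with_y(y))

# Case 2 (assumed): y equals the mean of the original seven values.
xbar = float(np.mean(dataset))
std_at_xbar = sample_std_with_y(xbar)
print("y = x-bar -> s =", std_at_xbar)
```

As y grows, the standard deviation grows without bound. When y equals x̄, the new point adds no squared deviation and leaves the mean unchanged, so the sum of squared deviations is divided by 7 instead of 6, and the result coincides with `np.std(dataset)` for the original data (about 1.1198).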

