Question Two - 15 Marks
Describe and illustrate a concept from the Bootstrap Section in the notes using one of the following distributions: the Beta, Beta prime, Log-Laplace, Weibull, or Kumaraswamy.
There is a 1 to 2 page limit.
Rubric
Criteria | Descriptor | Marks
Data/Model | Description and Reproducibility | /2
Format | Clarity | /3
Demonstration | Depth, Relevance, Terminology Used | /5
Concept | Difficulty, Correctness and Relevance | /5
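As one possible illustration for Question Two, the sketch below uses the nonparametric bootstrap to estimate the standard error of the sample median and a percentile confidence interval. The Weibull shape and scale, the sample size, the number of resamples B, and the choice of the median as the statistic are all assumptions made for the example; the notes' Bootstrap Section is not reproduced here, so this is not necessarily the concept they emphasise.

```r
# Nonparametric bootstrap of the sample median for Weibull data.
# Assumed (hypothetical) settings: shape = 2, scale = 1, n = 50, B = 2000.
set.seed(1)
n <- 50
y <- rweibull(n, shape = 2, scale = 1)

B <- 2000
# Resample the observed data with replacement and recompute the median each time.
boot_med <- replicate(B, median(sample(y, n, replace = TRUE)))

se_boot <- sd(boot_med)                        # bootstrap standard error
ci_pct  <- quantile(boot_med, c(0.025, 0.975)) # percentile interval

# True median of the assumed Weibull model, for comparison.
true_med <- qweibull(0.5, shape = 2, scale = 1)

cat("sample median:", median(y), "\n")
cat("bootstrap SE :", se_boot, "\n")
cat("95% pct. CI  :", ci_pct, "\n")
cat("true median  :", true_med, "\n")
```

Fixing the seed and stating the generating parameters addresses the reproducibility criterion in the rubric.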
Question Three - 8 Marks
In the context of logistic regression describe how to perform the parametric bootstrap.
There is a 1 to 2 page limit.
Your answer could include an example but it is not a requirement.
Rubric
Criteria | Descriptor | Marks
Format | Clarity, Organization | /2
Writing | Grammar & Punctuation, Clarity | /2
Content | Correctness & Relevant Terminology Used | /4
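A minimal sketch of the parametric bootstrap for Question Three, under assumed simulated data: fit the logistic regression, treat the fitted probabilities as the true model, simulate new Bernoulli responses from them with the covariates held fixed, refit, and use the spread of the refitted coefficients as a standard error. The data-generating coefficients, n, and B below are hypothetical.

```r
# Parametric bootstrap for logistic regression coefficients.
# Assumed (hypothetical) data: one covariate, true intercept -0.5, slope 1.2.
set.seed(2)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))

# Step 1: fit the model to the observed data.
fit   <- glm(y ~ x, family = binomial)
p_hat <- fitted(fit)   # estimated P(y = 1 | x)

# Steps 2-3: simulate responses from the fitted model and refit, B times.
B <- 500
boot_coef <- t(replicate(B, {
  y_star <- rbinom(n, 1, p_hat)             # new responses, same covariates
  coef(glm(y_star ~ x, family = binomial))
}))

# Step 4: summarise the bootstrap distribution.
se_boot <- apply(boot_coef, 2, sd)
se_wald <- sqrt(diag(vcov(fit)))            # asymptotic SEs for comparison
print(rbind(bootstrap = se_boot, wald = se_wald))
```

The same bootstrap draws could also be used for percentile intervals via `apply(boot_coef, 2, quantile, c(0.025, 0.975))`.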
Question Four - 6 Marks
In your own words, provide a context and then illustrate each of the three mechanisms that lead to missing data.
See the "Missingness Example" in section "3.1 - Missing Data".
One page limit.
Rubric
Criteria | Descriptor | Marks
Context | Creativity and Relevance | /2
Writing | Clarity | /2
Content | Relevant Terminology Used | /2
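The three mechanisms in Question Four (MCAR, MAR, MNAR) can be illustrated with a small simulation. The context here is hypothetical and is not the "Missingness Example" from the notes: a health survey where age is always recorded and blood pressure is sometimes missing. All variable names and coefficients are assumptions.

```r
# Simulating the three missingness mechanisms in a hypothetical survey.
set.seed(3)
n   <- 1000
age <- rnorm(n, 50, 10)                       # always observed
bp  <- 100 + 0.5 * age + rnorm(n, 0, 10)      # subject to missingness

# MCAR: missingness is unrelated to any variable.
r_mcar <- rbinom(n, 1, 0.3) == 1
# MAR: missingness depends only on the observed variable (age).
r_mar  <- rbinom(n, 1, plogis(-1 + (age - 50) / 5)) == 1
# MNAR: missingness depends on the unobserved value itself (bp).
r_mnar <- rbinom(n, 1, plogis(-1 + (bp - 125) / 5)) == 1

# Mean of the observed blood pressures under each mechanism.
means <- c(full = mean(bp),
           MCAR = mean(bp[!r_mcar]),
           MAR  = mean(bp[!r_mar]),
           MNAR = mean(bp[!r_mnar]))
print(round(means, 2))
```

Under MCAR the complete-case mean stays close to the full-data mean. Under MAR and MNAR it is biased downward in this setup, but only the MAR bias can be corrected using the observed ages.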
Question Five - 20 Marks
In this question you will derive and implement an EM algorithm to fit a multivariate-normal distribution to Ozone (z) and Wind (x) from the air quality dataset. Make sure you define any notation that you introduce.
data(airquality)
head(airquality)
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6
a) [1 Mark] What is the joint distribution of the missing data and the observed data?
b) [2 Marks] State the conditional distribution of the missing data given the observed data.
c) [2 Marks] State the complete data likelihood.
d) [4 Marks] E-step: Summarize the derivation of the expected complete data log-likelihood.
e) [4 Marks] M-step: Summarize the derivation of the updates for the parameters.
f) [6 Marks] Implement the above EM in R for the air quality dataset. Use starting values based on the complete cases. Give the MLE and plot the observed log-likelihood evaluated at each iteration.
g) [1 Mark] Plot the imputed dataset.
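A hedged sketch of parts f) and g), assuming a bivariate normal for (Wind, Ozone) in which only Ozone (z) has missing values and Wind (x) is fully observed. The E-step fills in E[z | x] and E[z^2 | x] for the missing rows using the conditional normal. The M-step recomputes the mean vector and covariance matrix from these expected sufficient statistics. The convergence tolerance, the iteration cap, and the use of conditional means as the imputed values are choices made for this example.

```r
# EM for a bivariate normal fitted to Wind (x, complete) and Ozone (z, partly missing).
data(airquality)
x    <- airquality$Wind
z    <- airquality$Ozone
miss <- is.na(z)
cc   <- !miss

# Starting values from the complete cases.
mu <- c(mean(x[cc]), mean(z[cc]))
S  <- cov(cbind(x[cc], z[cc]))

# Observed-data log-likelihood: marginal of x for all rows,
# plus conditional of z given x for rows where z is observed.
obs_loglik <- function(mu, S) {
  beta <- S[1, 2] / S[1, 1]
  cvar <- S[2, 2] - S[1, 2]^2 / S[1, 1]
  sum(dnorm(x, mu[1], sqrt(S[1, 1]), log = TRUE)) +
    sum(dnorm(z[cc], mu[2] + beta * (x[cc] - mu[1]), sqrt(cvar), log = TRUE))
}

ll <- obs_loglik(mu, S)
for (iter in 1:500) {
  # E-step: conditional mean and second moment of each missing z given x.
  beta <- S[1, 2] / S[1, 1]
  cvar <- S[2, 2] - S[1, 2]^2 / S[1, 1]
  Ez   <- z
  Ez[miss]  <- mu[2] + beta * (x[miss] - mu[1])
  Ez2       <- Ez^2
  Ez2[miss] <- Ez2[miss] + cvar

  # M-step: update mean and covariance from expected sufficient statistics.
  mu  <- c(mean(x), mean(Ez))
  sxz <- mean(x * Ez) - mu[1] * mu[2]
  S   <- matrix(c(mean(x^2) - mu[1]^2, sxz,
                  sxz, mean(Ez2) - mu[2]^2), 2, 2)

  ll <- c(ll, obs_loglik(mu, S))
  if (abs(diff(tail(ll, 2))) < 1e-10) break
}

print(mu); print(S)   # EM estimates of the mean vector and covariance matrix
plot(ll, type = "b", xlab = "Iteration", ylab = "Observed log-likelihood")
# Imputed dataset: missing Ozone replaced by its conditional mean (red points).
plot(x, Ez, col = ifelse(miss, "red", "black"), pch = 19,
     xlab = "Wind", ylab = "Ozone")
```

Because Wind is never missing, the fitted slope `S[1, 2] / S[1, 1]` should agree with the complete-case regression of Ozone on Wind, which gives a quick sanity check on the implementation.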
