Question: Using R to fit generalised linear models to the "Clotting time" dataset
This question is about generalised linear models. I need to use R to analyse the dataset and answer the three questions stated below. The dataset is called "Clotting time" and I have pasted it in full below: "u" is the concentration of plasma, "lot" is a factor with 2 levels, and "time" is the clotting time, which is the response variable. I need help answering the three questions with a combination of written explanation and plots generated by R code.
     u lot time
1    5   1  118
2   10   1   58
3   15   1   42
4   20   1   35
5   30   1   27
6   40   1   25
7   60   1   21
8   80   1   19
9  100   1   18
10   5   2   69
11  10   2   35
12  15   2   26
13  20   2   21
14  30   2   18
15  40   2   16
16  60   2   13
17  80   2   12
18 100   2   12
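For anyone who wants to reproduce the analysis without the course file, the pasted values can be typed in directly. This is only a minimal sketch: the data frame name `clot` is my own choice, and the plot is just a first look at the raw data before any modelling.

```r
# Enter the 18 observations pasted above; `clot` is an arbitrary name.
clot <- data.frame(
  u    = rep(c(5, 10, 15, 20, 30, 40, 60, 80, 100), times = 2),
  lot  = factor(rep(1:2, each = 9)),              # clotting agent, 2 levels
  time = c(118, 58, 42, 35, 27, 25, 21, 19, 18,   # lot 1
            69, 35, 26, 21, 18, 16, 13, 12, 12)   # lot 2
)

str(clot)

# Raw clotting time against concentration, one plotting symbol per lot.
plot(clot$u, clot$time, pch = c(1, 16)[clot$lot],
     xlab = "Plasma concentration u (%)", ylab = "Clotting time (s)")
legend("topright", legend = paste("lot", levels(clot$lot)), pch = c(1, 16))
```

Repeating the plot with `log(clot$u)` on the horizontal axis is a quick way to judge whether concentration is better entered on the log scale.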
(The Question begins here)
A study was conducted on the clotting time (in seconds) of blood, for plasma diluted to 9
different percentage concentrations. Clotting was induced by two types of thromboplastin.
The data in the file ClottingTimes on Wattle in the Course resources section consists
of the following variables: in column 1 is the observation number; in column 2 is the
concentration of plasma, u, treated as a numerical variable; column 3 gives the clotting
agent (type 1 and type 2), in lot, which should be treated as a factor with 2 levels; and
the fourth column gives the clotting time, to be treated as the response variable.
The main goal of the study is to see how clotting time depends on plasma concentration
and lot. However, there is a complication. Your line manager, who is not a specialist
statistician, knows something about linear models but did not learn about Generalised Linear
Models. He/she wants you to use a linear model. You feel that you should keep an
open mind at this stage about which type of model to use and have convinced your line
manager that it makes good sense for you to investigate the data using three types of model:
(i) normal errors with identity link function;
(ii) gamma errors with identity link function;
(iii) gamma errors with log link.
You can fit models (ii) and (iii) with the commands
family=Gamma(link="identity") or family=Gamma(link="log").
(Note the capital G: in R, Gamma is the family object, whereas gamma is the gamma function.)
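The three error-link combinations can be fitted with `glm()` as in the sketch below. Two things here are my assumptions rather than part of the question: the data frame name `clot`, and the linear predictor `log(u) + lot`, which is only one candidate (the question leaves the choice of terms open). The Gamma fit with identity link can be sensitive to starting values, so check for convergence warnings if you change the formula.

```r
# Data re-entered so the block runs on its own; `clot` is an arbitrary name.
clot <- data.frame(
  u    = rep(c(5, 10, 15, 20, 30, 40, 60, 80, 100), times = 2),
  lot  = factor(rep(1:2, each = 9)),
  time = c(118, 58, 42, 35, 27, 25, 21, 19, 18,
            69, 35, 26, 21, 18, 16, 13, 12, 12)
)

# Assumed linear predictor, used only for illustration.
form <- time ~ log(u) + lot

fits <- list(
  normal_identity = glm(form, family = gaussian(link = "identity"), data = clot),
  gamma_identity  = glm(form, family = Gamma(link = "identity"),    data = clot),
  gamma_log       = glm(form, family = Gamma(link = "log"),         data = clot)
)

# Coefficients live on the link scale, so the log-link column is not
# directly comparable with the two identity-link columns.
print(sapply(fits, coef))

# All three models use the same untransformed response, so AIC is one
# rough way to put them side by side.
print(sapply(fits, AIC))
```

`gaussian(link = "identity")` is the same model as `lm(form, data = clot)`; fitting it through `glm()` just keeps the three objects in the same form.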
(a) Using error-link combination (iii), explore different models. Perform suitable checks
on the model that you think is the best choice. [14 marks]
(b) Perform similar analyses to what you did in (a), including suitable final model checks,
but now using the error-link combinations (i) and (ii). Comment on the extent to
which your findings are similar/different to your results in part (a).
[14 marks]
(c) Your line manager would like you to present a fair-minded comparison (in writing) of
the three distinct approaches based on the three error-link combinations. Do
any of the three approaches stand out as being best in this example? If so, how? If
not, what is the evidence? Whatever your answers, you should provide evidence from
your analyses.
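One possible way to approach the exploration and checking asked for in (a) is to compare nested models within a single error-link combination and then look at the standard diagnostic plots. The sketch below does this for gamma errors with log link. The candidate terms (`log(u)`, `lot`, and their interaction), the object names, and the use of `m1` for the diagnostics are all my assumptions for illustration, not a claim about which model is best.

```r
# Data re-entered so the block runs on its own; `clot` is an arbitrary name.
clot <- data.frame(
  u    = rep(c(5, 10, 15, 20, 30, 40, 60, 80, 100), times = 2),
  lot  = factor(rep(1:2, each = 9)),
  time = c(118, 58, 42, 35, 27, 25, 21, 19, 18,
            69, 35, 26, 21, 18, 16, 13, 12, 12)
)

fam <- Gamma(link = "log")   # error-link combination (iii)

# Nested candidate models (assumed set, for illustration only).
m0 <- glm(time ~ log(u),       family = fam, data = clot)  # concentration only
m1 <- glm(time ~ log(u) + lot, family = fam, data = clot)  # plus lot shift
m2 <- glm(time ~ log(u) * lot, family = fam, data = clot)  # lot-specific slopes

# F tests are used because the Gamma dispersion is estimated from the data.
tab <- anova(m0, m1, m2, test = "F")
print(tab)

# Standard diagnostics: residuals vs fitted, normal Q-Q, scale-location,
# and residuals vs leverage. Swap in whichever model the table supports.
op <- par(mfrow = c(2, 2))
plot(m1)
par(op)
```

For part (b), the same code can be rerun with `fam <- gaussian(link = "identity")` or `fam <- Gamma(link = "identity")`, which keeps the comparison in (c) like for like.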
