Introduction To Mixed Modelling 2nd Edition N. W. Galwey - Solutions
(a) Use analysis of variance methods to obtain the estimated mean number of days to the flowering of vernalized plants of each F3 family.
(b) Create a new variable, with one value for each control plant, giving the corresponding unshrunk family mean for vernalized plants.
(c) Fit a mixed model in which the response variable is the number of days to the flowering of the control plants, and the model terms are
– the corresponding family-mean value for vernalized plants
– family
– plant group within family and
– plant within group.
Interpret the results.
(d) Obtain estimates of the constant and of the effect of days to flowering in vernalized plants.
(e) Display the results of your analysis graphically, showing the linear trend relating the number of days to flowering in the control plants to that in the vernalized plants and the mean values of these variables for each family.
(f) Obtain estimates of the amount of variation in the number of days to the flowering of the control plants that is accounted for by each term in the mixed model.
(g) How can the analysis be interpreted to give an estimate of the effect of omitting the low-temperature stimulus on the subsequent development of plants in each family?
5.1 Return to the data set concerning the yield of F3 wheat families in the presence of ryegrass, introduced in Exercise 3.3.
(a) Using mixed modelling, obtain an estimate of the mean yield of each of the F3 families, specifying ‘family’ as a fixed-effect term.
(b) Using mixed modelling and specifying ‘family’ as a random-effect term, obtain the following:
(i) an estimate of the overall mean of the population of F3 families
(ii) the BLUP for the effect of each family.
(c) From the output of the analysis performed in Section (b), obtain the following:
(i) an estimate of the variance component for ‘family’
(ii) an estimate of the residual variance component.
Obtain also the number of observations of each family.
(d) From the information obtained in Parts (a)–(c), compare the relationship between the BLUPs and the estimates of family means obtained specifying ‘family’ as a fixed-effect term with that given in Equation 5.6.
(e) Obtain the shrunk mean for each family, and plot the shrunk means against the means obtained when ‘family’ is specified as a fixed-effect term (the unadjusted means). The point representing one of the families deviates from the general relationship between these two types of mean.
(f) What is the distinguishing feature of this family?
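Parts (d) to (f) depend on how a BLUP shrinks a family's deviation from the overall mean. Equation 5.6 is not reproduced on this page, so the sketch below assumes it is the usual one-way shrinkage relation, BLUP = (family mean − overall mean) × σ²(family) / (σ²(family) + σ²(residual) / n). The function names and the numbers are hypothetical and are not taken from the wheat data.

```python
def shrinkage_factor(var_family, var_residual, n_obs):
    """Fraction of a family's deviation from the overall mean that is retained.

    Assumes the standard one-way relation; check it against Equation 5.6.
    """
    return var_family / (var_family + var_residual / n_obs)


def blup(family_mean, overall_mean, var_family, var_residual, n_obs):
    """Shrunk family effect (deviation from the overall mean)."""
    factor = shrinkage_factor(var_family, var_residual, n_obs)
    return (family_mean - overall_mean) * factor


def shrunk_mean(family_mean, overall_mean, var_family, var_residual, n_obs):
    """Overall mean plus the shrunk family effect."""
    return overall_mean + blup(
        family_mean, overall_mean, var_family, var_residual, n_obs
    )


# Hypothetical values: unadjusted mean 12, overall mean 10,
# family variance component 4, residual variance component 8, n = 2.
example = shrunk_mean(12.0, 10.0, 4.0, 8.0, 2)
print(example)
```

Under this relation the shrinkage factor depends on the number of observations, so a family observed fewer times than the others is pulled more strongly towards the overall mean.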
5.2 In many types of plant, exposure to low temperature at an early stage of development causes flowering to occur more rapidly: this phenomenon is called vernalization. An inbred line of chickpea with a strong vernalization response and a line with little or no vernalization response were crossed,
(a) Divide the data between two spreadsheets, one holding only the results from the vernalized plants, the other, only those from the control plants.
(b) Analyse the results from the vernalized plants by mixed modelling, specifying ‘family’, ‘group’ and ‘plant’ as random-effect terms.
(c) Obtain an estimate of the component of variance for each of the following terms:
(i) family
(ii) group within family
(iii) plant within group.
Which term in your mixed model represents residual variation?
(d) Estimate the heritability of time to flowering in vernalized plants from this population of families. (N.B. The estimate obtained using the methods described in Chapter 3 is slightly biased downwards, as some of the residual variance is due to genetic differences among plants of the same
(e) Obtain the unadjusted mean and the shrunk mean, and the BLUP, for the number of days from sowing to flowering in each family.
(f) Extend Equation 5.6 to the present situation, in which two components of variance contribute to the shrinkage of the BLUPs. Use the values obtained above to check your equation.
(g) Repeat the steps indicated in Sections (b), (c) and (d) of this exercise for the control plants. Comment on the difference between the estimates of variance components and heritability obtained from the vernalized plants and the control plants.
Table 5.2 Time from sowing to flowering of F3
(h) Plot the shrunk mean for each family obtained from the control plants against the corresponding value obtained from the vernalized plants. Comment on the relationship between the two sets of means.
5.3 Return to the house price data analysed in Chapter 1.
(a) Fit the mixed model introduced in Chapter 1 to these data. Obtain the intercept and slope of the line of best fit relating log(house price) to latitude. Obtain the BLUP for the effect of each town. Do you consider this data set to be adequate to permit interpretation of the BLUPs?
(b) Obtain the fitted value of log(house price) for each town, that is, the value on the line of best fit at the latitude of each town.
(c) Hence obtain the shrunk mean value of log(house price) for each town.
(d) Obtain the BLUE for each town.
(e) Produce a figure similar to Figure 1.4, but with the line of best fit from the mixed model instead of the simple regression line. Add the shrunk means to this plot. Comment on their distribution relative to the simple means.
(f) Plot the shrunk means against the simple means. Identify any crossovers among the towns for these variables.
(g) Plot the shrunk means against the simple means. Again, identify any crossovers.
11.1 Three observations were made of a random variable Y, namely y1 = 51, y2 = 35, y3 = 31.
(a) Assume that Y ∼ N(40, σ²), that is, the mean is known but the variance must be estimated. Obtain the maximum-likelihood estimate of σ² from this sample. Is this estimate unbiased?
(b) Now assume that Y ∼ N(μ, σ²), that is, both parameters must be estimated. Obtain the maximum-likelihood estimates of μ and σ². Are these estimates unbiased? Obtain the residual-maximum-likelihood (REML) estimate of σ². Is this estimate unbiased?
(c) Make a sketch of the data space defined by this sample, corresponding to that presented in Figure 11.3 for a sample of two observations. Show the Y1, Y2 and Y3 axes, the μ-effects axis and the observed values.
(d) Describe briefly how this geometrical representation can be used to obtain estimates of μ and σ², designated μ̂ and σ̂². What is the distance from the point y = (y1, y2, y3)′ to the point μ̂ = (μ̂, μ̂, μ̂)′? When this graphical approach is used to represent a
(e) What is the shape of the corresponding contours for this sample of three observations?
(f) Sketch the contour at a distance √3σ̂ from the point (μ̂, μ̂, μ̂)′: (i) when the postulated values μ̂ and σ̂² are the maximum-likelihood estimates and (ii) when the postulated value μ̂ is above the maximum-likelihood estimate and the postulated value σ̂² is
(g) In what sub-space of the data space must the point y then lie?
(h) Within this random-effects sub-space, what will be the shape of the contours of the probability distribution? Mark the sub-space and a representative contour on your sketch.
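The arithmetic behind Parts (a) and (b) of Exercise 11.1 can be sketched as follows, using the three observations given in the exercise. The variable names are illustrative. The only formulas assumed are the standard ones: the maximum-likelihood variance divides the sum of squared deviations by n, and the REML variance divides it by n minus the number of estimated fixed effects (here 1, the mean).

```python
# Observations from Exercise 11.1.
y = [51.0, 35.0, 31.0]
n = len(y)

# Part (a): the mean is known to be 40, so only the variance is estimated.
known_mean = 40.0
ml_var_known_mean = sum((v - known_mean) ** 2 for v in y) / n

# Part (b): the mean is estimated from the sample.
mean_hat = sum(y) / n
sum_sq = sum((v - mean_hat) ** 2 for v in y)

ml_var = sum_sq / n          # maximum likelihood: divide by n
reml_var = sum_sq / (n - 1)  # REML: one degree of freedom used by the mean

print(ml_var_known_mean, mean_hat, ml_var, reml_var)
```

Dividing by n is unbiased only when the deviations are taken from the true mean. Once the mean is estimated from the same data, the divisor n understates the variance on average, and the divisor n − 1 used by REML corrects this.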
11.2 Seven observations on two explanatory variates, X1 and X2, and a response variate, Y, are presented in Table 11.6.
Table 11.6 Observations of two explanatory variates and a response variate.
X1 X2 Y 42 7.3
(a) Fit the model Y = β0 + β1X1 + β2X2 + E to these data, and obtain estimates of β0, β1 and β2.
(b) When this model is fitted to these data, how many dimensions does each of the following have:
(i) the data space?
(ii) the fixed-effects sub-space?
(iii) the random-effects sub-space?
(c) Obtain the estimated value of Y, and the estimate of the residual effect, for each observation.
For Observation 5, the estimated value of Y = 128.2.
(d) What is the contribution to this value of
(i) the constant effect?
(ii) the effect of X1?
(iii) the effect of X2?
It is assumed that E ∼ N(0, σ²).
(e) Obtain the maximum-likelihood estimate of σ² and the REML estimate of σ².
(f) What is the minimum number of observations required to obtain estimates of β0, β1, β2 and σ²? If the number of observations available is one less than this minimum, what estimates can be obtained? What is then the relationship between the estimated and observed values of Y?
11.3 Consider the final model fitted to the osteoporosis data in Section 7.2.
(a) When this model is fitted to these data, how many dimensions does each of the following have:
(i) the data space?
(ii) the fixed-effects sub-space?
(iii) the random-effects sub-space?
(b) What is the relationship between the number of dimensions of the random-effects sub-space and the degrees of freedom of the deviance from this model?
10.2 Return to the data set concerning the relationship between the distance from bushland and the level of predation on seeds, introduced in Exercise 7.4 in Chapter 7. In the earlier analysis of these data, it was assumed that the percentage of predation could be regarded as a normally distributed
(a) Using the information on the number of seeds of each species per cage, convert each value of the percentage of predation (‘%pred’) to the actual number of seeds removed by predation.
(b) Re-fit your mixed model to the data, using the number of seeds removed by predation as the response variable, and specifying an appropriate error distribution for this response variable, and an appropriate link function to relate the response variable to the linear model.
(c) Obtain diagnostic plots of the residuals. Do these plots indicate that the assumptions underlying the analysis are more nearly fulfilled as a result of the changes to the response variable, the error distribution and the link function?
(d) Make a graphical display, showing the fitted relationship between the proportion of seeds removed by predation and the distance from the bushland, taking into account any other model terms (i.e. residue, cage type, species and/or interaction terms) that your analysis indicates are important.
10.3 Return to the data on the efficacy of lithium as a treatment for amyotrophic lateral sclerosis (Section 7.5). Define
σ₁² = variance component for term ‘id’
σ₂² = variance component for term ‘id.visit_day’ and
ρ₁₂ = correlation coefficient between effects of ‘id’ and
(a) Consider whether any of the variables studied should be modelled as fixed-effect terms when estimating the genetic and residual components of variance of the weekly growth rate.
(b) Fit a mixed model to the data, taking account of the pedigree structure. Obtain estimates of the genetic and residual components and estimate the heritability of the weekly growth rate.
(c) Compare graphically the phenotypic value of the weekly growth rate for each individual and the estimated genetic effect on this variable.
10.5 Return to the data on yields of wheat genotypes, investigated in an alphalpha design, presented in Table 9.5.
(a) Explore the possibility of analysing these data using an auto-regressive model, instead of the model based on the alphalpha design fitted in Sections 9.6–9.8.
(b) How well does your auto-regressive model fit the data, relative to the model fitted earlier?
(c) Compare graphically the estimates of the genotype mean yields obtained by the two methods. How much effect will the choice of method have on the decisions made by a breeder seeking genetic improvement?
10.6 Return to the data from a field trial of wheat breeding lines conducted in South Australia, analysed in Sections 10.11–10.13. In those sections, it was noted that the randomized complete block model and the AR1 × AR1 model fitted to these data cannot be compared formally by means of a
(a) Specify an auto-regressive model of which the randomized complete block model is a reduced form.
(b) Fit your new model to the data and interpret the results. Conduct a formal significance test to compare it with the randomized complete block model and explain the result of the test.
(c) Make a graphical comparison of the estimated mean yields of the breeding lines obtained from the two models. How much effect will the choice of models have on the decisions made by a plant breeder?
(d) Can your auto-regressive model be compared by a formal significance test with the auto-regressive model used in Sections 10.11 and 10.13? Compare the fit of the two models as well as possible and consider which should be preferred. Make a graphical comparison of the estimated mean yields of the
10.7 Return to the data from a meta-analysis of clinical trials to study the effect of aspirin when given to heart-attack patients, analysed in Exercise 8.2.
(a) Rearrange the data in a ‘stacked’ form, with a separate row for the aspirin-treated and placebo-treated patients in each trial, and a column headed ‘treatment’ that distinguishes the two types of row.
(b) Perform a fixed-effect meta-analysis on the ‘stacked’ data set, using a method that
• estimates the effect of treatment as a log odds ratio,
• assumes that the number of deaths in each treatment in each trial follows a binomial distribution and
• does not depend on an approximation based
(c) Perform a random-effect meta-analysis that meets the same criteria. Compare the results with those of the corresponding analysis based on the normal approximation, and suggest reasons for the discrepancies that you find.
8.1 A multi-centre study to compare two anaesthetic agents (A and B) in patients undergoing short surgical procedures was reported by Whitehead (2002, Section 3.6, pp. 49–55). The response variable considered was the recovery time in minutes, transformed to logarithms. The results are presented in
(a) Analyse these data, specifying both ‘centre’ and ‘treatment’ as fixed-effect terms, with no centre × treatment interaction term. State whether there is significant evidence of a difference between the effects of the two treatments, and if so, which one appears to be more effective in
(b) Extend your model so as to test whether there is evidence that the difference between the effects of the two treatments varies among centres.
(c) Change your model so as to specify ‘centre’ as a random-effect term. Change other aspects of the model accordingly. Comment on the results from this model, and the ways in which they differ from those of the model fitted in Part (a).
It is desired to partition the effect of treatment into
(d) Confirm that the variable ‘b’ holds the mean value of ‘treatment’ in each centre, when the treatments are coded as 0 and 1. Confirm that for each patient w = treatment − b.
Table 8.6 Recovery time (minutes, log-transformed) after anaesthesia in a multi-centre study to compare two
(e) Fit a model in which both ‘centre’ and ‘treatment’ are specified as fixed-effect terms but the centre × treatment interaction is specified as a random-effect term. Confirm that this gives equivalent results to the partitioning of the treatment effect in Part (d).
(f) Re-test the significance of the centre × treatment interaction term using the deviance accounted for by this term. Compare the result of this test with that performed in Part (b).
(g) Obtain the shrunk mean log(recovery time) for each treatment in each centre produced by the analysis performed in Part (e). Hence obtain a shrunk estimate of the difference between the effects of the two treatments in each centre. Make a graphical comparison between these shrunk differences and
(h) Perform a fixed-effect meta-analysis on the summary data presented in Table 8.6. Present the results of this analysis in a forest plot.
(i) Perform the corresponding random-effect meta-analysis on the summary data. Present the results of this analysis in a forest plot. Comment on the differences between the results of these two analyses and their relationship to the results of the corresponding analyses on the individual-patient
8.2 A meta-analysis of seven randomized clinical trials that studied the effect of aspirin when given to heart-attack patients was reported by Fleiss and Gross (1991). In each trial, the treatments were aspirin and placebo, and the outcome considered was death or survival: in each arm of each trial,
(a) Perform fixed- and random-effect meta-analyses of the effect of aspirin as measured by ‘diff’.
(b) Perform fixed- and random-effect meta-analyses of the effect of aspirin as measured by ‘logOR’. Comment on the results and compare them with those obtained from the analysis of ‘diff’.
(c) Comment on the relative merits of the difference between the proportions and the odds ratio as measures of the size of a treatment effect.
Alternative analyses that do not depend on the normal approximation are specified in Exercise 10.7.
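The normal-approximation analyses asked for in Exercise 8.2 can be sketched with inverse-variance weighting. This is a minimal sketch under stated assumptions: the between-trial variance is estimated here with the DerSimonian–Laird moment estimator, a simple stand-in that may differ from the mixed-model (REML) estimate the book's method would give, and the log odds ratio uses the usual large-sample variance (the sum of the reciprocals of the four cell counts). All function names are hypothetical, and no trial data are included because the table is not reproduced on this page.

```python
import math


def log_odds_ratio(deaths_trt, n_trt, deaths_ctl, n_ctl):
    """Log odds ratio and its large-sample variance from one trial's 2x2 table."""
    a, b = deaths_trt, n_trt - deaths_trt
    c, d = deaths_ctl, n_ctl - deaths_ctl
    estimate = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return estimate, variance


def fixed_effect(estimates, variances):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def dersimonian_laird_tau2(estimates, variances):
    """Moment estimate of the between-trial variance (truncated at zero)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled, _ = fixed_effect(estimates, variances)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(estimates) - 1)) / c)


def random_effect(estimates, variances):
    """Pooled estimate, standard error and tau^2 under a random-effect model."""
    tau2 = dersimonian_laird_tau2(estimates, variances)
    pooled, se = fixed_effect(estimates, [v + tau2 for v in variances])
    return pooled, se, tau2
```

When the trials agree closely the estimated between-trial variance is zero and the two analyses coincide. When they disagree, the random-effect weights become more nearly equal and the standard error of the pooled estimate increases.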
8.3 Data have been generated that satisfy the equation
yij = β0i + β1ixij + εij. (8.32)
The values in this equation are specified as follows:
• the β0i are values of a random variable B0 that has the distribution B0 ∼ N(2.65, 0.282)
• the β1i are values of a random variable B1
(a) By fitting a regression model to each sample (i.e. to the values of X and Y for a common value of i), obtain estimates of the β1i, designated b1i, and their SEs, designated SEb1i. Obtain the degrees of freedom and the value of MSResid from each regression analysis.
(b) For each sample, obtain the statistic t = b1i/SEb1i. Plot histograms of b1 (i.e. of all the b1i), SEb1 and t, and comment on the distributions of these statistics.
(c) Obtain the two-tailed p-value corresponding to each t statistic. Produce a Q–Q plot of the p-values, transformed to −log(p). Mark the Šidák-corrected significance threshold on this plot and determine the number of samples in which the association between X and Y survives this correction
(d) For each sample, calculate abs(b1i − mean(b1)), where mean(b1) indicates the mean of the b1i. Plot these values against SEb1i and obtain the correlation coefficient (r) between these two variables. Obtain the one-sided p-value for the null hypothesis ρ = 0, with the alternative hypothesis
(e) Use Equation 8.17 to obtain an approximate value of var(????2). From the way in which the data were specified, what do you know of the true value of var(????2)? For each sample, use Equation 8.18 to obtain shrunk s2i, and Equation 8.21 to obtain ????i. Obtain median(????) and var(b), and hence
(g) For all samples, plot b1i against β1i. Obtain the correlation coefficient (r) between these two variables and the p-value associated with r, specified in the same way as in Part (d).
(h) For all samples, plot shrunk b1i against β1i. Obtain the correlation coefficient between these two variables and the associated p-value, specified as in Part (d). Compare these results with those obtained in Part (g).
(i) Obtain var(b) and var(shrunk b) and compare them with var(β). Comment on the relationships among these variances.
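Part (c) of Exercise 8.3 asks for a Šidák-corrected significance threshold. A minimal sketch of that calculation follows, using the standard Šidák formula 1 − (1 − α)^(1/m) for m independent tests. The function names are hypothetical, and the base-10 logarithm is an assumption, since the exercise writes only −log(p).

```python
import math


def sidak_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error
    rate at alpha across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


def neg_log10(p_value):
    """Transform a p-value to the -log10 scale used on a Q-Q plot
    (base 10 is an assumption; the exercise does not state the base)."""
    return -math.log10(p_value)


# Hypothetical example: 100 samples tested at a family-wise alpha of 0.05.
threshold = sidak_threshold(0.05, 100)
print(threshold, neg_log10(threshold))
```

A sample survives the correction when its p-value falls below the threshold, or equivalently when its transformed value lies above the transformed threshold on the plot.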
8.4 (a) The file ‘multiple y v x, vary n.xlsx’, available on this book’s website (see Preface), contains data generated in the same way as those used in Exercise 8.3, except that the sample size varies: that is, the number of values of j varies between the values of i. Repeat Exercise 8.3
10.1 The seeds of some species of clover (Trifolium spp.) will not germinate immediately after ripening, but must undergo a period of ‘softening’, by exposure to fluctuating high and low temperatures, usually on the soil surface. In an investigation of this phenomenon, seeds of eight clover
As the species studied all belong to the genus Trifolium they are expected to have some characteristics in common, but no prior information is available to us concerning the seed-softening behaviour of the individual species.
(a) Decide whether ‘species’ should be specified as a fixed-effect or a random-effect term in the model to be fitted to these data and explain your decision.
(b) Specify a regression model for this experiment. Following your decision concerning ‘species’, which term(s) in the model should be regarded as fixed and which as random? What is the response variable?
(c) Fit your mixed model to the data, specifying an appropriate error distribution for the response variable, and an appropriate link function to relate the response variable to the linear model.
(d) Consider whether there is evidence that any terms can be omitted from the model. If so, fit the modified model to the data.