Questions and Answers of Bayesian Statistics An Introduction

Apply a three-group discrete mixture model to the baseball average data. A two-group mixture, with code as below, gives an LPML (for the original data) of -726.5, with plots of \(p_{\text{new}}\)
Many evaluations of Poisson mixture models consider aggregated data, for example numbers of consumers making 0,1,2, etc. purchases. Brockett et al. (1996) present such data for purchases of 'salty
A request in the UK parliament (http://www.theyworkforyou.com/wrans/?id=2011-03-07c.44095.h) related to 2009 mortality rates (per 100000 population) in Wales according to income decile of
Apply the normal-latent beta model to data on short-term changes in depression ratings (Tarpey and Petkova, 2010). Such changes in depression are unlikely to be due to the pharmacological
In Example 3.1, adapt the code to include posterior predictive \(p\)-tests to assess skewness and kurtosis in the residuals. For example, the \(p\)-test for skewness would compare a skew measure
In Example 3.2 (student attainment), standardize the SATM-score predictor to provide values \(x_{i}^{s}\). Then replace the conventional prior by a CMP prior at values \(\tilde{x}_{1}=\left(1,
Using data originally from Mullahy (1997) on smoking consumption (cigarettes smoked per day) by \(\mathrm{n}=807\) subjects, assess the suitability of a Poisson regression using a posterior
Using data on Irish education transitions (http://lib.stat.cmu.edu/datasets/irish.ed) compare logit and probit regression using the augmented data method and the dbern.aux function. Take \(P=2\)
Apply the Kuo-Mallick model to predictor selection in the nodal involvement data using a uniform prior on \(k\) between the extremes \(k=0.5\) and \(k=4\). So the variance in the normal prior for
In Example 3.6 use the WinBUGS jump RJMCMC interface to obtain the highest posterior probability model for the Chevrolet asking price data, assuming a prior \(P_{\text {ret }} \sim\)
In Example 3.7, define quantities \(s_{1}=\exp \left(\alpha_{3}+\beta_{33}\right)\) and \(s_{2}=\exp \left(\alpha_{2}+\beta_{32}\right)\), and by monitoring them obtain the probability that
In Example 3.8, compare models 1 and 2 (standard conditional logistic and nested logit) using LPML and DIC criteria, and also predictive classification success: how far predicted choice obtained by
In Example 3.9 (political involvement), compare predictive accuracy, DIC and LPML between the proportional odds logistic model and a model allowing the regression effect to differ by response
Compare the suitability of ordinal logistic and ordinal probit regression for the political involvement data using the posterior predictive criteria\[\operatorname{Pr}\left(z_{\text {rep },
In Example 4.1, estimate a variance transformation model with \(\log \left(\sigma_{i}^{2}\right)=\xi_{0}+\xi_{1} \mu_{i}\), and compare its fit with a constant variance normal linear regression using
Fit a Student \(t\) linear regression to astronomy data from Rousseeuw (1991) relating to the star cluster CYG OBI, which contains 47 stars in the direction of Cygnus. The predictor (
In Example 4.3, apply the existing ZIP regression, but using a mixed predictive check based on replicate \(t_{\text{rep}, i} \sim \operatorname{Bern}(\zeta)\). This should show the proportion of
In Example 4.3 (hospital visits), apply a ZIP regression with the seven predictors now used to predict both \(\zeta_{i}\) and \(\mu_{i}\), and additionally with predictor selection (e.g. using
In Example 4.5 (byssinosis incidence), modify the existing code (for the mixture family and logit options) to find the probabilities of \(20 \%\) or more incidence in each of the 18 cells (analogous
In Example 4.7 (motorcycle data) use a one-dimensional thin plate spline function with random coefficients to model the non-linear effect. Use the mixed replicate predictive scheme to assess the
For the aspirin use data in Example 5.3, consider a scale mixing model as an alternative to normal random effects at the second stage. This is equivalent to a Student \(t\) second stage. Thus where
In the bivariate meta-analysis of data on true positives and negative \(\mathrm{CT}\) scan diagnoses, adopt a scale mixture of normals for the \(\theta_{i}=\left(\theta_{i 1}, \theta_{i 2}\right)\)
In Example 5.6 (popularity data), and retaining separate univariate normal priors (i.e. model 1) on the cluster effects at stage 2 , assess the impact on inferences, such as on posterior means of
In Example 5.6, and assuming a Wishart prior (\(R=0.01 I\) as the scale matrix) on the second stage precision matrix (model 3), assess the assumption of level 1 homoscedasticity. It is suggested
In the bivariate multilevel analysis of the health outcome data, assess normality of the cluster effects in the third model. For example, one may obtain posterior means of \(\left\{\beta_{j 10},
Consider data involving replicate measures \(i=1,2\) of anchovy larvae counts \(y_{i j}\) over \(j=1, \ldots, 49\) larvae pairs (Booth et al., 2003). Predictors in a negative binomial regression are
Consider data on 3 month market yield on US Treasury securities from January 1983 to December 2012 (\(T=360\)) (from wikiposit). Classical estimation (e.g. using the arima routine in R)
Apply the same model sequence used in Example 6.2 to the extended velocity series (1869-1988) (e.g. Koop and Steel, 1994). This series is identified as non-stationary (with significant probability
In Example 6.3, estimate an MA(1) model in the differences \(z_{t}\) of the wholesale prices index using both methods applied in that example. Data from Example 6.3 Consider data from Enders (2004, p.
In Example 6.6 (benefit claimants), re-estimate the INAR models to include a latent pre-series data point, by setting a prior \(y_{0} \sim \operatorname{Po}\left(\omega_{0}\right)\), where
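In Example 6.6 (benefits claimants), estimate the following autoregressive model (Fokianos, 2012), using the data from Example 6.6: \(y_{t} \sim \operatorname{Po}\left(\mu_{t}\right)\), \(\nu_{t}=\log \left(\mu_{t}\right)\), \(\nu_{t}=X_{t} \beta+\rho \log \left(y_{t-1}+1\right)+\phi \nu_{t-1}\), where \(X_{t}\) includes the seasonal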
Consider the prediction of months classed as economic recession \(\left(y_{t}=1\right)\) as against growth \(\left(y_{t}=0\right)\) for the US (the data are contained in the file Exercise 6.6.odc)
In Example 6.8, fit a true series evolving according to an RW(3) model such that \[\theta_{t}=3 \theta_{t-1}-3 \theta_{t-2}+\theta_{t-3}+\omega_{t}\] and assess fit against the RW(2) model. Data from
The observed Intel monthly returns for 2004 (the first comparing January 2004 with December 2003) are \(-0.047,-0.042,-0.069,-0.054,0.112,-0.033,-0.117\), \(-0.125,-0.058,0.109,0.007\), and 0.045.
In Example 6.13 (Hepatitis A) estimate a latent category Markov chain model with \(m=4\) states, and identify the time points with highest probabilities of shifts into the highest incidence
In Example 6.14 (US unemployment), fit model A with an unknown variance for the level shifts \(v_{t}\). How does this affect the ratio of maximum to minimum CPO? Data from Example 6.14 As an
Consider the lynx data available in the \(\mathrm{R}\) package tsDyn. These data consist of annual totals of Canadian lynx trapped in the Mackenzie River district of NW Canada during 1821-1934.
In Example 7.2 include a posterior predictive check (e.g. using the Jarque-Bera criterion) of the normality of the permanent effects \(\left(b_{i}\right)\) in model 2. Data from Example 7.2 As an
In Example 7.3 investigate whether an improved fit results from making all the lag coefficients random (including lags 1 and 3 ), and with all mean lag coefficients unknown (including the mean of
In Example 7.3, assess sensitivity of predictive fit (the sum of squared deviations between observations and predictive replicates) and inferences regarding the diet coefficient \(\beta\) to
In Example 7.4 on employment histories, estimate model 2 using a lag in \(y\) rather than \(Z\), and also apply the random effects selection of Section 7.5, to assess whether there is exclusive true
In model 1 of Example 7.5, try a scale mixture version for the random effects \(b_{i, 1: 2}\) with degrees of freedom an unknown (equivalent to a bivariate Student \(t\) ). The analysis may be
In Example 7.9, include a predictor selection mechanism (see Chapter 3) in the MNAR version of the missing data logit regression for \(\eta_{i}\) (under a selection model approach). How does this
In Example 7.9, include code to derive the posterior predictive loss under the two selection models (MAR and MNAR missing data options) using the criterion \(H_{i}=\) \(R_{i}\left(y_{i 2}-y_{i
Estimate a spatial errors model (without predictors) for the London TB data using a grid prior for the autocorrelation parameter, and compare its LPML with that under the pure spatial lag model.
A weighted grid prior for the spatial correlation coefficient in spatial lag and error models can be specified (Lesage, 1997) as\[\pi(\lambda) \propto(s+1) \lambda^{s} I_{(0,1)}(\lambda)\]where the
Estimate the spatial error model for the Columbus crime data retaining the grid prior and special likelihood approach, but with Student \(t\) distributed disturbances \(u_{i}\). The precision for the
Estimate the convolution model, together with predictors \(X_{1}-X_{4}\) for the English crime rate data, namely\[\log \left(\theta_{i}\right)=\alpha+X_{i} \beta+\epsilon_{i}+u_{i}\]where the
In Example 8.6, assess the impact on the profile of neighbourhood psychosis prevalence rates per 1000, \(R_{k}=1000 \theta_{k, \text{nei}}\) (e.g. levels of skewness in such prevalence rates) and
Apply a 3PL model (including guessing parameters) to the Law School Admission Test (LSAT) data in the R program ltm. The 2PL model may be fitted with the code ~ (F[i] ~ dnorm (0,1) dbern (pi [i,m])
Writing for the least-squares and ridge regression estimators for regression coefficients θ, show thatwhile its variance-covariance matrix is Deduce expressions for the sum /of the squares of the
In Section 3.8, we discussed Newcomb's observation that the front pages of a well-used table of logarithms tend to get dirtier than the back pages do. What if we had an antilogarithm table, that is,
With the data in the example in Section 3.4 on 'The Poisson distribution', would it be appropriate to reject the hypothesis that the true mean equalled the prior mean (i.e. that λ = 3)? [Use
Two different microscopic methods, A and B, are available for the measurement of very small dimensions in microns. As a result of several such measurements on the same object, estimates of variance
A report issued in 1966 about the effect of radiation on patients with inoperable lung cancer compared the effect of radiation treatment with placebos. The numbers surviving after a year were: What
From the approximation which holds for large n, deduce an expression for the log-likelihood L(ρ | x, y) and hence show that the maximum likelihood occurs when ρ = r. An approximation to the
Suppose that the density function p(x|θ) is defined as follows for x = 1, 2, 3, ... and θ = 1, 2, 3, .... If θ is even, then Show that, for any x the data intuitively give equal support to the
Suppose that x1, x2,... is a sequential sample from an N(θ, 1) distribution and it is desired to test H0 : θ = θ0 versus H1 : θ ≠ θ0. The experimenter reports that he used a proper stopping
Show that in importance sampling the choice p(x) = |f(x)| q(x) / ∫ |f(ξ)| q(ξ) dξ minimizes the variance of w(x) even in cases where f(x) is not of constant sign.
Suppose that x ~ C(0, 1) has a Cauchy distribution. It is easily shown that η = P(x > 2) = tan⁻¹(½)/π = 0.147 583 6, but we will consider Monte Carlo methods of evaluating this probability. (a)
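As an illustration of the kind of comparison this question points to, here is a minimal Python sketch of two Monte Carlo estimators of P(x > 2). The parts of the exercise are not shown in the listing, so the two estimators are assumptions chosen for illustration: a crude hit-or-miss estimator, and one based on the substitution y = 1/x, which turns the tail probability into the integral of 1/(π(1 + y²)) over (0, ½). The function names, seed, and sample sizes are hypothetical.

```python
import math
import random


def crude_estimate(n, rng):
    """Fraction of n standard Cauchy draws that exceed 2 (hit-or-miss)."""
    hits = 0
    for _ in range(n):
        # Inverse-CDF draw from C(0, 1): x = tan(pi * (u - 1/2)).
        x = math.tan(math.pi * (rng.random() - 0.5))
        if x > 2.0:
            hits += 1
    return hits / n


def transformed_estimate(n, rng):
    """Average of (1/2) / (pi * (1 + y^2)) with y uniform on (0, 1/2).

    This uses the substitution y = 1/x in the tail integral, so every
    draw contributes and the integrand is nearly flat.
    """
    total = 0.0
    for _ in range(n):
        y = 0.5 * rng.random()
        total += 0.5 / (math.pi * (1.0 + y * y))
    return total / n


exact = math.atan(0.5) / math.pi  # value quoted in the question
rng = random.Random(1)            # arbitrary seed for reproducibility
crude = crude_estimate(100_000, rng)
transformed = transformed_estimate(100_000, rng)
print(exact, crude, transformed)
```

With the same number of draws, the transformed estimator should sit much closer to the exact value, because its integrand varies only between roughly 0.127 and 0.159.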
Apply sampling importance re-sampling starting from random variables uniformly distributed over (0, 1) to estimate the mean and variance of a beta distribution Be(2,3).
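One way to approach this is sketched below in Python, under stated assumptions: draws from U(0, 1) are weighted by the unnormalized Be(2, 3) density x(1 - x)², then resampled with replacement in proportion to those weights. The function name `sir_beta`, the seed, and the sample sizes are hypothetical choices, not taken from the book. For reference, Be(2, 3) has mean 2/5 = 0.4 and variance 6/150 = 0.04.

```python
import random
import statistics


def sir_beta(a, b, n_draws, n_resample, rng):
    """Sampling importance re-sampling for Be(a, b) from a U(0, 1) proposal."""
    draws = [rng.random() for _ in range(n_draws)]
    # Unnormalized target density over the (constant) proposal density.
    weights = [x ** (a - 1) * (1 - x) ** (b - 1) for x in draws]
    # Resample with replacement, with probability proportional to weight.
    return rng.choices(draws, weights=weights, k=n_resample)


rng = random.Random(2)  # arbitrary seed for reproducibility
sample = sir_beta(2, 3, 50_000, 10_000, rng)
sir_mean = statistics.fmean(sample)
sir_var = statistics.variance(sample)
print(sir_mean, sir_var)
```

The resample is taken smaller than the initial sample so that duplicated values are less frequent; the 5:1 ratio here is only an illustrative choice.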
Apply the methodology used in the numerical example in Section 10.2 to the data set used in both Exercise 16 on Chapter 2 and Exercise 5 on Chapter 9. Exercise 16: Suppose that the results of a certain
Find the Kullback-Leibler divergence (q : p) when p is a binomial distribution B(n, π) and q is a binomial distribution B(n, ρ). When does (q : p) = (p : q)?
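A candidate answer can be checked numerically. The Python sketch below assumes the convention that (q : p) means the sum over k of q(k) log(q(k)/p(k)); if the book uses the opposite convention, the two arguments simply swap. Under that assumption, the divergence between two binomials of the same index n is n times the divergence between the corresponding Bernoulli distributions, and the code compares this closed form with direct summation. The function names and test values are hypothetical.

```python
import math


def binom_pmf(k, n, prob):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * prob ** k * (1 - prob) ** (n - k)


def kl_direct(n, prob_q, prob_p):
    """Sum over k of q(k) * log(q(k) / p(k)) for two binomials of index n."""
    total = 0.0
    for k in range(n + 1):
        qk = binom_pmf(k, n, prob_q)
        pk = binom_pmf(k, n, prob_p)
        total += qk * math.log(qk / pk)
    return total


def kl_closed(n, prob_q, prob_p):
    """Closed form: n times the Bernoulli divergence."""
    return n * (
        prob_q * math.log(prob_q / prob_p)
        + (1 - prob_q) * math.log((1 - prob_q) / (1 - prob_p))
    )


direct = kl_direct(10, 0.3, 0.6)
closed = kl_closed(10, 0.3, 0.6)
# The two directions agree when the success probabilities sum to one.
forward = kl_closed(10, 0.3, 0.7)
backward = kl_closed(10, 0.7, 0.3)
print(direct, closed, forward, backward)
```

The last two lines illustrate one case in which the divergence is symmetric; the sketch does not attempt to show that this, together with the trivial case of equal parameters, exhausts the possibilities.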
Find the Kullback-Leibler divergence (q : p) when p is a normal distribution N(µ, ϕ) and q is a normal distribution N(ν, ψ).
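As with the binomial case, a derived formula can be verified numerically. The Python sketch below assumes that the second parameter of N(·, ·) is a variance and that (q : p) denotes the integral of q log(q/p); both are assumptions about the book's notation. Under them the closed form is ½[log(ϕ/ψ) + ψ/ϕ + (ν - µ)²/ϕ - 1], which the code compares with a midpoint-rule integration. The function names, grid width, and test values are hypothetical.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var), with var a variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def kl_numeric(nu, psi, mu, phi, steps=20_000):
    """Midpoint-rule approximation to the integral of q * log(q / p).

    q is N(nu, psi) and p is N(mu, phi); the grid spans nu plus or minus
    12 standard deviations of q, beyond which q is negligible.
    """
    half_width = 12 * math.sqrt(psi)
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = nu - half_width + (i + 0.5) * h
        log_q = log_normal_pdf(x, nu, psi)
        total += math.exp(log_q) * (log_q - log_normal_pdf(x, mu, phi)) * h
    return total


def kl_closed(nu, psi, mu, phi):
    """Closed form under the stated convention."""
    return 0.5 * (math.log(phi / psi) + psi / phi + (nu - mu) ** 2 / phi - 1)


numeric = kl_numeric(1.0, 2.0, 0.0, 3.0)
closed = kl_closed(1.0, 2.0, 0.0, 3.0)
print(numeric, closed)
```

Working with log densities inside the integrand avoids taking the logarithm of a ratio of two very small numbers in the tails.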
Let p be the density 2(2π)^(-1/2) exp(-½x²) (x > 0) of the modulus x = |z| of a standard normal variate z and let q be the density β^(-1) exp(-x/β) (x > 0) of an E(β) distribution. Find the
The paper by Corduneanu and Bishop (2001) referred to in Section 10.3 can be found on the web at http://research.microsoft.com/pubs/67239/bishop-aistats01.pdf. Hardle's data set is available in R by
Carry out the calculations in Section 10.4 for the genetic linkage data quoted by Smith which was given in Exercise 3 on Chapter 9. Exercise 3: Smith (1969, Section 21.10) quotes an example on genetic
A group of n students sit two exams. Exam one is on history and exam two is on chemistry. Let xi and yi denote the ith student's score in the history and chemistry exams, respectively. The following
Suppose that, in a Markov chain with just two states, the probabilities of going from state i to state j in one time unit are given by the entries of the matrix in which i represents the row and j
Smith (1969, Section 21.10) quotes an example on genetic linkage in which we have observations x = (x1, x2, x3, x4) with cell probabilities The values quoted are x1 = 461, x2 = 130, x3 = 161 and x4
Dempster et al. (1977) define a generalized EM algorithm (abbreviated as a GEM algorithm) as one in which Q(θ(t+1), θ(t)) ≥ Q(θ(t), θ(t)). Give reasons for believing that GEM algorithms converge to
In question 16 in Chapter 2, we supposed that the results of a certain test were known, on the basis of general theory, to be normally distributed about the same mean µ, with the same variance ϕ,
A textile company weaves a fabric on a large number of looms. Four looms selected at random from those available, and four observations of the tensile strength of fabric woven on each of these looms
Write computer programs in C++ equivalent to the programs in R in this chapter.
Use the data augmentation algorithm to estimate the posterior density of the parameter η in the linkage model in question 3. Question 3: Smith (1969, Section 21.10) quotes an example on genetic linkage
Find the mean and variance of the posterior distribution of θ for the data in question 5 mentioned earlier using the prior you derived in answer to that question by means of the Gibbs sampler (
The following data represent the weights of r = 30 young rats measured weekly for n = 5 weeks as quoted by Gelfand et al. (1990), Tanner (1996, Table 1.3 and Section 6.2.1), Carlin and Louis (2000,
In bioassays, the response may vary with a covariate termed the dose. A typical example involving a binary response is given in the following table, where R is the number of beetles killed after 5
Observations x1, x2, ..., xn are independently distributed given parameters θ1, θ2, ..., θn according to the Poisson distribution p(xi ∣ θ) = θi^xi exp(-θi)/xi!. The prior distribution for θ is
Carry out the Bayesian analysis for known overall mean developed in Section 8.2 mentioned earlier (a) with the loss function replaced by a weighted mean and (b) with it replaced by 2(0,6) = Σω;;
Compare the effect of the Efron-Morris estimator on the baseball data in Section 8.3 with the effect of a James-Stein estimator which shrinks the values of πi towards π0 = 0.25 or equivalently
The Helmert transformation is defined by the matrix so that the element aij in row i, column j is It is also useful to write aj for the (column) vector which consists of the jth column of the matrix
Show that R(θ, θ̂^JS+) < R(θ, θ̂^JS) where (Lehmann 1983, Section 4.6, Theorem 6.2).
Show that the matrix H in Section 8.6 satisfies B^T H^(-1) B = 0 and that if B is square and non-singular then H^(-1) vanishes.
Consider the following particular case of the two way layout. Suppose that eight plots are harvested on four of which one variety has been sown, while a different variety has been sown on the other
Generalize the theory developed in Section 8.6 to deal with the case where x ~ N(Aθ, ф) and θ ~ N and knowledge of µ is vague to deal with the case where µ ~ N(Cv, K) (Lindley and Smith,
Find the elements of the variance-covariance matrix Σ for the one way model in the case where ni = n for all i.
Show that in any experiment E in which there is a possible value y for the random variable X̃ such that PX̃(y∣θ) = 0, then if z is any other possible value of X̃, the statistic t = t(x) defined
Consider an experiment E = {X̃, θ, p(x|θ)}. We say that censoring (strictly speaking, fixed censoring) occurs with censoring mechanism g (a known function of X̃) when, instead of X̃, one
A drunken soldier, starting at an intersection O in a city which has square blocks, staggers around a random path trailing a taut string. Eventually, he stops at an intersection (after walking at
Suppose that, starting with a fortune of f0 units, you bet a units each time on evens at roulette (so that you have a probability of 18/37 of winning at Monte Carlo or 18/38 at Las Vegas) and keep a
Let x1, x2, ... be a sequential sample from a Poisson distribution P(λ). Suppose that the stopping rule is to stop sampling at time n ≥ 2 with probability for n = 2, 3, ... (define 0/0 = 1).
Show that the mean of the beta-Pascal distribution is given by the formula in Section 7.3, namely, = (S) p(S|R, r, s) = B(r" +s, R" − r" + S − s) B(r"-1, R" - r")
Suppose that you intend to observe the number x of successes in n Bernoulli trials and the number y of failures before the nth success after the first n trials, so that x ~ B (n,π) and y ~ NB(n,π).
The negative of loss is sometimes referred to as utility. Consider a gambling game very unlike most in that you are bound to win at least £2, and accordingly in order to be allowed to play, you
Suppose that you want to estimate the parameter π of a binomial distribution B(n, π). Show that if the loss function is then the Bayes rule corresponding to a uniform [i.e. Be(1, 1)] prior for π is
Let x ~ B(n, π) and y ~ B(n, p) have independent binomial distributions of the same index but possibly different parameters. Find the Bayes rule corresponding to the loss when the priors for π and p
Investigate possible point estimators for π on the basis of the posterior distribution in the example in the subsection of Section 2.10 headed 'Mixtures of conjugate densities'.
Find the Bayes rule corresponding to the loss function L(θ, a) = u(a - θ) if a ≤ θ, and L(θ, a) = v(θ - a) if a > θ.
Suppose that your prior for the proportion π of defective items supplied by a manufacturer is given by the beta distribution Be(2, 12), and that you then observe that none of a random sample of size
