Question:
In Example 4.5 (byssinosis incidence), modify the existing code (for the mixture family and logit options) to find the probabilities of \(20 \%\) or more incidence in each of the 18 cells (analogous to an LD20 rate). The observed incidence for long employment, positive smoking status, and dusty workplaces in fact exceeds \(25 \%\), and a model better tuned to the data will predict this high incidence.
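One way to read the task: at each MCMC iteration, record an indicator of whether a cell's incidence probability is at least 0.20, then average the indicators over iterations. Below is a minimal Python sketch of that calculation, assuming the sampled incidence probabilities have been exported as one row per iteration and one column per cell. The function name `exceedance_prob` and the toy draws are illustrative only and are not taken from Example 4.5.

```python
def exceedance_prob(draws, threshold=0.20):
    """Posterior probability that each cell's incidence is >= threshold.

    draws: list of rows, one per MCMC iteration; each row holds the sampled
    incidence probability for every cell (hypothetical exported output).
    """
    n_iter = len(draws)
    n_cells = len(draws[0])
    # For each cell, average the indicator I(p >= threshold) over iterations.
    return [
        sum(row[j] >= threshold for row in draws) / n_iter
        for j in range(n_cells)
    ]

# Toy draws for two cells (made-up numbers, NOT the byssinosis data).
toy_draws = [
    [0.10, 0.30],
    [0.20, 0.25],
    [0.15, 0.19],
    [0.05, 0.40],
]
toy_probs = exceedance_prob(toy_draws)  # -> [0.25, 0.75]
```

In BUGS-style code the same indicator is usually monitored directly with the `step()` function, for example `step(p[i] - 0.2)`, whose posterior mean is the required probability; the node name `p[i]` is an assumption about how the existing code labels the cell probabilities.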
Data from Example 4.5
An augmented data linear regression is applied to binary responses for n = 139 pregnant women, namely y = 1 if they planned to breastfeed their babies, and y = 0 for bottlefeeding. The data are of interest in containing several poorly fitted cases when standard regression techniques are applied, while application of robust regression methods may affect inferences on significant predictor effects (Heritier et al., 2009). There are nine predictors, seven of which are binary: x1, stage in pregnancy (beginning or end); x2, how the subject was themselves fed as infants (1 = some/all breastfeeding, or 0 = only bottle); x3, how the subject's friends fed their babies (some/all breastfeeding, or only bottle); x4, if subject has a partner or not; x5, age; x6, age at which left full time education; x7, ethnic group (1 = non-white, 0 = white); x8, whether subject currently smokes (1 = yes, 0 = no), and x9, whether subject has ever smoked. A conventional augmented normal linear regression (equivalent to probit regression) shows significant positive effects of x3 and x7, and a negative effect of x8, though collinearity is indicated in an unexpected positive effect for x9. Augmented Student t regression is then applied, namely \(z_i \sim N(X_i \beta, 1/\lambda_i)\, I(A_i, B_i)\); \(\lambda_i \sim \mathrm{Ga}(0.5\nu, 0.5\nu)\).
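The two latent-variable updates implied by an augmented Student t model of this kind can be sketched in Python. This is a minimal illustration using the standard full conditionals for the scale-mixture form of the Student t, not the book's code. It assumes the truncation convention that the latent variable is positive when y = 1 and non-positive when y = 0 (the bounds in the indicator term are not spelled out in the excerpt), and the function names `draw_latent` and `draw_scale` are hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf

def draw_latent(y, mu, lam, rng):
    """Draw z ~ N(mu, 1/lam), truncated to z > 0 if y == 1, else z <= 0.

    Uses inverse-CDF sampling. mu is the linear predictor X_i * beta and
    lam is the case-specific precision (scale-mixture weight).
    """
    sd = lam ** -0.5
    p0 = _STD_NORMAL.cdf((0.0 - mu) / sd)  # untruncated P(z <= 0)
    u = rng.random()
    # Map u onto the part of the CDF allowed by the observed response.
    q = p0 + u * (1.0 - p0) if y == 1 else u * p0
    # inv_cdf needs 0 < q < 1; in extreme tails this clamp makes the
    # truncation only approximate.
    q = min(max(q, 1e-12), 1.0 - 1e-12)
    return mu + sd * _STD_NORMAL.inv_cdf(q)

def draw_scale(z, mu, nu, rng):
    """Draw lambda from Ga((nu + 1)/2, rate = (nu + (z - mu)^2)/2)."""
    shape = 0.5 * (nu + 1.0)
    rate = 0.5 * (nu + (z - mu) ** 2)
    # random.gammavariate takes (shape, scale), so pass scale = 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate)

# One sweep for a single made-up case (values are illustrative only).
rng = random.Random(1)
z_new = draw_latent(y=1, mu=-0.5, lam=1.0, rng=rng)
lam_new = draw_scale(z_new, mu=-0.5, nu=4.0, rng=rng)
```

A full sampler would alternate these draws with an update of the regression coefficients given the latent values and the weights. Cases that receive small sampled weights are those the normal model fits poorly, which is how the Student t form downweights outliers.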