A five-year follow-up study was carried out in a certain metropolitan area to assess the relationship of diet and weight to the incidence of stomach cancer. Data were obtained on n = 2,000 subjects. The variables of interest for these data were
T = time (in months) until stomach cancer (SCA) was detected or time (in months) until either the subject was lost to follow-up or the study ended (often called the censoring time);
ST = event indicator status (1 if SCA detected, 0 if SCA not detected);
WTGP = weight group (1 = low, 2 = medlow, 3 = medhigh, 4 = high), with "low" the referent group;
DT = diet type (1 = high fiber diet, 2 = medium fiber diet, 3 = low fiber diet), with "high fiber diet" the referent group;
GEN = gender (0 = male, 1 = female);
AGEGP = age group (1 = 40-54 years, 2 = 55-69 years, 3 = 70+ years), with 40-54 years being the referent group.
Suppose that one considers doing a Poisson regression analysis to assess the effects of diet type and weight on the development of stomach cancer (SCA), controlling for age and gender. To carry out such an analysis, organize the data as follows:
Step 1 Form combinations of categories over all four predictors (WTGP, DT, GEN, AGEGP) being considered; these category combinations will define the subgroups to be analyzed using Poisson regression. Since there are four categories of WTGP, three categories of DT, two categories of GEN, and three categories of AGEGP, the total number of subgroups will be 4 × 3 × 2 × 3 = 72.
Step 2 For the 72 subgroups, count the number of persons who develop SCA in each subgroup, and denote this count variable as Y. Also, sum up the person-time information over all the persons in each subgroup, and call this variable PT.
Step 3 Use the 72 Y values as the counts and the 72 PT values as the person-time information to fit Poisson regression models to these data.
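
Steps 1 and 2 can be sketched in a few lines of Python. This is only a minimal illustration, assuming the subject-level data arrive as dictionaries keyed by the variable names defined above; the function name `aggregate` and the three toy records are hypothetical and are not part of the study data.

```python
from itertools import product

# Category codes as defined in the variable list above.
WTGP_LEVELS = (1, 2, 3, 4)
DT_LEVELS = (1, 2, 3)
GEN_LEVELS = (0, 1)
AGEGP_LEVELS = (1, 2, 3)


def aggregate(subjects):
    """Collapse subject-level records into one (Y, PT) pair per subgroup.

    `subjects` is an iterable of dicts with keys T, ST, WTGP, DT, GEN, AGEGP.
    """
    # Step 1: start each of the 4 x 3 x 2 x 3 = 72 cells at zero,
    # so that subgroups with no subjects still appear in the output.
    cells = {
        key: {"Y": 0, "PT": 0.0}
        for key in product(WTGP_LEVELS, DT_LEVELS, GEN_LEVELS, AGEGP_LEVELS)
    }
    # Step 2: add each subject's event indicator and follow-up time to its cell.
    for s in subjects:
        key = (s["WTGP"], s["DT"], s["GEN"], s["AGEGP"])
        cells[key]["Y"] += s["ST"]
        cells[key]["PT"] += s["T"]
    return cells


# Hypothetical toy records, invented for illustration only.
toy = [
    {"T": 60, "ST": 0, "WTGP": 1, "DT": 1, "GEN": 0, "AGEGP": 1},
    {"T": 24, "ST": 1, "WTGP": 1, "DT": 1, "GEN": 0, "AGEGP": 1},
    {"T": 36, "ST": 1, "WTGP": 4, "DT": 3, "GEN": 1, "AGEGP": 3},
]
cells = aggregate(toy)
print(len(cells))           # number of subgroups
print(cells[(1, 1, 0, 1)])  # Y and PT for low weight, high fiber, male, 40-54
```

Each of the resulting cells supplies one Y value and one PT value for Step 3.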
a. Based on the data organization just described, what is the "sample size" to be used for fitting a Poisson regression model to these data?
b. State a Poisson regression model (called model 1) that would model the natural log of the rate of development of stomach cancer as a linear function of the risk factors DT and WTGP, controlling for potential confounding and effect modification by the variables GEN and AGEGP. Consider only two-factor product terms involving exposure variables and control variables.
c. How would one modify the model in part (b) so that both the WTGP variable and the DT variable are treated as ordinal variables on a natural logarithmic scale? (In stating this modified model, called model 2, make sure to explicitly define the "transformed" WTGP and DT variables that would need to be used.)
d. Provide the model statement, including required options, that one would use with SAS's PROC GENMOD (or a program from a different computer package) to fit model 2, described in part (c) above.
e. Based on model 2 defined in part (c), give a formula for the rate ratio that compares a subject who has a low fiber diet and is in the high weight group to a subject who has a high fiber diet and is in the low weight group, controlling for GEN and AGEGP. (Assume nonzero interaction effects.)
f. Provide an expression for a 95% confidence interval for the rate ratio that compares a subject who has a low fiber diet and is in the high weight group to a subject who has a high fiber diet and is in the low weight group, controlling for GEN and AGEGP. (Assume nonzero interaction effects.)
g. Based on model 2, describe how one would carry out an overall test for significant interaction involving deviance statistics. (Make sure to state the null hypothesis, the test statistic, and the d.f. for the test statistic under the null hypothesis.)
h. Is the test described in part (g) for model 2 equivalent to carrying out a goodness of fit test for a no-interaction version of model 2 that does not contain any product terms? Explain briefly.
i. If model 1 is considered instead of model 2, is an overall test for significant interaction equivalent to carrying out a goodness-of-fit test for a no-interaction version of model 1 that does not contain any product terms? Explain briefly.
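
As an aid to parts (b) and (i), the following Python sketch builds the design-matrix row that one reading of model 1 would use for a single subgroup. It assumes dummy (0/1) coding with the referent groups stated above and product terms formed between every exposure dummy and every control dummy; the helper names `dummies` and `model1_row` are hypothetical.

```python
def dummies(value, levels, referent):
    """Return 0/1 indicators for every non-referent level, in level order."""
    return [1 if value == lv else 0 for lv in levels if lv != referent]


def model1_row(wtgp, dt, gen, agegp):
    """Design-matrix row for one subgroup under the assumed model 1 coding."""
    # Exposure variables: 2 dummies for DT, 3 dummies for WTGP.
    exposures = dummies(dt, (1, 2, 3), referent=1) + dummies(wtgp, (1, 2, 3, 4), referent=1)
    # Control variables: GEN (already 0/1) and 2 dummies for AGEGP.
    controls = [gen] + dummies(agegp, (1, 2, 3), referent=1)
    # Two-factor product terms: each exposure dummy times each control dummy.
    products = [e * c for e in exposures for c in controls]
    return [1] + exposures + controls + products  # leading 1 is the intercept


row = model1_row(wtgp=4, dt=3, gen=1, agegp=2)
print(len(row))  # number of regression coefficients under this coding
```

Under this coding the row has 1 + 5 + 3 + 15 entries, which can be compared with the 72 subgroups available for fitting.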
Suppose that the following Poisson ANOVA table resulted from fitting several different Poisson regression models to these data.

*(Ordinal) DT is represented by a single ordinal variable,
(Ordinal) WTGP is represented by a single ordinal variable,
(Nominal) DT is represented by 2 dummy variables,
(Nominal) WTGP is represented by 3 dummy variables.
j. Assuming no interaction of any kind between the risk factors (DT and WTGP) and either AGEGP and/or GEN, use the deviance values (e.g., a, b, c) in the above table to give an expression for the LR statistic that tests whether there is a significant difference between the (joint) effects of the nominal exposure variables, that is, (nominal) DT and (nominal) WTGP, controlling for AGEGP and GEN.
k. What are the degrees of freedom for the LR statistic described in part (j)?
l. Use the deviance scores in the above table to give an expression for the LR statistic for testing whether there is at least one significant interaction effect in model 1 (as defined in part (b)); that is, describe a chunk test for the interaction terms in model 1.
m. What are the degrees of freedom for the LR statistic described in part (l)?
n. Describe how one might carry out a (single) test of hypothesis to determine whether model 1 (as defined in part (b)) or model 2 (as defined in part (c)) fits the data better. In answering this question, state the null hypothesis, the test statistic, and its d.f. under the null hypothesis.
o. Assuming that a Poisson model is appropriate for these data, how could one criticize the significance testing method described in part (n) for comparing model 1 with model 2?
p. In what other way, using deviance statistic information (other than the test of hypothesis described in part (n)), can one evaluate whether model 1 or model 2 is better?

Poisson ANOVA table (columns: Model Number, Variables in Model, Number of Parameters, Deviance)
1. None (i.e., constant term only)
2. (ordinal) DT only
3. (ordinal) WTGP only
4. GEN only
5. (nominal) DT only
6. (ordinal) DT and (ordinal) WTGP
7. AGEGP only
8. AGEGP and GEN
9. (nominal) WTGP only
10. (nominal) DT and (nominal) WTGP
11. AGEGP, GEN, (ordinal) DT and (ordinal) WTGP
12. AGEGP, GEN, (nominal) DT and (nominal) WTGP
13. Model 2 (from part (c))
14. Model 1 (from part (b))
15. Saturated
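
Parts (j) through (n) all rest on the same calculation: for two nested Poisson models, the LR statistic is the deviance of the reduced model minus the deviance of the full model, and its d.f. is the difference in the number of parameters. A minimal Python sketch of that arithmetic follows; the function name `lr_from_deviances` is hypothetical, and the numbers passed to it are invented for illustration and are not values from the table.

```python
def lr_from_deviances(dev_reduced, p_reduced, dev_full, p_full):
    """Likelihood ratio statistic and d.f. for two nested Poisson models.

    The reduced model must be a special case of the full model.
    """
    if p_full <= p_reduced:
        raise ValueError("full model must have more parameters than the reduced model")
    statistic = dev_reduced - dev_full  # drop in deviance from adding terms
    df = p_full - p_reduced             # number of added parameters
    return statistic, df


# Hypothetical deviances and parameter counts, invented for illustration;
# they are not values from the ANOVA table.
stat, df = lr_from_deviances(dev_reduced=95.0, p_reduced=4, dev_full=80.5, p_full=9)
print(stat, df)
```

Under the null hypothesis that the added parameters are all zero, the statistic is referred to a chi-square distribution with `df` degrees of freedom. The calculation is valid only when one model is nested within the other.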

Step by Step Solution

a. The sample size is 72. This is determined as the total number of counts being modeled by the Poisso...
