Questions and Answers of
Categorical Data Analysis
For Table 4.3, let Y = 1 if a crab has at least one satellite, and Y = 0 otherwise. Using x = weight, fit the linear probability model. a. Use ordinary least squares. Interpret the parameter
For Table 4.2, refit the linear probability model or the logistic regression model using the scores (a) (0, 2, 4, 6), (b) (0, 1, 2, 3), and (c) (1, 2, 3, 4). Compare β̂ for
For Table 3.7 with scores (0, 0.5, 1.5, 4.0, 7.0) for alcohol consumption. ML fitting of the linear probability model for malformation has output. Interpret the model fit. Use it to estimate the
In the 2000 U.S. presidential election, Palm Beach County in Florida was the focus of unusual voting patterns (including a large number of illegal double votes) apparently caused by a confusing
For games in baseball's National League during nine decades, Table 4.6 shows the percentage of times that the starting pitcher pitched a complete game. a. Treating the number of games as
Refer to Problem 4.6. The sample mean and variance are 5.0 and 4.2 for treatment A and 9.0 and 8.4 for treatment B. a. Is there evidence of overdispersion for the Poisson model having a dummy variable
For Table 4.3, Table 4.7 shows SAS output for a Poisson loglinear model fit using X = weight and Y = number of satellites. a. Estimate E(Y) for female crabs of average weight, 2.44 kg. b. Use
Refer to Problem 4.7. Using the identity link with x = weight, µ̂ = 2.60 + 2.264x, where β̂ = 2.264 has SE = 0.228. Repeat parts (a) through
Refer to Table 4.3. a. Fit a Poisson loglinear model using both W = weight and C = color to predict Y = number of satellites. Assigning dummy variables, treat C as a nominal factor. Interpret
In Section 4.3.2, refer to the Poisson model with identity link. The fit using least squares is µ̂ = –10.42 + 0.51x (SE = 0.11). Explain why the parameter estimates differ and why the SE values
For the negative binomial model fitted to the crab satellite counts with log link and width predictor, µ̂ = –4.05, β̂ = 0.192 (SE = 0.048), k̂–1 = 1.106 (SE = 0.197). Interpret. Why is SE
Table 4.8 shows the free-throw shooting, by game, of Shaq O'Neal of the Los Angeles Lakers during the 2000 NBA (basketball) playoffs. Commentators remarked that his shooting varied
Refer to Problem 4.6. The wafers are also classified by thickness of silicon coating (z = 0, low; z = 1, high). The first five imperfection counts reported for each treatment refer to z = 0 and the
Describe the purpose of the link function of a GLM. What is the identity link? Explain why it is not often used with binomial or Poisson responses.
For binary data, define a GLM using the log link. Show that effects refer to the relative risk. Why do you think this link is not often used?
For the logistic regression model with β > 0, show that (a) as x → ∞, π(x) is monotone increasing, and (b) the curve for π(x) is the cdf of a logistic distribution having mean – α/β and
Let Yi be a bin(ni, πi) variate for group i, i = 1, ..., N, with {Yi} independent. Consider the model that π1 = ... = πN. Denote that common value by π. For observations {yi} show that π̂ =
A binomial GLM πi = Φ(∑j βj xij) with arbitrary inverse link function Φ assumes that niYi has a bin(ni, πi) distribution. Find wi in (4.27) and hence cov͡ (β̂). For logistic regression,
A GLM has parameter β with sufficient statistic S. A goodness-of-fit test statistic T has observed value to. If β were known, a P-value is P = P(T ≥ to; β). Explain why P(T ≥ to | S) is the
Let yij be observation j of a count variable for group i, i = 1, ..., I, j = 1, ..., ni. Suppose that {Yij} are independent Poisson with E(Yij) = µi. a. Show that the ML estimate of µi is µ̂i = y̅i =
Consider the class of binary models (4.8) and (4.9). Suppose that the standard cdf Φ corresponds to a probability density function ϕ that is symmetric around 0. a. Show that x at which π(x) = 0.5
For a study using logistic regression to determine characteristics associated with remission in cancer patients, Table 5.10 shows the most important explanatory variable, a labeling index (U). This
According to the Independent newspaper (London, Mar. 8, 1994), the Metropolitan Police in London reported 30,475 people as missing in the year ending March 1993. For those of age 13 or less, 33 of
Table 5.19 refers to a sample of subjects randomly selected for an Italian study on the relation between income and whether one possesses a travel credit card. At each level of annual income in
For the population of subjects having Y = j, X has a N(µj, σ^2) distribution, j = 0, 1. Using Bayes' theorem, show that P(Y = 1|x) satisfies the logistic regression model with β = (µ1 – µ0)/σ^2.
For an I × 2 contingency table, consider logit model (5.4).Given (πi > 0), show how to find (βi) satisfying βI = 0.
Let Yi be bin(ni, πi) at xi, and let pi = yi/ni. For binomial GLMs with logit link: a. For pi near πi, show that b. Show that z1(t) in (5.23) is a linearized version of the ith
Using graphs or tables, explain what is meant by no interaction in modeling response Y and explanatory X and Z when: a. All variables are continuous (multiple regression). b. Y and X are continuous, Z
Show that the conditional ML estimate of θ satisfies n211 = E(n11) for distribution (3.18).
A Monte Carlo scheme randomly samples M separate I × J tables having observed margins to approximate Po = P(X2 ≥ X2o) for an exact test. Let P̂ be the sample proportion of the M tables with X2
Consider exact tests of independence, given the marginals, for the I × I table having nii = 1 for i = 1, ..., I, and nij = 0 otherwise. Show that (a) tests that order tables by their probabilities,
Refer to Problem 3.42 and exact tests using X2 with Ha: π1 π2. Explain why the unconditional P-value, evaluated at π = 0.5, is related to
A contingency table for two independent binomial variables has counts (3, 0 / 0, 3) by row. For H0: π1 = π2 and Ha: π1 > π2, show that the P-value equals 1/64 for the exact unconditional test
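The 1/64 figure can be checked numerically. The sketch below assumes the unconditional P-value is the supremum, over the common nuisance probability π, of the null probability of tables at least as extreme as the observed one. Because (3, 0 / 0, 3) is the most extreme table in the direction of Ha, only that table contributes. The function name and the grid resolution are illustrative choices, not part of the exercise.

```python
def prob_observed_table(pi: float, n1: int = 3, n2: int = 3) -> float:
    """Null probability of (n1 successes, 0 failures) in row 1 and
    (0 successes, n2 failures) in row 2 when both rows share probability pi."""
    return pi ** n1 * (1 - pi) ** n2


# Grid search over the nuisance parameter (assumed resolution: 0.001).
grid = [i / 1000 for i in range(1001)]
p_unconditional = max(prob_observed_table(pi) for pi in grid)

print(p_unconditional)  # 0.015625, i.e. 1/64, attained at pi = 0.5
```

The maximum of π^3(1 – π)^3 occurs at π = 0.5, which is why the grid search lands exactly on 1/64.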
When a test statistic has a continuous distribution, the P-value has a null uniform distribution, P(P-value ≤ α) = α for 0 < α < 1. For Fisher’s exact test, explain why under the null,
Assume independence, and let pij = nij/n and π̂ij = pi+ p+j. a. Show that pij and π̂ij are unbiased for πij = πi+ π+j. b.
Use a partitioning argument to explain why G2 for testing independence cannot increase after combining two rows (or two columns) of a contingency table.
For counts {ni}, the power divergence statistic for testing goodness of fit is
For testing independence, show that X2 ≤ n min(I – 1, J – 1). Hence V2 = X2 / [n min(I – 1, J – 1)] falls between 0 and 1 (Cramér 1946). For 2 × 2 tables, X2 / n is often called
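A small numerical check of the bound may help. The sketch below computes Pearson's X2 and V2 = X2 / [n min(I – 1, J – 1)] for two illustrative tables chosen here (they are assumptions, not data from the text): a 3 × 3 table with moderate association and a 2 × 2 diagonal table that attains the upper bound.

```python
def pearson_x2(table):
    """Return (X2, n) for a two-way table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2, n


def cramers_v2(table):
    """V2 = X2 / [n * min(I - 1, J - 1)], which lies between 0 and 1."""
    x2, n = pearson_x2(table)
    n_rows, n_cols = len(table), len(table[0])
    return x2 / (n * min(n_rows - 1, n_cols - 1))


moderate = [[4, 2, 0], [2, 2, 2], [0, 2, 4]]  # illustrative counts
perfect = [[5, 0], [0, 5]]                     # illustrative counts

print(pearson_x2(moderate)[0], cramers_v2(moderate))
print(pearson_x2(perfect)[0], cramers_v2(perfect))
```

For the diagonal table X2 equals n, so V2 reaches 1, the largest value the bound allows.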
Refer to Table 3.10. a. Using X2 and G2, test the hypothesis of independence between party identification and race. Report the P-values and interpret. b. Partition chi-squared into components
Refer to Table 3.10. In the same survey, gender was cross-classified with party identification. Table 3.11 shows some results. Explain how to interpret all the results on this printout. Table 3.10:
In a study of the relationship between stage of breast cancer at diagnosis (local or advanced) and a woman’s living arrangement, of 144 women living alone, 41.0% had an advanced case; of 209 living
Refer to Table 2.1. Partition G2 for testing whether the incidence of heart attacks is independent of aspirin intake into two components. Interpret. Table 2.1: Cross-Classification of Aspirin Use and
Project Blue Book: Analysis of Reports of Unidentified Aerial Objects was published by the U.S. Air Force (Air Technical Intelligence Center at Wright-Patterson Air Force Base) in May 1955 to
Table 3.12 classifies a sample of psychiatric patients by their diagnosis and by whether their treatment prescribed drugs. Partition chi-squared into three components to describe differences and
Refer to Table 7.8. For the combined data for the two genders, yielding a single 4 × 4 table, X2 = 11.5 (P = 0.24), whereas using row scores (3, 10, 20, 35) and column scores (1, 3, 4,
A study on educational aspirations of high school students (S. Crysdale, Internat. J. compar. Sociol. 16: 19–36, 1975) measured aspirations with the scale (some high school, high school graduate,
Table 3.13 shows the results of a retrospective study comparing radiation therapy with surgery in treating cancer of the larynx. The response indicates whether the cancer was controlled for at least
A study considered the effect of prednisolone on severe hypercalcaemia in women with metastatic breast cancer (B. Kristensen et al., J. Intern. Med. 232: 237–245, 1992). Of 30 patients, 15 were
Consider a 3 × 3 table having entries, by row, of (4, 2, 0 / 2, 2, 2 / 0, 2, 4). Conduct an exact test of independence, using X2. Assuming ordered rows and columns and using equally spaced scores,
An advertisement by Schering Corp. in 1999 for the allergy drug Claritin mentioned that in a pediatric randomized clinical trial, symptoms of nervousness were shown by 4 of 188 patients on loratadine
Is θ̂ the midpoint of large- and small-sample confidence intervals for θ? Why or why not?
For comparing two binomial samples, show that the standard error (3.1) of a log odds ratio increases as the absolute difference of proportions of successes and failures for a given sample increases.
Using the delta method, show that the Wald confidence interval for the logit of a binomial parameter π is log[π̂/(1 – π̂)] ± z_{α/2}/√(nπ̂(1 – π̂)). Explain how to use this interval to
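The interval is easy to evaluate numerically. The sketch below computes the logit-scale Wald interval and then maps its endpoints back to the probability scale with the inverse logit. That back-transformation is an assumption about where the truncated question is heading, and the counts y = 30, n = 100 are hypothetical.

```python
import math
from statistics import NormalDist


def logit_wald_interval(y: int, n: int, conf: float = 0.95):
    """Wald interval for logit(pi), plus its image on the probability scale.

    Requires 0 < y < n, since the logit is undefined at 0 and 1.
    """
    p_hat = y / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    center = math.log(p_hat / (1 - p_hat))
    half_width = z / math.sqrt(n * p_hat * (1 - p_hat))
    lower, upper = center - half_width, center + half_width

    def expit(t: float) -> float:
        return 1 / (1 + math.exp(-t))

    return (lower, upper), (expit(lower), expit(upper))


logit_ci, prob_ci = logit_wald_interval(y=30, n=100)  # hypothetical data
print(logit_ci)
print(prob_ci)
```

Because the inverse logit is monotone, the transformed endpoints always stay inside (0, 1), unlike the ordinary Wald interval for π.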
For multinomial sampling, use the asymptotic variance of log θ̂ to show that for Yule's Q the asymptotic variance of
An I × J table has ordered columns and unordered rows. Ridits (Bross 1958) are data-based column scores. The jth sample ridit is the average cumulative proportion within category j. The
Show that X2 = n∑∑(pij – pi+ p+j)^2/(pi+ p+j). Thus, X2 can be large when n is large, regardless of whether the association is practically important. Explain why this test, like other tests,
For a 2 × 2 table, consider H0: π11 = θ^2, π12 = π21 = θ(1 – θ), π22 = (1 – θ)^2. a. Show that the marginal distributions are identical and that independence holds. b. For a multinomial
The measure of association lambda for nominal variables (Goodman and Kruskal 1954) has V(Y) = 1 – max{π+j} and V(Y|i) = 1 – maxj{πj|i}. Interpret lambda as a proportional reduction in
When X and Y are ordinal with counts {nij}: a. Explain why the n(n – 1)/2 pairs of observations partition into C + D + TX + TY – TXY, where TX = ∑ ni+(ni+ – 1)/2 pairs
For 2 × 2 tables, Yule (1900, 1912) introduced a measure of association, which he labeled Q in honor of the Belgian statistician Quetelet. It is now called Yule's Q. a. Show that for 2 × 2
Suppose that {Yij} are independent Poisson variates with means {µij}. Show that P(Yij = nij) for all i, j, conditional on {Yi+ = ni}, satisfy independent multinomial sampling [i.e., the product of
When X and Y are conditionally dependent at each level of Z yet marginally independent, Z is called a suppressor variable. Specify joint probabilities for a 2 × 2 × 2 table to show that this can
Smith and Jones are baseball players. Smith has a higher batting average than Jones in each of K years. Is it possible that for the combined data from the K years, Jones has the higher batting
For a 2 × 2 × 2 table, show that homogeneous association is a symmetric property, by showing that equal XY conditional odds ratios is equivalent to equal YZ conditional odds ratios.
Let πij|k = P(X = i, Y = j|Z = k). Explain why XY conditional independence is πij|k = πi+|k π+j|k for all i and j and k.
Explain why for three events E1, E2 and E3 and their complements, it is possible that P(E1 | E2) > P(E1 | E̅2) even if both P(E1 | E2 E3) < P(E1 | E̅2 E3) and P(E1 |E2 E̅3) < P(E1
For given π1 and π2 show that the relative risk cannot be farther than the odds ratio from their independence value of 1.0.
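The claim can be explored numerically before proving it. The sketch below uses the identity odds ratio = relative risk × (1 – π2)/(1 – π1) and checks on a grid of probabilities that the relative risk always lies between 1 and the odds ratio. The grid itself is an arbitrary illustrative choice.

```python
import math


def relative_risk(p1: float, p2: float) -> float:
    return p1 / p2


def odds_ratio(p1: float, p2: float) -> float:
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


def rr_no_farther_than_or(p1: float, p2: float) -> bool:
    """True when |log RR| <= |log OR|, i.e. RR is at least as close to 1."""
    log_rr = math.log(relative_risk(p1, p2))
    log_or = math.log(odds_ratio(p1, p2))
    same_side = log_rr * log_or >= 0
    return same_side and abs(log_rr) <= abs(log_or) + 1e-12


grid = [k / 20 for k in range(1, 20)]  # 0.05, 0.10, ..., 0.95
print(all(rr_no_farther_than_or(p1, p2) for p1 in grid for p2 in grid))
```

When π1 > π2 the factor (1 – π2)/(1 – π1) exceeds 1, which pushes the odds ratio farther above 1 than the relative risk; the reverse holds when π1 < π2.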
Let D denote having a certain disease and E denote having exposure to a certain risk factor. The attributable risk (AR) is the proportion of disease cases attributable to that exposure. a. Let P(E̅)
Table 2.12 summarizes responses of 91 married couples in Arizona to a question about how often sex is fun. Find and interpret a measure of association between wife's response and
A study of the death penalty for cases in Kentucky between 1976 and 1991 (T. Keil and G. Vito, Amer. J. Criminal Justice 20: 17–36, 1995) indicated that the defendant received the death penalty in
At each age level, the death rate is higher in South Carolina than in Maine, but overall, the death rate is higher in Maine. Explain how this could be possible.
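A toy calculation shows how such a reversal can arise when the age distributions differ. All counts below are hypothetical, invented only to illustrate the mechanism; they are not real data for either state.

```python
# Hypothetical (population, deaths) per age group; not real data.
maine = {"young": (1000, 5), "old": (9000, 450)}
south_carolina = {"young": (9000, 54), "old": (1000, 60)}


def rate_by_group(state):
    """Death rate within each age group."""
    return {age: deaths / pop for age, (pop, deaths) in state.items()}


def overall_rate(state):
    """Death rate pooled over age groups."""
    total_pop = sum(pop for pop, _ in state.values())
    total_deaths = sum(deaths for _, deaths in state.values())
    return total_deaths / total_pop


print(rate_by_group(maine), overall_rate(maine))
print(rate_by_group(south_carolina), overall_rate(south_carolina))
```

In this made-up example South Carolina has the higher rate within each age group, yet Maine has the higher overall rate because most of its population falls in the older, higher-risk group.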
Based on 1987 murder rates in the United States, an Associated Press story reported that the probability that a newborn child has of eventually being a murder victim is 0.0263 for nonwhite males,
Table 2.10 refers to applicants to graduate school at the University of California at Berkeley, for fall 1973. It presents admissions decisions by gender of applicant for the six largest graduate
A 20-year cohort study of British male physicians (R. Doll and R. Peto, British Med. J. 2: 1525–1536, 1976) noted that the proportion per year who died from lung cancer was 0.00140 for cigarette
A research study estimated that under a certain condition, the probability that a subject would be referred for heart catheterization was 0.906 for whites and 0.847 for blacks. a. A press release
In an article about crime in the United States, Newsweek (Jan. 10, 1994) quoted FBI statistics for 1992 stating that of blacks slain, 94% were slain by blacks, and of whites slain, 83% were slain by
For adults who sailed on the Titanic on its fateful voyage, the odds ratio between gender (female, male) and survival (yes, no) was 11.4. a. What is wrong with the interpretation. “The
In the United States, the estimated annual probability that a woman over the age of 35 dies of lung cancer equals 0.001304 for current smokers and 0.000121 for nonsmokers (M. Pagano and K. Gauvreau,
A newspaper article preceding the 1994 World Cup semifinal match between Italy and Bulgaria stated that “Italy is favored 10–11 to beat Bulgaria, which is rated at 10–3 to reach the final.”
A study (E. G. Krug et al., Internat. J. Epidemiol. 27: 214–221, 1998) reported that the number of gun-related deaths per 100,000 people in 1994 was 14.24 in the United States, 4.31 in Canada, 2.65
Consider the following two studies reported in the New York Times. a. A British study reported (Dec. 3, 1998) that of smokers who get lung cancer, “women were 1.7 times more vulnerable than men to
Table 2.9 is based on records of accidents in 1988 compiled by the Department of Highway Safety and Motor Vehicles in Florida. Identify the response variable, and find and interpret the difference of
A diagnostic test has sensitivity = specificity = 0.80. Find the odds ratio between true disease status and the diagnostic test result.
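The calculation can be sketched in a few lines. It assumes the usual setup in which sensitivity is P(positive | diseased) and specificity is P(negative | not diseased), so the odds of a positive result are compared between the diseased and non-diseased groups. The function name is illustrative.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Odds of a positive test for diseased subjects divided by
    the odds of a positive test for non-diseased subjects."""
    odds_pos_given_disease = sensitivity / (1 - sensitivity)
    odds_pos_given_healthy = (1 - specificity) / specificity
    return odds_pos_given_disease / odds_pos_given_healthy


result = diagnostic_odds_ratio(0.80, 0.80)
print(round(result, 6))  # 16.0
```

With both values at 0.80 the odds are 4 in one group and 1/4 in the other, so the ratio is 4 × 4 = 16.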
An article in the New York Times (Feb. 17, 1999) about the PSA blood test for detecting prostate cancer stated: ‘The test fails to detect prostate cancer in 1 in 4 men who have the disease
The chi-squared mgf with df = ν is m(t) = (1 – 2t)^(–ν/2), for |t| < ½. Use it to prove the reproductive property of the chi-squared distribution.
For testing H0: πj = πj0, j = 1, ..., c, using sample multinomial proportions {π̂j}, the likelihood-ratio statistic G2 is given by (1.17). Show that G2 ≥ 0,
Refer to quadratic form (1.16). For the zS statistic (1.11), show that zS^2 = X2 for c = 2.
Genotypes AA, Aa, and aa occur with probabilities [θ^2, 2θ(1 – θ), (1 – θ)^2]. A multinomial sample of size n has frequencies (n1, n2, n3) of these three genotypes. a. Form the log likelihood.
For I × J contingency tables, explain why the variables are independent when the (I – 1)(J – 1) differences πj|i – πj|1 = 0, i = 1, ..., I – 1, j = 1, ..., J – 1.
A binomial sample of size n has y = 0 successes. a. Show that the confidence interval for π based on the likelihood function is [0.0, 1 – exp(–z_{α/2}^2/(2n))]. For α = 0.05, use the expansion of an
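The upper limit follows from inverting the likelihood-ratio test: with y = 0 the likelihood is (1 – π)^n, so –2n log(1 – π) ≤ z_{α/2}^2 gives π ≤ 1 – exp(–z_{α/2}^2/(2n)). The sketch below evaluates that limit and compares it with the first-order approximation z_{α/2}^2/(2n). The sample sizes are arbitrary illustrative values.

```python
import math
from statistics import NormalDist


def lr_upper_limit_zero_successes(n: int, alpha: float = 0.05) -> float:
    """Upper endpoint of the likelihood-based interval for pi when y = 0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 1 - math.exp(-z ** 2 / (2 * n))


def first_order_approximation(n: int, alpha: float = 0.05) -> float:
    """Leading term of the exponential expansion: z^2 / (2n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2 / (2 * n)


for n in (10, 100, 1000):  # illustrative sample sizes
    print(n, lr_upper_limit_zero_successes(n), first_order_approximation(n))
```

For α = 0.05 the quantity z_{α/2}^2/2 is about 1.92, so the upper limit is close to 2/n once n is moderately large.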
Consider the 95% binomial score confidence interval for π. When y = 1, show that the lower limit is approximately 0.18/n; in fact, 0 < π < 0.18/n then falls in an interval only when y = 0.
Consider the Wald confidence interval for a binomial parameter π. Since it is degenerate when π̂ = 0 or 1, argue that for 0 < π < 1 the probability the interval covers π cannot exceed [1
For a flip of a coin, let π denote the probability of a head. An experiment tests H0: π = 0.5 against Ha: π ≠ 0.5, using n = 5 independent flips.a. Show that the true null probability of
For a binomial parameter π, show how the inversion process for constructing a confidence interval works with (a) The Wald test, (b) The score test.
A researcher routinely tests using a nominal P(type I error) = 0.05, rejecting H0 if the P-value ≤ 0.05. An exact test using test statistic T has null distribution P(T = 0) = 0.30, P(T = 1) = 0.62,
Inference for Poisson parameters can often be based on connections with binomial and multinomial distributions. Show how to test H0: µ1 = µ2 for two populations based on independent Poisson counts
Assume that y1, y2, ..., yn are independent from a Poisson distribution. a. Obtain the likelihood function. Show that the ML estimator is µ̂ = y̅. b. Construct a large-sample test statistic for H0: µ.
A likelihood-ratio statistic equals t_o. At the ML estimates, show that the data are exp(t_o/2) times more likely under Ha than under H0.
From Section 1.4.2 the midpoint π̃ of the score confidence interval for π is the sample proportion for an adjusted data set that adds z_{α/2}^2/2 observations of each
For a statistic T with cdf F(t) and p(t) = P(T = t), the mid-distribution function is Fmid(t) = F(t) – 0.5 p(t) (Parzen 1997). Given T = t0, show that the mid-P-value equals 1 – Fmid(t0). (It also
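The relation between the mid-P-value and the mid-distribution function can be checked on a discrete example. The sketch below takes the right-tail mid-P-value to be P(T > t0) + 0.5 P(T = t0), which is an assumption about the intended definition, and uses a binomial statistic with illustrative values n = 10 and π = 0.5.

```python
from math import comb


def binom_pmf(t: int, n: int, pi: float) -> float:
    return comb(n, t) * pi ** t * (1 - pi) ** (n - t)


def f_mid(t0: int, n: int, pi: float) -> float:
    """Mid-distribution function: F(t0) - 0.5 * p(t0)."""
    cdf = sum(binom_pmf(t, n, pi) for t in range(t0 + 1))
    return cdf - 0.5 * binom_pmf(t0, n, pi)


def mid_p_value(t0: int, n: int, pi: float) -> float:
    """Right-tail mid-P-value: P(T > t0) + 0.5 * P(T = t0)."""
    upper_tail = sum(binom_pmf(t, n, pi) for t in range(t0 + 1, n + 1))
    return upper_tail + 0.5 * binom_pmf(t0, n, pi)


n, pi = 10, 0.5  # illustrative null distribution
for t0 in range(n + 1):
    print(t0, mid_p_value(t0, n, pi), 1 - f_mid(t0, n, pi))
```

The two printed columns agree for every t0, since P(T > t0) = 1 – F(t0) and adding half of p(t0) gives 1 – [F(t0) – 0.5 p(t0)].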