Question:
LIST OF VARIABLES:

| Variable name | Variable description |
| --- | --- |
| ID | Identification Code |
| LOW | Low Birth Weight (0 = Birth Weight >= 2500g, 1 = Birth Weight < 2500g) |
| AGE | Age of the Mother in Years |
| LWT | Weight in Pounds at the Last Menstrual Period |
| RACE | Race (1 = White, 2 = Black, 3 = Other) |
| SMOKE | Smoking Status During Pregnancy (1 = Yes, 0 = No) |
| PTL | History of Premature Labor (0 = None, 1 = One, etc.) |
| HT | History of Hypertension (1 = Yes, 0 = No) |
| UI | Presence of Uterine Irritability (1 = Yes, 0 = No) |
| FTV | Number of Physician Visits during the First Trimester (0 = None, 1 = One, 2 = Two, etc.) |
| BWT | Birth Weight in Grams |
1) Examine the cross-tabulation of RACE by LOW. Along with the default output, generate the chi-square test of association. Complete this in SAS or R. Then answer the following questions:
-Among the patients in the RACE = 1 (White) category, what percentage had a low birth weight baby? (1 point)
-Do you see a statistically significant (at the 0.05 level) association between RACE and LOW? (1 point)
-What is the highest expected frequency? What does that mean? (0.5 point)
-What does the odds ratio compare, and what does it say about the difference in odds between patients in the RACE = White category and the RACE = Other category? (0.5 point)
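
The quantities asked about in item 1 (row percentage, expected frequencies, chi-square statistic, odds ratio) can be checked by hand. The sketch below is a minimal illustration in Python, not the SAS or R code the assignment asks for. The counts are made up and are not the real data set, and all function names are hypothetical.

```python
import math

# HYPOTHETICAL counts, NOT the real data: race -> (count with LOW = 0, count with LOW = 1)
counts = {"White": (60, 20), "Black": (20, 10), "Other": (40, 30)}

def row_percent_low(counts, race):
    """Percentage of one race group that had a low birth weight baby."""
    normal, low = counts[race]
    return 100.0 * low / (normal + low)

def expected_frequencies(counts):
    """Expected cell count under independence: row total * column total / n."""
    n = sum(normal + low for normal, low in counts.values())
    col_totals = [sum(row[j] for row in counts.values()) for j in range(2)]
    return {
        race: tuple(sum(row) * col_totals[j] / n for j in range(2))
        for race, row in counts.items()
    }

def chi_square(counts):
    """Pearson chi-square statistic and its degrees of freedom."""
    expected = expected_frequencies(counts)
    stat = sum(
        (counts[race][j] - expected[race][j]) ** 2 / expected[race][j]
        for race in counts
        for j in range(2)
    )
    df = (len(counts) - 1) * (2 - 1)
    return stat, df

def p_value_df2(stat):
    """Upper-tail p-value; this closed form is valid ONLY for df = 2."""
    return math.exp(-stat / 2.0)

def odds_ratio(counts, group, reference):
    """Odds of LOW = 1 in `group` divided by odds of LOW = 1 in `reference`."""
    def odds(race):
        normal, low = counts[race]
        return low / normal
    return odds(group) / odds(reference)

if __name__ == "__main__":
    stat, df = chi_square(counts)
    print("Percent low among White:", row_percent_low(counts, "White"))
    print("Expected frequencies:", expected_frequencies(counts))
    print("Chi-square:", stat, "df:", df, "p:", p_value_df2(stat))
    print("Odds ratio, Other vs White:", odds_ratio(counts, "Other", "White"))
```

An odds ratio above 1 for Other versus White would mean higher odds of a low birth weight baby in the Other group. The largest expected frequency is the cell count that independence of RACE and LOW would predict for the most populated row and column combination.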
2) Fit a simple logistic regression model with LOW as the outcome variable and LWT as the predictor variable. Request an odds ratio plot. Complete this in SAS. Then answer the following questions:
-Do you reject or fail to reject the global null hypothesis that all regression coefficients of the model are 0? (0.5 points)
-Write the logistic regression equation. (0.5 points)
-Interpret the odds ratio for LWT. (0.5 points)
-What percent of variance is explained by the model? (0.5 points)
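
For reference, the equation requested in item 2 has the general form below. This is only the standard form of a simple logistic regression; the symbols $\beta_0$ and $\beta_1$ are placeholders for the intercept and LWT estimates that the fitted model reports, and no numeric values are assumed here.

```latex
\operatorname{logit}(p) = \ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 \,\mathrm{LWT},
\qquad p = P(\mathrm{LOW} = 1 \mid \mathrm{LWT})

\mathrm{OR}_{\mathrm{LWT}} = e^{\beta_1}
```

Under this form, $e^{\beta_1}$ is the multiplicative change in the odds of LOW = 1 for each one-pound increase in LWT.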
3) Fit a logistic regression model with LOW as the outcome variable and AGE, LWT, RACE, and SMOKE as the predictor variables. Request an ROC curve, a confusion matrix, and an odds ratio plot. Complete this in R. Then answer the following questions:
-Do you reject or fail to reject the null hypothesis that all regression coefficients of the model are 0? (0.5 points)
-If you do reject the global null hypothesis, which predictors significantly predict the outcome (LOW)? (0.5 points)
-Interpret the odds ratio for all significant predictors. (1 point)
-Comment on which probability cut-off you would select, based on the numbers in the ROC table. (1 point)
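
One common way to read a cut-off from an ROC table is to pick the threshold that maximizes Youden's J (sensitivity + specificity - 1). The sketch below illustrates that idea in Python with made-up labels and predicted probabilities. It is not the R code the assignment asks for, the function names are hypothetical, and Youden's J is an assumed criterion, not necessarily the one the course expects.

```python
# HYPOTHETICAL data, NOT from the real model: observed LOW values and predicted probabilities
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.75, 0.90]

def confusion_matrix(labels, probs, cutoff):
    """Classify as LOW = 1 when the predicted probability is >= cutoff."""
    pairs = list(zip(labels, probs))
    return {
        "tp": sum(1 for y, p in pairs if y == 1 and p >= cutoff),
        "fp": sum(1 for y, p in pairs if y == 0 and p >= cutoff),
        "tn": sum(1 for y, p in pairs if y == 0 and p < cutoff),
        "fn": sum(1 for y, p in pairs if y == 1 and p < cutoff),
    }

def roc_table(labels, probs):
    """One row per candidate cut-off with sensitivity, specificity, and Youden's J."""
    rows = []
    for cutoff in sorted(set(probs)):
        cm = confusion_matrix(labels, probs, cutoff)
        sensitivity = cm["tp"] / (cm["tp"] + cm["fn"])
        specificity = cm["tn"] / (cm["tn"] + cm["fp"])
        rows.append({
            "cutoff": cutoff,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "youden_j": sensitivity + specificity - 1.0,
        })
    return rows

def best_cutoff(labels, probs):
    """Row of the ROC table with the largest Youden's J (first one wins on ties)."""
    return max(roc_table(labels, probs), key=lambda row: row["youden_j"])

if __name__ == "__main__":
    for row in roc_table(labels, probs):
        print(row)
    print("Selected:", best_cutoff(labels, probs))
```

If missing a low birth weight case is costlier than a false alarm, a lower cut-off that favors sensitivity may be preferable to the one that maximizes J.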
4) Run a stepwise selection model with LOW as the outcome variable and AGE, LWT, RACE, and SMOKE as candidate predictors (use 0.05 as the cut-off for both entry into and removal from the model). Complete this in SAS. Then answer the following questions:
-Which predictor variables are significant in this model? (1 point)
-Has the model fit improved, remained the same, or degraded compared to the model in Question 3? Answer this using the following statistics: AIC and BIC. Comment and explain your reasoning.
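
The AIC and BIC comparison in item 4 rests on two standard formulas, sketched below in Python. The log-likelihoods, parameter counts, and sample size in the usage example are made-up placeholders, not results from the real models, and the function names are hypothetical. SAS labels BIC as SC (Schwarz criterion) in its model fit statistics.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian (Schwarz) information criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

if __name__ == "__main__":
    # HYPOTHETICAL values: (log-likelihood, number of parameters including the intercept)
    full_model = (-110.0, 5)
    stepwise_model = (-111.5, 3)
    n_obs = 200  # placeholder sample size
    for name, (loglik, k) in [("full", full_model), ("stepwise", stepwise_model)]:
        print(name, "AIC:", aic(loglik, k), "BIC:", bic(loglik, k, n_obs))
```

Both criteria penalize extra parameters, BIC more heavily for any realistic sample size, so a smaller model can score better even when its log-likelihood is slightly lower. The comparison is meaningful only when both models are fit to the same observations.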
