Question: Consider the 955 counties in the US that have a population over 50,000 people.
Model how each county voted in the 2016 presidential election.
The response variable is how the county voted (variable Vote_Ordered), which takes 4 values:
Value  Label       Vote Percent for Democrat
1      Rep Strong  Less than 40%
2      Rep Weak    40% - 50%
3      Dem Weak    50% - 60%
4      Dem Strong  More than 60%
The two explanatory variables we will use are:
Region of the US (z): (MW/NE/S/W, 4 levels)
College (x): Percentage of people that attended at least some college; 5-number summary (min, Q1, median, Q3, max) = {26, 52, 59, 65, 86}
1. Since the data are ordinal, we can use cumulative logistic regression to estimate the probabilities of the counties falling into one of the 4 voting types: RS, RW, DW, DS.
a. Write out the cumulative logit links that our model will predict in terms of the $\pi_j$'s. No linear predictors required.
b. The output below shows the estimates for a model that assumes proportional odds for changes in the predictors and includes the interaction terms between region and education. Interpret the estimate of "College" in the output in terms of how a county votes for president. Keep in mind that this is a model with an interaction!
Call:
polr(formula = Vote_Type ~ Region * College, data = counties2, Hess = T)

Coefficients:
                    Value  Std. Error  t value
RegionW            4.1347      1.4631    2.826
RegionMW           1.3162      1.3935    0.945
RegionS            1.2543      1.3005    0.964
College            0.1263      0.0179    7.040
RegionW:College   -0.0702      0.0237   -2.968
RegionMW:College  -0.0381      0.0229   -1.665
RegionS:College   -0.0429      0.0217   -1.977

Intercepts:
                      Value  Std. Error  t value
Rep_strong|Rep_weak   6.692       1.072    6.240
Rep_weak|Dem_weak     7.806       1.081    7.220
Dem_weak|Dem_Strong   8.890       1.091    8.148

Residual Deviance: 2088.544
AIC: 2108.544
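One way to work with the fitted output above is to code the estimates directly. The sketch below is not part of the original question. It assumes that NE is the baseline region (it has no coefficient of its own) and that the model uses the `polr` parameterization logit P(Y <= j) = cutpoint_j - eta, where eta is the linear predictor. All function and variable names are hypothetical.

```python
import math

# Estimates copied from the polr output. NE is ASSUMED to be the baseline
# region because it has no coefficient of its own.
REGION_MAIN = {"NE": 0.0, "W": 4.1347, "MW": 1.3162, "S": 1.2543}
COLLEGE_MAIN = 0.1263
REGION_BY_COLLEGE = {"NE": 0.0, "W": -0.0702, "MW": -0.0381, "S": -0.0429}
# Cutpoints: Rep_strong|Rep_weak, Rep_weak|Dem_weak, Dem_weak|Dem_Strong
CUTPOINTS = [6.692, 7.806, 8.890]


def college_slope(region):
    """Log-odds change per 1-point increase in College within a region."""
    return COLLEGE_MAIN + REGION_BY_COLLEGE[region]


def linear_predictor(region, college):
    """eta = region main effect + region-specific slope * College."""
    return REGION_MAIN[region] + college_slope(region) * college


def category_probs(region, college):
    """Probabilities of the 4 ordered categories, assuming the polr form
    logit P(Y <= j) = cutpoint_j - eta."""
    eta = linear_predictor(region, college)
    cum = [1.0 / (1.0 + math.exp(-(c - eta))) for c in CUTPOINTS] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


if __name__ == "__main__":
    # Region-specific College slopes and the matching odds ratios.
    for region in ("NE", "MW", "S", "W"):
        slope = college_slope(region)
        print(region, round(slope, 4), round(math.exp(slope), 3))
    # Odds ratio (toward the more Democratic side) for NE vs S at College = 50.
    print(math.exp(linear_predictor("NE", 50) - linear_predictor("S", 50)))
    # Category probabilities for a West county with College = 58.
    print(category_probs("W", 58))
```

Because of the interaction, the "College" estimate of 0.1263 applies only to the baseline region under this reading; every other region gets its own slope by adding its interaction term.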
c. Repeat part b), but compare the odds for a county in the NE vs. a county in the S when the college percentage is 50% for both counties.
d. Calculate the probability that a county in the West is in the "Dem Weak" category if it has a college percentage of 58%.
e. The deviance for the model without interaction terms is 2097.67. Calculate the test statistic, df, and p-value, and state your conclusion about whether the interaction terms should be kept in the model.
f. An additive model (no interaction) was fit with non-parallel slopes and has a deviance of 2066.6. Is there evidence that the proportional odds assumption is not true for the data? Calculate the test statistic, df, p-value, and state the conclusion.
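The two deviance comparisons in parts e and f are both likelihood-ratio tests, which can be sketched as below. This is an illustration, not the course's official solution. The degrees of freedom are assumptions: 3 for part e (the three Region:College terms) and 8 for part f (the additive model has 4 slopes, and non-parallel slopes would give each of the 3 cumulative logits its own set, so 12 - 4 = 8). The helper `chi2_sf` is a hypothetical stdlib-only upper-tail function for integer df.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df,
    built from the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # Q(1, z) = exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # Q(1/2, z) = erfc(sqrt(z))
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lr_test(dev_reduced, dev_full, df):
    """Likelihood-ratio statistic (difference in deviances) and its p-value."""
    stat = dev_reduced - dev_full
    return stat, chi2_sf(stat, df)


# Part e: additive model vs. interaction model (df = 3 is an assumption).
stat_e, p_e = lr_test(2097.67, 2088.544, df=3)

# Part f: proportional-odds additive model vs. non-parallel additive model
# (df = 8 is an assumption, see the lead-in).
stat_f, p_f = lr_test(2097.67, 2066.6, df=8)

if __name__ == "__main__":
    print("part e:", round(stat_e, 3), p_e)
    print("part f:", round(stat_f, 2), p_f)
```

A small p-value in part e favors keeping the interaction terms; a small p-value in part f is evidence against the proportional odds assumption.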
3. For any cumulative logistic regression model with proportional odds, the intercepts satisfy $\alpha_1 < \alpha_2 < \cdots < \alpha_{J-1}$. Why?
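A sketch of the argument, with some notation assumed: $\alpha_j$ denotes the $j$-th intercept (cutpoint), $\eta(x)$ the linear predictor shared by all cumulative logits under proportional odds, and the sign convention $\alpha_j - \eta(x)$ follows the `polr` form. The ordering comes from the cumulative probabilities being nondecreasing in $j$ and the logit being an increasing function.

```latex
\begin{align*}
P(Y \le j \mid x) &\le P(Y \le j+1 \mid x), \qquad j = 1, \dots, J-2 \\
\operatorname{logit} P(Y \le j \mid x) &= \alpha_j - \eta(x), \qquad j = 1, \dots, J-1 \\
\alpha_j - \eta(x) &\le \alpha_{j+1} - \eta(x)
  \;\Longrightarrow\; \alpha_j \le \alpha_{j+1}
\end{align*}
```

The inequality is strict whenever category $j+1$ has positive probability, because then $P(Y \le j+1 \mid x) > P(Y \le j \mid x)$. The same $\eta(x)$ cancels on both sides only because the slopes are common to all cumulative logits.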
