Question:

This homework exercise refers to the same CDC dataset as Lab 2. In the lab, we explored the relationship between the categorical variables `genhlth`, a self-reported measure of general health, and `smoke100`, an indicator for whether an individual has smoked at least 100 cigarettes in their lifetime. In this assignment, we will explore the relationship between `genhlth` and `exerany`, where `exerany` is an indicator for whether or not an individual has exercised in the past month.

To start, load the necessary packages, load the dataset, and tell R to order the five levels of `genhlth` (as in the lab). Remember to include all of this code in your RMarkdown document.

```{r, message=FALSE, warning=FALSE}
library(tidyverse)
library(openintro)
cdc <- read.table("http://www.stat.uchicago.edu/~yibi/s220/labs/data/cdc.dat", header=TRUE)
cdc <- cdc %>%
  mutate(genhlth = ordered(genhlth, levels=c("poor", "fair", "good", "very good", "excellent")))
```

a) Using `mosaic::tally()`, give a two-way contingency table between `exerany` and `genhlth`. You may add margins if you like.

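```{r, include=TRUE}
# install.packages("mosaic")  # run once in the console, not in the knitted document
# Rows are exerany, columns are genhlth; margins = TRUE adds the optional totals
mosaic::tally(~ exerany + genhlth, data = cdc, margins = TRUE)
```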

b) What proportion of the sample has exercised in the past month? What proportion of the sample reports being in excellent health? *You may use a specific function from lab to compute these proportions, or you may extract the appropriate numbers from your table in part (a) and use R as a calculator. In either case, include your code and explain how you arrived at your answer.*

```{r}
### Include your R code here
```

**Written answer here**
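One way to approach part (b) can be sketched as follows. This is a minimal illustration on a made-up data frame called `toy` (a hypothetical name with invented values, not the CDC data), assuming `exerany` is coded 0/1 as in the lab. The same pattern would apply to `cdc`.

```r
# Hypothetical toy data for illustration only; these are NOT the CDC values.
toy <- data.frame(
  exerany = c(1, 1, 1, 0, 0, 1, 0, 1, 1, 0),
  genhlth = c("excellent", "good", "excellent", "fair", "good",
              "very good", "poor", "excellent", "good", "good")
)

# The mean of a logical vector is the proportion of TRUE values.
p_exer <- mean(toy$exerany == 1)
p_excellent <- mean(toy$genhlth == "excellent")

# Equivalent route: marginal proportions from one-way tables.
prop.table(table(toy$exerany))
prop.table(table(toy$genhlth))

p_exer
p_excellent
```

Either route divides a count by the total sample size, which is the same arithmetic as reading the margins of the two-way table and dividing by hand.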

c) Among those who exercised in the past month, what proportion report being in excellent health? What about among those who did not exercise in the past month? *There are a number of ways to arrive at these answers in R. The most direct way is to call `prop.table()` on the two-way table with `margin = 1` to get row proportions, but you may also use data subsetting, or extract the appropriate numbers from the contingency table and use R as a calculator. Any method is fine, but please show your work.*

```{r}
### Include your R code here
```

**Written answer here**
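The row-proportion approach for part (c) might look like the sketch below. It again uses an invented data frame `toy` (hypothetical values, not the CDC data) and assumes the table has `exerany` in the rows, so that `margin = 1` conditions on exercise status.

```r
# Hypothetical toy data for illustration only; these are NOT the CDC values.
toy <- data.frame(
  exerany = c(1, 1, 1, 0, 0, 1, 0, 1, 1, 0),
  genhlth = c("excellent", "good", "excellent", "fair", "good",
              "very good", "poor", "excellent", "good", "good")
)

# Two-way table with exerany in the rows and genhlth in the columns.
tab <- table(exerany = toy$exerany, genhlth = toy$genhlth)

# margin = 1 divides each cell by its row total, so every row sums to 1.
row_props <- prop.table(tab, margin = 1)
row_props

# Conditional proportions of "excellent" within each exercise group.
row_props["1", "excellent"]
row_props["0", "excellent"]
```

Because each row is divided by its own total, the two printed values have different denominators: the number of exercisers and the number of non-exercisers, respectively.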

d) Make a stacked bar chart that represents the two-way table in part (a). Put the `genhlth` variable on the x-axis and color the bar stacks using `exerany`.

```{r}
### Include your R code here
```
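A stacked bar chart of the kind described in part (d) could be sketched with `ggplot2` as below. The data frame `toy` and its values are invented for illustration (not the CDC data), and wrapping `exerany` in `factor()` is an assumption made so that a 0/1 column is treated as two discrete fill groups.

```r
library(ggplot2)

# Hypothetical toy data for illustration only; these are NOT the CDC values.
toy <- data.frame(
  exerany = c(1, 1, 1, 0, 0, 1, 0, 1, 1, 0),
  genhlth = c("excellent", "good", "excellent", "fair", "good",
              "very good", "poor", "excellent", "good", "good")
)

# Order the health levels so the x-axis runs from poor to excellent.
toy$genhlth <- ordered(toy$genhlth,
                       levels = c("poor", "fair", "good", "very good", "excellent"))

# geom_bar() counts rows per x value; its default position is "stack".
p <- ggplot(toy, aes(x = genhlth, fill = factor(exerany))) +
  geom_bar() +
  labs(x = "General health", y = "Count", fill = "exerany")
p
```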

e) Make a mosaic plot that compares the self-rated general health between those who had exercised in the past month and those who hadn't. Based on the plots, which group had better self-rated general health?

```{r}
### Include your R code here
```

**Written answer here**
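For part (e), one possible sketch uses base R's `mosaicplot()` on a two-way table. The data frame `toy` is made up for illustration (not the CDC data), and the lab may have used a different plotting function, so treat this as one option rather than the expected method.

```r
# Hypothetical toy data for illustration only; these are NOT the CDC values.
toy <- data.frame(
  exerany = c(1, 1, 1, 0, 0, 1, 0, 1, 1, 0),
  genhlth = c("excellent", "good", "excellent", "fair", "good",
              "very good", "poor", "excellent", "good", "good")
)
toy$genhlth <- ordered(toy$genhlth,
                       levels = c("poor", "fair", "good", "very good", "excellent"))

# exerany is the first dimension, so it is split first (one column per group);
# within each column the tiles show that group's genhlth distribution.
tab <- table(exerany = toy$exerany, genhlth = toy$genhlth)
mosaicplot(tab, main = "General health by exercise status (toy data)",
           color = TRUE)
```

When reading such a plot, the comparison of interest is how the tile heights for each health level differ between the two `exerany` columns.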

f) Are the two variables `exerany` and `genhlth` independent? Justify your answer.

**Written answer here**
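One way to reason about part (f) is to compare each group's conditional distribution of `genhlth` with the overall distribution: if the two variables were independent in the sample, the rows would match. The sketch below uses an invented count matrix `counts` with only three health levels (hypothetical numbers, not the CDC data) to show the comparison.

```r
# Hypothetical counts for illustration only; these are NOT the CDC values.
counts <- matrix(c(30, 20, 10,
                   10, 20, 30),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(exerany = c("0", "1"),
                                 genhlth = c("fair", "good", "excellent")))

# Conditional distribution of genhlth within each exerany group.
cond <- prop.table(counts, margin = 1)

# Overall (marginal) distribution of genhlth, ignoring exerany.
overall <- colSums(counts) / sum(counts)

# Counts expected if the variables were independent:
# (row total * column total) / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

cond
overall
expected
```

In this toy example the two rows of `cond` differ from each other and from `overall`, which is the pattern that would suggest an association; whether the real data show such a pattern is for the written answer to establish.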
