Lab 1: Data Summaries

Background

Categorical Data

A variable is classified as categorical when its outcomes fall into groups that have no inherent order. The typical numerical summaries for categorical data are percentages and modes; the typical graphical summaries are pie charts and bar charts.
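
As a minimal sketch of the numerical summaries named above, the following Python snippet tabulates counts, percentages, and the mode for a categorical variable. The sample is the 'sex' code of observations 1 through 11 in the Elderly Health table later in this lab (1 = male, 2 = female). The helper name `summarize_categorical` is an illustrative assumption, not part of the lab materials.

```python
from collections import Counter

def summarize_categorical(values):
    """Return (counts, percentages, modes) for a list of category labels."""
    counts = Counter(values)
    total = len(values)
    percentages = {label: 100 * n / total for label, n in counts.items()}
    top = max(counts.values())
    # More than one label can tie for most frequent, so modes is a list.
    modes = [label for label, n in counts.items() if n == top]
    return counts, percentages, modes

# 'sex' codes for observations 1-11 of the Elderly Health data (1 = male, 2 = female)
sex = [2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1]
counts, percentages, modes = summarize_categorical(sex)
for label in sorted(counts):
    print(f"sex={label}: n={counts[label]}, {percentages[label]:.1f}%")
print("mode(s):", modes)
```

The same counts are what a bar chart or pie chart would display, so the table and the graph should always agree.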

Numerical Data

A variable is classified as numerical when its outcomes fall within a range of values. The numbers specify an order. Continuous data can take any value in the range (including fractions or decimals). Discrete data must be whole numbers. When summarizing numerical data there are two main options: the mean and standard deviation, or the median and interquartile range (IQR). The mean and standard deviation are the best choice for a symmetric (non-skewed) data set, while the median and IQR are more representative of typical values in a skewed data set. The graphical summaries include histograms and box plots.
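
The two summary options can be sketched in Python with the standard `statistics` module. The sample below is the calcium value of observations 1 through 11 in the Elderly Health table later in this lab. The helper name `summarize_numerical` is an illustrative assumption. Note that statistical packages use slightly different quartile conventions, so the IQR reported by other software may differ a little from this one.

```python
import statistics

def summarize_numerical(values):
    """Return both common numerical summaries for a list of numbers."""
    # Quartiles using the module's default ('exclusive') method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),      # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Calcium (mmol/L) for observations 1-11 of the Elderly Health data
calcium = [2.53, 2.5, 2.43, 2.48, 2.33, 2.13, 2.55, 2.45, 2.25, 2.43, 2.4]
summary = summarize_numerical(calcium)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Comparing the mean with the median is a quick first check: when the two are close, the mean and standard deviation are usually a reasonable summary.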

Lab

Today's lab uses two different data sets, each of which we will explore. For each set you will describe the data with both graphical and numerical summaries and then draw conclusions based on the findings. Both sets are linked below with a brief description. Be sure to use the correct set when answering lab questions.

Elderly Health

A study examined the medical records of elderly patients to determine whether there are differences between men and women in their calcium or inorganic phosphorus blood levels (both in millimoles per liter, mmol/L). Note: for the variable 'sex', male = 1 and female = 2.

Observation Age Sex Lab Calcium InorgPhosph agegroup
1 78 2 4 2.53 1.07 3
2 72 2 4 2.5 1.16 2
3 72 2 4 2.43 1.13 2
4 73 2 4 2.48 0.81 2
5 73 2 4 2.33 1.13 2
6 73 2 4 2.13 0.84 2
7 65 2 4 2.55 1.26 1
8 68 2 4 2.45 1.23 1
9 89 1 4 2.25 0.65 5
10 84 1 4 2.43 0.84 4
11 71 1 4 2.4 1.1 2
12 80 2 5 2.25 1.1 4
13 80 2 5 2.18 1.49 4
14 76 2 5 2.55 1.23 3
15 70 2 2 2.38 1.42 2
16 70 2 5 2.3 1.16 2
17 71 2 4 2.6 1.32 2
18 70 2 5 2.2 1.07 2
19 70 2 5 2.38 1.13 2
20 66 2 5 2.63 1.13 1
21 76 1 5 2.2 0.9 3
22 76 1 5 2.18 0.84 3
23 68 1 5 2.15 0.52 1
24 69 1 5 2.3 1.36 1
25 76 2 3 2.53 1.07 3
26 70 2 3 2 0.97 2
27 71 2 3 2.23 0.94 2
28 70 2 3 2.43 1.42 2
29 74 2 3 2.5 0.87 2
30 74 2 3 2.33 0.94 2
31 74 2 3 2.4 1.16 2
32 70 2 3 2.5 1.23 2
33 69 2 3 2.5 1.07 1
34 82 1 3 2.35 0.9 4
35 84 1 3 2.25 0.94 4
36 70 1 3 2.45 0.84 2
37 68 1 3 2.33 0.87 1
38 84 2 2 2.33 0.94 4
39 76 2 2 2.43 1.29 3
40 76 2 2 2.38 1.03 3
41 75 2 2 2.4 1.18 3
42 75 2 2 2.28 NA 3
43 71 2 2 2.35 1.32 2
44 72 2 2 2.28 1.07 2
45 73 2 2 2.48 1 2
46 72 2 2 2.48 1.42 2
47 70 2 2 2.35 1.03 2
48 74 2 2 2.63 1.23 2
49 71 2 2 2.45 1.26 2
50 73 2 2 2.75 0.9 2
51 72 2 2 2.4 1.19 2
52 71 2 2 2.48 1.13 2
53 69 2 2 2.45 1.19 1
54 65 2 2 2.45 1.13 1
55 69 2 2 2.33 0.97 1
56 68 2 2 2.43 1.1 1
57 69 2 2 2.6 1.29 1
58 67 2 2 2.45 1.19 1
59 88 1 2 2.35 0.87 5
60 77 1 2 2.4 1.26 3
61 76 1 2 2.4 1.39 3
62 76 1 2 2.38 1 3
63 76 1 2 2.33 1 3
64 77 1 2 2.35 1 3
65 78 1 2 2.58 1.1 3
66 78 1 2 2.23 0.87 3
67 75 1 2 2.43 1.23 3
68 75 1 2 2.4 1.19 3
69 70 1 2 2.48 0.87 2
70 74 1 2 2.33 1.23 2
71 71 1 2 2.5 0.97 2
72 70 1 2 2.35 1.13 2
73 70 1 2 2.53 1.42 2
74 73 1 2 2.55 1.13 2
75 67 1 2 2.45 1.26 1
76 69 1 2 2.33 1.16 1
77 68 1 2 2.43 0.97 1
78 67 1 2 2.38 0.97 1
79 82 2 1 2.33 1.61 4
80 79 2 1 2.28 1.23 3
81 78 2 1 2.13 1.19 3
82 75 2 1 2.4 1.26 3
83 79 2 1 2.48 0.97 3
84 75 2 1 2.2 1.23 3
85 78 2 1 NA 1.19 3
86 75 2 1 2.5 1.29 3
87 73 2 1 2.2 0.94 2
88 71 2 1 2.5 1 2
89 73 2 1 2.35 0.94 2
90 74 2 1 2.7 1.19 2
91 71 2 1 2.5 1 2
92 73 2 1 2.2 0.94 2
93 72 2 1 2.45 1.58 2
94 70 2 1 2.25 1.16 2
95 71 2 1 2.5 1.26 2
96 74 2 1 2.38 1.26 2
97 73 2 1 2.3 1.32 2
98 72 2 1 2.2 0.94 2
99 69 2 1 2.65 1.29 1
100 69 2 1 2.33 1.07 1
101 66 2 1 2.3 1.13 1
102 65 2 1 2.53 1.07 1
103 66 2 1 2.3 1.13 1
104 68 2 1 2.4 0.97 1
105 NA 2 1 2.25 1.36 1
106 65 2 1 2.28 1.1 1
107 68 2 1 2.3 1.19 1
108 86 1 1 2.25 0.9 5
109 80 1 1 2.48 0.77 4
110 80 1 1 2.48 1.23 4
111 81 1 1 2.18 0.97 4
112 78 1 1 2.3 1.16 3
113 77 1 1 2.25 1.07 3
114 79 1 1 2.55 1 3
115 79 1 1 2.25 1.23 3
116 75 1 1 2.28 1 3
117 75 1 1 2.48 1.13 3
118 78 1 1 2.18 1.13 3
119 75 1 1 2.1 1.1 3
120 79 1 1 2.15 1.26 3
121 79 1 1 2.28 1.13 3
122 77 1 1 2.18 1.16 3
123 73 1 1 2.38 0.84 2
124 72 1 1 2.2 0.77 2
125 74 1 1 2.25 0.94 2
126 70 1 1 2.35 0.84 2
127 70 1 1 2.45 0.84 2
128 72 1 1 2.3 0.87 2
129 71 1 1 2.33 1 2
130 70 1 1 2.33 1.23 2
131 72 1 1 2.1 1.29 2
132 74 1 1 2.35 0.84 2
133 71 1 1 2.25 0.97 2
134 71 1 1 2.15 1 2
135 70 1 1 2.33 1.07 2
136 71 1 1 2.33 0.87 2
137 72 1 1 2.28 1 2
138 74 1 1 2.38 1.03 2
139 68 1 1 2.18 1.13 1
140 67 1 1 2.35 0.97 1
141 65 1 1 2.35 0.77 1
142 69 1 1 2.23 1.23 1
143 69 1 1 2.2 1.32 1
144 66 1 1 2.33 1.23 1
145 67 1 1 2.25 1.26 1
146 67 1 1 2.35 1.26 1
147 67 1 1 2.5 0.84 1
148 68 1 1 1.9 1.32 1
149 65 1 1 2.05 1.19 1
150 68 1 1 2.4 1.19 1
151 66 1 1 2.3 0.97 1
152 69 1 1 2.23 1.03 1
153 68 1 1 2.3 1.13 1
154 68 1 1 2.43 1.16 1
155 69 1 1 2.25 1.07 1
156 68 1 1 2.4 1.42 1
157 67 1 1 2.25 1.07 1
158 67 1 1 2.28 1.32 1
159 67 1 1 2.18 0.97 1
160 68 1 1 2.5 1.36 1
161 66 1 1 2.4 1.29 1
162 65 1 1 2.28 1.16 1
163 65 1 1 2.1 1.16 1
164 69 1 1 2.3 1 1
165 67 1 1 2.38 0.84 1
166 68 1 1 2.43 1.16 1
167 80 2 6 2.35 1.1 4
168 76 1 6 2.23 1.16 3
169 66 2 6 2.4 1.19 1
170 72 2 NA 2.3 1.36 2
171 73 2 6 2.48 1.36 2
172 72 2 6 2.4 1.19 2
173 74 2 3 2.2 1.19 2
174 72 2 3 2.45 1.07 2
175 71 2 3 2.18 1.1 2
176 67 2 6 2.3 1.26 1
177 77 2 4 2.65 0.97 3
178 70 2 4 2.5 0.97 2

Question 1

Using the Elderly Health Data set:

Make an appropriate graphical summary (you pick one, there are several options) of the 'sex' variable and paste it here (be sure to label it well).

Write a sentence or two about why you chose this graph and what it tells us about the study.

Question 2

Using the Elderly Health Data set:

Make a pie chart of both 'agegroup' and 'age' (upload them here).

Which one is better and why?

For the one that isn't very useful, what would be a better way to display that data?

Question 3

Using the Elderly Health Data set:

Make side-by-side box plots to compare either the calcium or the inorganic phosphorus levels by sex (upload it).

Describe notable features of the graph. (What features do we discuss in boxplots?)

Regarding the research question: Is there a difference between the sexes for the calcium and inorganic phosphorus levels? What would you conclude and why?
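
A side-by-side box plot draws a five-number summary (minimum, first quartile, median, third quartile, maximum) for each group. As a rough sketch, the Python snippet below computes those five numbers for calcium by sex using only observations 1 through 24 of the table above, so it illustrates the method and should not be read as the answer for the full data set. The helper name `five_number_summary` is an illustrative assumption, and the quartiles follow the default convention of Python's `statistics.quantiles`, which may differ slightly from other software.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

# (sex, calcium) pairs for observations 1-24 of the Elderly Health data
rows = [
    (2, 2.53), (2, 2.5), (2, 2.43), (2, 2.48), (2, 2.33), (2, 2.13),
    (2, 2.55), (2, 2.45), (1, 2.25), (1, 2.43), (1, 2.4), (2, 2.25),
    (2, 2.18), (2, 2.55), (2, 2.38), (2, 2.3), (2, 2.6), (2, 2.2),
    (2, 2.38), (2, 2.63), (1, 2.2), (1, 2.18), (1, 2.15), (1, 2.3),
]

# Group the calcium values by sex code.
by_sex = {}
for sex, calcium in rows:
    by_sex.setdefault(sex, []).append(calcium)

labels = {1: "male", 2: "female"}
for sex in sorted(by_sex):
    summary = five_number_summary(by_sex[sex])
    print(labels[sex], [round(v, 3) for v in summary])
```

When comparing the groups, look at where the medians sit, how wide the boxes are (the IQR), and whether the boxes overlap.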

Question 4

Using the Everglades Water Data set:

Pick one of the variables. Make a histogram and calculate summary statistics for it. (upload it here)

Describe the shape of the distribution and explain whether the five-number summary or the mean and standard deviation describes it better, and why.
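
As a minimal sketch of this kind of check, the Python snippet below bins a variable into a text histogram and compares the mean with the median. The Everglades data are not reproduced in this handout, so the values here are HYPOTHETICAL numbers invented for illustration, and the helper name `text_histogram` is likewise an assumption.

```python
import statistics

def text_histogram(values, bin_width):
    """Count values per bin of the given width; return {bin_start: count}."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

# HYPOTHETICAL measurements, invented for illustration only (not Everglades data)
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 20]

# Bin labels assume whole-number data and a bin width of 5.
for start, n in text_histogram(values, bin_width=5).items():
    print(f"{start:>3}-{start + 4:<3} {'#' * n}")

mean, median = statistics.mean(values), statistics.median(values)
print(f"mean={mean:.2f}, median={median}")
# A mean well above the median often indicates a right-skewed distribution.
```

For a skewed shape like this one, the median and IQR usually describe typical values better than the mean and standard deviation.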

Question 5

Using the Everglades Water Data set:

Make a time plot of one of the variables. (Include it here)

Are there any noticeable trends?
