Question: Subset the data set to include only those individuals who lived in an urban area. Cluster the individuals using a combination of numerical and categorical variables. Determine the appropriate number of clusters and write a report to describe the differences between clusters.
| ID | Age | Urban | Mother_Edu | Father_Edu | Siblings | Black | Hispanic | White | Christian | WomenPlace | Male | FamilySize | Self_Esteem | Height | Weight | Outgoing_Kid | Outgoing_Adult | HealthPlan | Income | Marital_Status | Education | WeeksEmployed | NumberSpouses |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 21 | 1 | 8 | 8 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 5 | 65 | ||||||||||
| 2 | 20 | 1 | 5 | 8 | 8 | 0 | 0 | 1 | 1 | 1 | 0 | 5 | 16 | 62 | 120 | 0 | 1 | 1 | 0 | 1 | 12 | 0 | 1 |
| 3 | 18 | 1 | 10 | 12 | 3 | 0 | 0 | 1 | 1 | 0 | 0 | 5 | 20 | 1 | 1 | 1 | 0 | 1 | 12 | 52 | 1 | ||
| 4 | 17 | 1 | 11 | 12 | 3 | 0 | 0 | 1 | 1 | 0 | 0 | 5 | 67 | 110 | 0 | 1 | |||||||
| 5 | 20 | 1 | 12 | 12 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 4 | 23 | 63 | 130 | ||||||||
| 6 | 19 | 1 | 12 | 12 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 4 | 27 | 64 | 200 | 1 | 1 | 1 | 40000 | 1 | 16 | 52 | 1 |
| 7 | 15 | 1 | 12 | 12 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 3 | 26 | 65 | 131 | 0 | 1 | 1 | 25000 | 3 | 12 | 52 | 2 |
| 8 | 21 | 1 | 9 | 6 | 7 | 0 | 0 | 1 | 0 | 0 | 0 | 3 | 23 | 65 | 179 | 1 | 1 | 1 | 27400 | 3 | 13 | 52 | 2 |
| 9 | 16 | 1 | 12 | 10 | 4 | 0 | 0 | 1 | 1 | 0 | 1 | 6 | 26 | 66 | 145 | 1 | 1 | 1 | 52000 | 1 | 14 | 52 | 1 |
| 10 | 19 | 1 | 12 | 12 | 3 | 0 | 0 | 1 | 1 | 0 | 0 | 3 | 19 | 66 | 115 | 0 | 1 | ||||||
| 11 | 20 | 1 | 12 | 12 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 3 | 71 | 155 | 1 | 1 | 1 | 55000 | 0 | 16 | 52 | 0 | |
| 12 | 20 | 1 | 15 | 12 | 3 | 0 | 0 | 1 | 1 | 0 | 0 | 3 | 30 | 66 | 118 | 0 | 1 | ||||||
| 13 | 21 | 1 | 12 | 16 | 2 | 0 | 0 | 1 | 1 | 0 | 1 | 5 | 25 | 71 | 180 | 0 | 1 | 1 | 60000 | 2 | 16 | 52 | 1 |
| 14 | 16 | 1 | 12 | 12 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 5 | 21 | 67 | 135 | 1 | 1 | 1 | 48000 | 2 | 18 | 52 | 1 |
| 15 | 15 | 1 | 12 | 12 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 4 | 23 | 73 | 185 | 1 | 0 | 1 | 0 | 1 | 16 | 0 | 1 |
| 16 | 21 | 1 | 12 | 12 | 3 | 0 | 0 | 1 | 1 | 0 | 0 | 4 | 25 | 63 | 130 | 1 | 1 | 1 | 38000 | 1 | 13 | 52 | 1 |
| 17 | 22 | 1 | 12 | 15 | 2 | 0 | 0 | 1 | 0 | 0 | 1 | 2 | 24 | 69 | 160 | 1 | 1 | 0 | 48000 | 0 | 13 | 52 | 1 |
| 18 | 21 | 1 | 12 | 16 | 2 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 28 | 69 | 155 | 1 | 1 | 1 | 120000 | 3 | 13 | 52 | 2 |
| 19 | 22 | 1 | 10 | 12 | 3 | 0 | 0 | 1 | 1 | 0 | 0 | 2 | 28 | 64 | 120 | 1 | 1 | ||||||
| 20 | 20 | 1 | 12 | 18 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 5 | 21 | 64 | 120 | 0 | 1 | 1 | 52000 | 1 | 17 | 52 | 1 |
| 21 | 18 | 1 | 12 | 18 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 5 | 28 | 62 | 133 | 1 | 1 | 1 | 82000 | 1 | 16 | 52 | 1 |
| 22 | 16 | 1 | 12 | 12 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 4 | 17 | 64 | 110 | 0 | 1 | 1 | 36000 | 1 | 16 | 52 | 1 |
| 23 | 21 | 1 | 12 | 12 | 2 | 0 | 0 | 1 | 1 | 0 | 1 | 5 | 72 | 175 | |||||||||
| 24 | 18 | 1 | 12 | 12 | 2 | 0 | 0 | 1 | 1 | 0 | 1 | 5 | 28 | 71 | 180 | 1 | 1 | ||||||
| 25 | 20 | 1 | 14 | 16 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 3 | 19 | 67 | 125 | 0 | 1 | 1 | 20000 | 1 | 14 | 52 | 1 |
| 26 | 17 | 1 | 16 | 17 | 2 | 0 | 0 | 1 | 1 | 0 | 1 | 4 | 25 | 67 | 136 | 1 | 1 | ||||||
| 27 | 19 | 1 | 14 | 20 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 4 | 23 | 63 | 123 | 1 | 1 | 1 | 13126 | 1 | 16 | 44 | 2 |
| 28 | 15 | 1 | 14 | 20 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 4 | 19 | 65 | 114 | 0 | 1 | 1 | 24000 | 3 | 13 | 52 | 3 |
| 29 | 19 | 1 | 0 | 4 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 4 | 30 | 67 | 146 | 0 | 1 | 1 | 50000 | 1 | 12 | 52 | 1 |
| 30 | 21 | 1 | 6 |
Step by Step Solution
To subset the data set to include only individuals who lived in an urban area, we filter the rows where the Urban column has a value of 1. Then we cluster the individuals using a combination of numerical...
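The answer text is cut off before it names a clustering method, so the following is only a minimal sketch of one possible approach, not the original solution. It assumes Gower distance (range-scaled absolute difference for numerical columns, simple mismatch for categorical ones), an exhaustive k-medoids search that is only practical for a handful of rows, and the average silhouette width to compare k = 2, 3, 4. The choice of columns is also an assumption. Rows with IDs 2 to 18 are copied from complete rows of the table above; the row with ID 99 is hypothetical and exists only so the Urban filter has something to drop.

```python
from collections import Counter
from itertools import combinations

COLUMNS = ("ID", "Urban", "Height", "Weight", "Income", "Male", "Marital_Status")
DATA = [
    (2, 1, 62, 120, 0, 0, 1),
    (6, 1, 64, 200, 40000, 1, 1),
    (7, 1, 65, 131, 25000, 1, 3),
    (8, 1, 65, 179, 27400, 0, 3),
    (9, 1, 66, 145, 52000, 1, 1),
    (13, 1, 71, 180, 60000, 1, 2),
    (14, 1, 67, 135, 48000, 0, 2),
    (15, 1, 73, 185, 0, 1, 1),
    (18, 1, 69, 155, 120000, 1, 3),
    (99, 0, 68, 150, 30000, 0, 1),  # hypothetical non-urban row, not in the table
]
NUMERIC = ("Height", "Weight", "Income")   # assumed numerical variables
CATEGORICAL = ("Male", "Marital_Status")   # assumed categorical variables

records = [dict(zip(COLUMNS, row)) for row in DATA]
urban = [r for r in records if r["Urban"] == 1]  # step 1: keep Urban == 1

def gower_matrix(rows):
    """Pairwise Gower distance: mean of per-feature dissimilarities in [0, 1]."""
    spans = {c: (max(r[c] for r in rows) - min(r[c] for r in rows)) or 1
             for c in NUMERIC}
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            parts = [abs(rows[i][c] - rows[j][c]) / spans[c] for c in NUMERIC]
            parts += [0.0 if rows[i][c] == rows[j][c] else 1.0 for c in CATEGORICAL]
            dist[i][j] = dist[j][i] = sum(parts) / len(parts)
    return dist

def k_medoids(dist, k):
    """Try every set of k medoids and keep the cheapest (tiny data sets only)."""
    n = len(dist)
    medoids = min(combinations(range(n), k),
                  key=lambda ms: sum(min(dist[i][m] for m in ms) for i in range(n)))
    return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

def silhouette(dist, labels):
    """Average silhouette width; points in singleton clusters contribute 0."""
    n = len(dist)
    total = 0.0
    for i in range(n):
        own = [dist[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        if not own:
            continue
        a = sum(own) / len(own)
        b = min(sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
                for c in set(labels) if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n

dist = gower_matrix(urban)
scores = {k: silhouette(dist, k_medoids(dist, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)  # step 2: pick k with the best silhouette
labels = k_medoids(dist, best_k)

# Step 3: a small per-cluster profile to support the written report.
for c in range(best_k):
    members = [r for r, lab in zip(urban, labels) if lab == c]
    means = {col: round(sum(m[col] for m in members) / len(members), 1)
             for col in NUMERIC}
    modes = {col: Counter(m[col] for m in members).most_common(1)[0][0]
             for col in CATEGORICAL}
    print(f"cluster {c}: n={len(members)} means={means} modes={modes}")
```

On the full data set the exhaustive medoid search would be far too slow, and a library implementation of k-medoids, k-prototypes, or hierarchical clustering on the same distance matrix would be the more realistic choice. Rows with missing values would also need to be dropped or imputed first. The per-cluster means and modes printed at the end are the raw material for the report on how the clusters differ.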
