Question:

INFERENCES FOR CATEGORICAL DATA

In this lab assignment, you will use descriptive, graphical, and inferential tools in R (or R Commander) to analyze data on the passengers of the British ocean liner Titanic, which sank in 1912 after colliding with an iceberg. You will display and summarize the related categorical variables and explore the relationships between them with contingency tables. The significance of the bivariate relationships will also be assessed. Tests and confidence intervals for proportions will be used to compare the survival rates in selected passenger groups.
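As a rough sketch of the tools named above, the following R snippet builds a two-way contingency table, tests it for association, and compares two proportions. The counts and the labels GROUP and OUTCOME are made up purely for illustration and are not taken from the Titanic data.

```r
# Made-up counts for two hypothetical groups; NOT from the Titanic data.
counts <- matrix(c(30, 20,
                   20, 80),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(GROUP = c("A", "B"),
                                 OUTCOME = c("yes", "no")))
tab <- as.table(counts)

# Row proportions: the share of "yes" and "no" within each group.
row_props <- prop.table(tab, margin = 1)
print(row_props)

# Pearson chi-squared test of independence between GROUP and OUTCOME.
chi <- chisq.test(tab)
print(chi)

# Two-sample test and confidence interval for the difference
# in the proportion of "yes" between the two groups.
res <- prop.test(x = tab[, "yes"], n = rowSums(tab))
print(res$estimate)
print(res$conf.int)
```

With real data, the table would typically come from table() applied to two columns of the data frame rather than from typed-in counts.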

The Titanic Disaster

On April 15, 1912, during her maiden voyage, the British ocean liner Titanic, the largest ship afloat at the time, sank in the North Atlantic Ocean after colliding with an iceberg, sadly killing the vast majority of the 2,224 passengers and crew. One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew.

In this lab assignment, you will use a dataset that describes the gender, age, passenger class, and survival status of 1,187 of the 1,309 passengers on the Titanic. You may see that some groups of passengers were more likely to have survived than others. The data do not contain information for the 885 crew members, but they do contain actual and estimated ages for about 80% of the passengers. Any passenger under 12 years of age was classified as a "child".

This dataset is based on the Titanic Passenger List, edited by Michael A. Findlay and originally published in Eaton & Haas (1994) Titanic: Triumph and Tragedy, Patrick Stephens Ltd, and expanded with the help of the internet community.

This dataset is available in the Data link located in the Lab 3 tab display in the Labs section on eClass. Please import the data into R. (Hint: Students should use "Tabs" as "Field Separator" to import the data set into R Commander.) The data are not to be printed in your submission. The following is a description of the variables in the data file:
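For those working in plain R rather than R Commander, a tab-separated file can be read with read.delim(), whose default field separator is the tab character. The sketch below uses a few invented rows in place of the real file. The file name in the commented-out call is hypothetical, and the 0/1 coding of SURVIVED is an assumption, since the lab does not state it.

```r
# Invented rows standing in for the real file; "\t" is the tab separator.
# The 0/1 coding of SURVIVED is an assumption, not stated in the lab.
toy_text <- "NAME\tPCLASS\tSURVIVED\tGENDER\tAGE
Passenger A\t1\t1\tfemale\t29
Passenger B\t3\t0\tmale\tNA
Passenger C\t2\t1\tfemale\t0.75"

# read.delim() defaults to header = TRUE and sep = "\t".
titanic <- read.delim(text = toy_text, stringsAsFactors = FALSE)

# With the downloaded file, the call would look like this
# (the file name below is hypothetical):
# titanic <- read.delim("lab3_titanic.txt")

# Check that the columns were split correctly and NA was read as missing.
str(titanic)
```

If str() shows a single column containing whole lines of text, the separator was most likely not set to tab.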

Variable Name: Description of Variable

NAME: full name of the passenger;

PCLASS: passenger class (1 = 1st, 2 = 2nd, 3 = 3rd); used as a proxy for socio-economic status (SES): 1st Upper class, 2nd Middle class, 3rd Lower class;

SURVIVED: survival status of the passenger;

GENDER: gender (female or male);

AGE: age (in years); fractional if age is less than 1 year, NA if not available.
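Because PCLASS is stored as a number but represents a category, it may help to convert it to a factor before tabulating. The sketch below also derives an age group from the under-12 "child" rule given earlier. The rows are invented, the 0/1 coding of SURVIVED is an assumption, and AGE_GROUP is a hypothetical column name, not part of the data file.

```r
# Invented rows; SURVIVED coded 0/1 is an assumption.
titanic <- data.frame(
  PCLASS   = c(1, 3, 2, 3),
  SURVIVED = c(1, 0, 1, 0),
  GENDER   = c("female", "male", "female", "male"),
  AGE      = c(29, NA, 0.75, 11)
)

# PCLASS is numeric in the file but categorical in meaning.
titanic$PCLASS <- factor(titanic$PCLASS, levels = c(1, 2, 3),
                         labels = c("1st", "2nd", "3rd"))
titanic$GENDER <- factor(titanic$GENDER)

# AGE_GROUP is a hypothetical derived column: under 12 is "child",
# otherwise "adult"; a missing AGE stays missing.
titanic$AGE_GROUP <- factor(ifelse(titanic$AGE < 12, "child", "adult"))

# Count each age group, showing missing values as their own category.
table(titanic$AGE_GROUP, useNA = "ifany")
```

Keeping the missing ages as NA, rather than assigning them to a group, avoids silently treating passengers of unknown age as adults.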

Use the data to answer Questions 1 - 5.

1. First discuss the data in the file.

(a) How many cases are there? What is the identifier variable? What is/are the categorical and numerical variable(s) in the data, if any?
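One possible way to inspect a data frame for this kind of question is sketched below. It uses invented rows and an assumed 0/1 coding of SURVIVED. The storage type that R reports is only a starting point, since a variable stored as a number can still be categorical in meaning.

```r
# Invented rows; SURVIVED coded 0/1 is an assumption.
titanic <- data.frame(
  NAME     = c("Passenger A", "Passenger B", "Passenger C"),
  PCLASS   = c(1, 3, 2),
  SURVIVED = c(1, 0, 1),
  GENDER   = c("female", "male", "female"),
  AGE      = c(29, NA, 0.75),
  stringsAsFactors = FALSE
)

# Number of cases (rows) in the data frame.
n_cases <- nrow(titanic)
print(n_cases)

# Storage type of each column as R sees it.
col_types <- sapply(titanic, class)
print(col_types)

# anyDuplicated() returns 0 when every value is distinct, which is
# what one would expect of an identifier variable.
dup_index <- anyDuplicated(titanic$NAME)
print(dup_index)
```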

(b) Is this an observational study or an experiment? Can the results of the study be extended to the population of interest, which is all ships colliding with an iceberg? Are causal inferences possible?
