Question: Assignment 3: Creating variables

In this assignment, you will use a sample of the National Trauma Data Bank. The dataset is in the course directory with the name NTDB.

Up to this point you have just copied and run the statements. In this assignment you will be writing some simple code of your own.

LIBNAME COH611 "/courses/dd4f9595ba27fe300" ACCESS=READONLY;

HINT: You may have to delete the quotation marks and type them back in if you copy and paste this statement into SAS.

Part A: Exploring your data (2 points)

(Review assignment 1 if you don't remember how to write the SAS code to do this. Hint: Make sure you print the contents of the NTDB dataset and not the Survey2 dataset. Check the name!)

1. Print the contents of the dataset.
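As a minimal sketch of this step, the statements below list the variables and attributes of the dataset. They assume the libref COH611 was assigned by the LIBNAME statement above and that the dataset is stored in that library under the name NTDB.

```sas
/* Assumes the libref COH611 is already assigned and that the
   dataset is stored as NTDB in that library. */
PROC CONTENTS DATA = coh611.ntdb;
RUN;
```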

Part B: Data Management (8 points)

We are now going to create a dummy variable to indicate people who are 8 years and younger. You will need to use the age variable to do this, or you can use a WHERE statement.
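One possible way to build such a dummy variable is sketched below. The names ntdb_kids and child are hypothetical, and the sketch assumes age is a numeric variable measured in years. Because SAS treats a missing numeric value as smaller than any number, the sketch checks for missing ages first so they are not counted as 8 and younger.

```sas
DATA ntdb_kids;                 /* hypothetical name for the working copy */
    SET coh611.ntdb;
    /* A missing age would satisfy "age le 8", so handle it first. */
    IF MISSING(age) THEN child = .;
    ELSE IF age le 8 THEN child = 1;   /* 8 years and younger */
    ELSE child = 0;                    /* older than 8 years  */
RUN;
```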

Part A: Submit your program log and output (2 points)

Part B: Answer the following questions: (8 points)

What was the mean emergency department length of stay (EDMIN) for patients 8 years and younger?

What was the median pulse rate (pulse1) of the participants who are older than 8 years?

What percentage of the hypotensive population is black?

How many of the people who died were male?
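The WHERE statement approach mentioned in Part B could look roughly like the sketch below, shown for the first question only. It assumes age and EDMIN are numeric variables in the NTDB dataset; the explicit missing-value check is an addition, since a missing age would otherwise pass the "le 8" test.

```sas
/* Sketch only: summary statistics for EDMIN among patients
   aged 8 years and younger, excluding records with no age. */
PROC MEANS DATA = coh611.ntdb N MEAN MEDIAN;
    WHERE age IS NOT MISSING AND age le 8;
    VAR edmin;
RUN;
```

The same pattern, with a different WHERE condition and VAR variable, would apply to the pulse rate question.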

If you are having trouble with this assignment, use the 2nd SAS assignment as a guide to help you write your code.

Here is assignment 2:

Assignment 2: Recoding Data

The unpleasant truth about data ...

Even with only a subset of the subjects and variables from the original study, and even with the data cleaned up pretty well prior to uploading the data set, you can see that there are some problems. If you only have 100 records, it might be feasible to go through these by hand and edit errors. However, if your data number in the thousands, it is not possible.

The Problem

Here is our problem: we want to create subsets of our data so that we can look at medication and education for patients who are overweight (more than 200 lbs) and are hypotensive (systolic blood pressure <91 mmHg).

The Solution

In this assignment you will learn two useful functions SAS has for recoding data, a few statements, and two more procedures. The complete program is below, followed by a step-by-step explanation.

Enter it exactly as below except that you will need to change the first statement from --enter-class-directory-here to match the LIBNAME in the email you received from your professor.

LIBNAME coh611 "/courses/dd4f9595ba27fe300" ACCESS=READONLY;

PROC FREQ DATA = coh611.survey2 ;

DATA cleandata ;

SET coh611.survey2;

If systolic_blood_pressure lt 91 then hypotensive = 1;

Else hypotensive = 0;

If weight gt 200 then overweight = 1;

Else overweight = 0;

PROC SORT DATA = cleandata ;

BY hypotensive;

PROC UNIVARIATE DATA = cleandata;

VAR age;

BY hypotensive;

Proc freq data = cleandata;

Where overweight = 1;

Tables race;

Proc freq data = cleandata;

Where overweight = 0;

Tables medication;

RUN ;

Part A: Submit your program log and output (2 points)

Part B: Answer the following questions: (8 points)

What was the mean age of the participants who were NOT hypotensive?

What was the median age of the participants who were hypotensive?

What percentage of the overweight population is black?

How many of the people who are not overweight take Xarelto?

NOTE that all statements end in a semi-colon!

Explanation of the program

LIBNAME coh611 "--enter-class-directory-here" ACCESS=READONLY;

This statement specifies the directory where the data set is located and gives students read-only access to that data: they cannot delete or modify it.

PROC FREQ DATA = coh611.survey2 ;

This statement begins the frequency procedure and specifies the dataset to be used. A TABLES statement, which is not used here, would specify the variable(s) for which frequency tables should be produced. Because it is omitted, SAS produces frequency tables for every variable in the data set.
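For comparison, a version that limits the output with a TABLES statement might look like the sketch below. It assumes race and medication (the variables used later in this program) exist in coh611.survey2.

```sas
/* One-way frequency tables for two variables only,
   instead of every variable in the data set. */
PROC FREQ DATA = coh611.survey2;
    TABLES race medication;
RUN;
```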

DATA cleandata ;

SET coh611.survey2;

The first statement creates a new data set named cleandata. The second statement reads into that new dataset data from the set named coh611.survey2.

If systolic_blood_pressure lt 91 then hypotensive = 1;

Else hypotensive = 0;

If weight gt 200 then overweight = 1;

Else overweight = 0;

In these statements, we tell SAS to look at the variable systolic_blood_pressure and code every value less than 91 as hypotensive (hypotensive = 1) and everything else as not hypotensive (hypotensive = 0). This creates a dummy variable called hypotensive, which lets us distinguish patients who are hypotensive from those who are not, based on the values in the blood pressure field. We do the same thing for weight, creating the dummy variable overweight.
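One caveat worth knowing: SAS treats a missing numeric value as less than any number, so a record with no blood pressure would be coded hypotensive = 1 by the statements above, and a record with no weight would be coded overweight = 0. The variation below is a sketch, not the assignment's program, that leaves the dummy variable missing in those cases. It may produce different counts from the program you are asked to submit.

```sas
DATA cleandata;
    SET coh611.survey2;
    /* Keep missing measurements missing instead of recoding them. */
    IF MISSING(systolic_blood_pressure) THEN hypotensive = .;
    ELSE IF systolic_blood_pressure lt 91 THEN hypotensive = 1;
    ELSE hypotensive = 0;

    IF MISSING(weight) THEN overweight = .;
    ELSE IF weight gt 200 THEN overweight = 1;
    ELSE overweight = 0;
RUN;
```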

PROC SORT DATA = cleandata ;

BY hypotensive;

Before you can use a BY statement to get means, or many other statistics, by group, you must sort the dataset by the grouping variable. The first statement begins the SORT procedure and specifies the data set. The second statement gives the variable to sort by.
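As an aside, the sort is needed for BY-group processing specifically. A CLASS statement in PROC MEANS also produces statistics by group and does not require sorted input. The sketch below is an alternative for illustration, not part of the assignment's program.

```sas
/* Group statistics without sorting: CLASS instead of BY. */
PROC MEANS DATA = cleandata N MEAN MEDIAN;
    CLASS hypotensive;
    VAR age;
RUN;
```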

PROC UNIVARIATE DATA = cleandata;

VAR age;

BY hypotensive;

The next step is to get detailed statistics (mean, median, mode, quantiles) for the age of the patients, separately for each value of hypotensive.
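PROC UNIVARIATE prints several tables per group, which can make the mean and median hard to find. One optional way to trim the output, sketched here, is an ODS SELECT statement naming only the tables you want. It assumes cleandata has already been sorted by hypotensive.

```sas
/* Show only the basic measures (mean, median, mode) and the
   quantiles table for each BY group. The selection applies to
   the next procedure step only. */
ODS SELECT BasicMeasures Quantiles;
PROC UNIVARIATE DATA = cleandata;
    VAR age;
    BY hypotensive;
RUN;
```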

Proc freq data = cleandata;

Where overweight = 1;

Tables race;

Proc freq data = cleandata;

Where overweight = 0;

Tables medication;

RUN ;

The final statements produce frequency tables based on whether patients are classified as overweight or not. Adding a WHERE statement restricts the results to only those patients you have specified.
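An alternative to running PROC FREQ once per subgroup is a two-way table, sketched below. The row percentages give the distribution of race (or medication) within each level of overweight. This is offered only as a comparison with the WHERE approach above, not as the assignment's required code.

```sas
/* Cross-tabulations; NOCOL and NOPERCENT suppress the column
   and overall percentages so only row percentages remain. */
PROC FREQ DATA = cleandata;
    TABLES overweight*race overweight*medication / NOCOL NOPERCENT;
RUN;
```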
