Question: Assignment 3: Creating variables

In this assignment, you will use a sample of the National Trauma Data Bank. The dataset is in the course directory with the name NTDB.

Up to this point you have just copied and run the statements. In this assignment you will be writing some simple code of your own.

LIBNAME COH611 "/courses/dd4f9595ba27fe300" ACCESS=READONLY;

HINT: You may have to delete the quotation marks and type them back in if you copy and paste this statement into SAS.

Part A: Exploring your data (2 points)

(Review assignment 1 if you don't remember how to write the SAS code to do this. Hint: Make sure you print the contents of the NTDB dataset and not the Survey2 dataset. Check the name!)

1. Print the contents of the dataset.
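A minimal sketch of this step, assuming the libref COH611 from the LIBNAME statement above and the dataset name NTDB given in the assignment:

```sas
/* Point the libref at the course directory (read-only) */
LIBNAME COH611 "/courses/dd4f9595ba27fe300" ACCESS=READONLY;

/* List the variable names, types, and labels in the NTDB sample */
PROC CONTENTS DATA = COH611.NTDB;
RUN;
```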

Part B: Data Management (8 points)

We are now going to create a dummy variable to indicate people who are 8 years and younger. You will need to use the age variable to do this, or you can use a WHERE statement.
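One way the dummy variable might be written is sketched below. The dataset name trauma and the variable name child are hypothetical, chosen only for illustration, and the sketch assumes that age is a numeric variable recorded in years. SAS treats a missing numeric value as smaller than any number, so the missing case is checked first to keep it out of the "8 and younger" group.

```sas
LIBNAME COH611 "/courses/dd4f9595ba27fe300" ACCESS=READONLY;

/* trauma and child are hypothetical names used for illustration */
DATA trauma;
    SET COH611.NTDB;
    /* Keep missing ages missing instead of coding them as children */
    IF MISSING(age) THEN child = .;
    ELSE IF age LE 8 THEN child = 1;
    ELSE child = 0;
RUN;
```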

Part A: Submit your program log and output (2 points)

Part B: Answer the following questions: (8 points)

What was the mean emergency department length of stay (EDMIN) for patients 8 years and younger?

What was the median pulse rate (pulse1) of the participants who are older than 8 years?

What percentage of the hypotensive population is black?

How many of the people who died were male?
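The first two questions could be approached with the WHERE route, sketched below under the assumption that age, EDMIN, and pulse1 are numeric variables in the NTDB sample and that age holds plain years with no special codes. The last two questions depend on NTDB variable names that the assignment does not give, so they are left out of the sketch.

```sas
LIBNAME COH611 "/courses/dd4f9595ba27fe300" ACCESS=READONLY;

/* Mean ED length of stay for patients aged 8 and younger.
   IS NOT MISSING keeps missing ages out of the subset. */
PROC MEANS DATA = COH611.NTDB MEAN;
    WHERE age IS NOT MISSING AND age LE 8;
    VAR EDMIN;
RUN;

/* Median pulse rate for patients older than 8 */
PROC MEANS DATA = COH611.NTDB MEDIAN;
    WHERE age GT 8;
    VAR pulse1;
RUN;
```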

If you are having trouble with this assignment, use the 2nd SAS assignment as a guide to help you write your code.

Here is assignment 2:

Assignment 2: Recoding Data

The unpleasant truth about data ...

Even with only a subset of the subjects and variables from the original study, and even with the data cleaned up pretty well prior to uploading the data set, you can see that there are some problems. If you only have 100 records, it might be feasible to go through these by hand and edit errors. However, if your data number in the thousands, it is not possible.

The Problem

Here is our problem: we want to create subsets of our data so that we can look at medication and education for patients who are overweight (more than 200 lbs) and patients who are hypotensive (systolic blood pressure < 91 mmHg).

The Solution

In this assignment you will learn two useful functions SAS has for recoding data, a few statements, and two more procedures. The complete program is below, followed by a step-by-step explanation.

Enter it exactly as below except that you will need to change the first statement from --enter-class-directory-here to match the LIBNAME in the email you received from your professor.

LIBNAME coh611 "/courses/dd4f9595ba27fe300" ACCESS=READONLY;

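/* Frequency tables for every variable in the original data */
PROC FREQ DATA = coh611.survey2 ;

/* Copy the data and add two dummy variables */
DATA cleandata ;
    SET coh611.survey2;
    IF systolic_blood_pressure lt 91 THEN hypotensive = 1;
    ELSE hypotensive = 0;
    IF weight gt 200 THEN overweight = 1;
    ELSE overweight = 0;

/* Sort so that the BY statement below works */
PROC SORT DATA = cleandata ;
    BY hypotensive;

/* Descriptive statistics for age within each hypotensive group */
PROC UNIVARIATE DATA = cleandata;
    VAR age;
    BY hypotensive;

/* Race among overweight patients */
PROC FREQ DATA = cleandata;
    WHERE overweight = 1;
    TABLES race;

/* Medication among patients who are not overweight */
PROC FREQ DATA = cleandata;
    WHERE overweight = 0;
    TABLES medication;

RUN ;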

Part A: Submit your program log and output (2 points)

Part B: Answer the following questions: (8 points)

What was the mean age of the participants who were NOT hypotensive?

What was the median age of the participants who were hypotensive?

What percentage of the overweight population is black?

How many of the people who are not overweight take Xarelto?

NOTE that all statements end in a semi-colon!

Explanation of the program

LIBNAME coh611 "--enter-class-directory-here" ACCESS=READONLY;

This statement specifies the directory where the data set is located. ACCESS=READONLY means that students can only read the data; they cannot delete or modify it.

PROC FREQ DATA = coh611.survey2 ;

This statement begins the frequency procedure and specifies the dataset to be used. A TABLES statement would specify the variable(s) for which frequency tables should be produced. Because it is omitted here, SAS produces frequency tables for every variable in the data set.
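To limit the output to particular variables, a TABLES statement can be added. The sketch below uses race and medication only because they appear later in this program; it assumes the coh611 libref has already been assigned.

```sas
/* Frequency tables for two variables only, instead of every variable */
PROC FREQ DATA = coh611.survey2;
    TABLES race medication;
RUN;
```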

DATA cleandata ;

SET coh611.survey2;

The first statement creates a new data set named cleandata. The second statement reads the data from coh611.survey2 into that new data set.

If systolic_blood_pressure lt 91 then hypotensive = 1;

Else hypotensive = 0;

If weight gt 200 then overweight = 1;

Else overweight = 0;

In these statements, we tell SAS to look at the variable systolic_blood_pressure: every value less than 91 is coded as hypotensive (hypotensive = 1) and every other value as not hypotensive (hypotensive = 0). This creates a dummy variable called hypotensive, which lets us distinguish patients who are hypotensive from those who are not, based on the values in the blood pressure field. We do the same thing for weight, creating the dummy variable overweight.
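One caveat with this kind of recoding: SAS treats a missing numeric value as smaller than any number, so a record with no blood pressure would satisfy "lt 91" and be coded as hypotensive. Whether survey2 contains missing values is not stated, so the following is only a sketch of how they could be kept missing, assuming both variables are numeric:

```sas
DATA cleandata;
    SET coh611.survey2;

    /* Check for missing first so it is not counted as a low reading */
    IF MISSING(systolic_blood_pressure) THEN hypotensive = .;
    ELSE IF systolic_blood_pressure LT 91 THEN hypotensive = 1;
    ELSE hypotensive = 0;

    /* A missing weight is not greater than 200, so without this check
       it would be coded as overweight = 0 */
    IF MISSING(weight) THEN overweight = .;
    ELSE IF weight GT 200 THEN overweight = 1;
    ELSE overweight = 0;
RUN;
```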

PROC SORT DATA = cleandata ;

BY hypotensive;

Before you can get means, or many other statistics, by group with a BY statement, you must sort the dataset by the grouping variable. The first statement begins the SORT procedure and specifies the data set. The second statement gives the variable to sort by.

PROC UNIVARIATE DATA = cleandata;

VAR age;

BY hypotensive;

The next step is to get the detailed statistics (mean, median, mode, quantiles) for the age of the patients, separately for hypotensive and non-hypotensive patients.
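As an aside, PROC UNIVARIATE also accepts a CLASS statement, which groups the output without requiring the data to be sorted first. A sketch of that alternative, assuming the cleandata data set created above:

```sas
/* Same statistics by group, using CLASS instead of SORT plus BY */
PROC UNIVARIATE DATA = cleandata;
    CLASS hypotensive;
    VAR age;
RUN;
```

The assignment's own program uses PROC SORT with BY, so that remains the version to submit.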

Proc freq data = cleandata;

Where overweight = 1;

Tables race;

Proc freq data = cleandata;

Where overweight = 0;

Tables medication;

RUN ;

The final statements produce frequency tables for patients classified as overweight or not overweight. Each WHERE statement restricts the results to only the patients you have specified.
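A cross-tabulation is another way to see the same breakdown in a single table. This is a sketch of an alternative, not part of the assigned program, and it assumes the cleandata data set created above:

```sas
/* Rows are overweight (0/1), columns are race */
PROC FREQ DATA = cleandata;
    TABLES overweight*race;
RUN;
```

In the resulting table, the Row Pct values in the overweight = 1 row give each race's share of the overweight patients.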
