Question: These are the steps you should take:

1. 10 Variables - Reduce the number of variables you are working with to include only the following:
a. The full dataset has about 100 variables in it. Reduce it to 10 variables. I have chosen 5 variables and I want you to choose 5 others. The reduction is done in a data step. See the starter SAS code at the end. Note that your other 5 variables should not be the same as those of others in the class. Choose your own. You will be working independently on this project.
b. Must include these variables.
i. Dmage (mother's age)
ii. Dfage (father's age)
iii. Dbirwt (child's birth weight in grams)
iv. Meduc (mother's education in years, 12 = high school)
v. Gestat (gestational age in weeks)
c. Choose 5 others from these:
i. Mrace (mother's race: 1 = white, 2 = black, 3-8 = other)
ii. Frace (father's race: 1 = white, 2 = black, 3-8 = other)
iii. Dmar (married: 1 = yes, 2 = single)
iv. Dlivord (delivery order: 1 = mother's first child, 2 = mother's second child)
v. Nprevist (number of prenatal visits)
vi. Csex (child's sex: 1 = female)
vii. Primac (C-section: 1 = yes)
viii. Tobacco (tobacco use: 1 = yes)
ix. Alcohol (alcohol use: 1 = yes)
x. Wtgain (weight gain)
xi. Fmaps (5-minute Apgar score)
When you are done with this step, you should have all the data for only 10 variables.
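A minimal sketch of this reduction, assuming the full dataset is `project.lbid99c` (the name used in the starter code at the end) and that the `project` libref has already been assigned. The output name `work.step1` is hypothetical, and the five optional variables shown are simply the ones listed in the starter code.

```sas
/* Keep only the 10 chosen variables. The KEEP= data set option
   drops every other variable as the data are read. */
data work.step1;   /* work.step1 is a hypothetical output name */
    set project.lbid99c (keep=Dmage Dfage Dbirwt Meduc Gestat
                              Nprevist Csex Tobacco Wtgain Fmaps);
run;

/* List the variable names to confirm that exactly 10 remain. */
proc contents data=work.step1 short;
run;
```

Using `KEEP=` on the input dataset rather than a `keep` statement means the unwanted variables are never loaded, which can matter with about 100 variables. Either form satisfies the requirement that the reduction happens in a data step.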
2. Remove missing data.
Once you have reduced the dataset to 10 variables, remove the missing data for each variable. When doing this step, document each step with comments in the SAS code before the data step. I want to see how you removed records with missing data at each step. For instance, let's say there were 150,000 observations before you started removing records with missing data. Let's take mother's age and remove all records where mother's age is missing. Let's say that 1,000 records had a missing mother's age. Your reduced dataset after this first step would have 149,000 observations remaining. Put a comment in your code prior to the data step that describes how records were removed for each variable. Do this for all the other variables you have chosen.
When you are done with this step, you should have one total data step which includes the code used to reduce the dataset from 100 variables to 10 variables and each of the steps needed to remove all the records with missing data. Note that 99 usually means missing data for most of these variables. Ask if you are unsure which values should be considered missing.
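One way to get the per-variable counts for the comments is to let the data step tally them and write them to the log. The following is a sketch under stated assumptions, not the required solution: the dataset name `project.lbid99c` comes from the starter code, `work.clean`, `keepme`, and the `n_` counters are hypothetical names, and 99 is assumed to be the missing code for the three variables shown (confirm each code before relying on it).

```sas
/* Reduce to 10 variables and remove missing records in one data step.
   Records are tested one variable at a time, so each counter holds the
   number of records removed at that stage of the sequence. */
data work.clean;
    set project.lbid99c (keep=Dmage Dfage Dbirwt Meduc Gestat
                              Nprevist Csex Tobacco Wtgain Fmaps)
        end=last;

    keepme = 1;   /* hypothetical flag: 1 = record survives all checks */

    /* Sum statements (n + 1) retain their value across observations. */
    if missing(Dmage) or Dmage = 99 then do;
        n_dmage + 1;
        keepme = 0;
    end;
    else if missing(Dfage) or Dfage = 99 then do;
        n_dfage + 1;
        keepme = 0;
    end;
    else if missing(Meduc) or Meduc = 99 then do;
        n_meduc + 1;
        keepme = 0;
    end;
    /* ...repeat the same pattern for the remaining variables... */

    if keepme then output;

    /* After the last input record, write the counts to the log. */
    if last then do;
        put "Removed for missing Dmage: " n_dmage;
        put "Removed for missing Dfage: " n_dfage;
        put "Removed for missing Meduc: " n_meduc;
    end;

    drop keepme n_dmage n_dfage n_meduc;
run;
```

Because the checks are chained with `else if`, a record missing several variables is counted only once, against the first variable tested. That matches the 150,000 to 149,000 bookkeeping described above.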
3. Recode section.
a. Dmage (mother's age) - create a new variable called MotherAG with 4 to 5 different age groups. You choose the age group breakdown.
b. Dfage (father's age) - same thing for this variable.
c. Meduc (mother's education in years) - recode into something that makes sense to you. I don't care how you pick the groups; I'm only interested in how you code what you chose. Call this new variable MotherEd.
d. Dbirwt (child's weight in grams) - create a new variable LBW (low birth weight). If Dbirwt is under 2500 then LBW = yes; 2500 or more = no.
e. Gestat (gestational age in weeks) - create a new variable called FullTerm = yes if 37 or more weeks; code no otherwise.
f. Look at the other 5 variables you chose and recode any variable with more than 2 levels. Choose something that makes sense.
g. Document these changes as comments in your SAS code.
When you are done with this section, you should have one data step that includes everything so far: the reduction of variables, the removal of missing data, and now all recoding of variables.
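The recodes in items a, d, and e can be sketched as below. The small `work.sample` dataset is made up so the example runs on its own, and the age cut points are illustrative only, since the assignment leaves them to the student. `LBW` and `FullTerm` follow the thresholds stated above.

```sas
/* Tiny made-up sample so the sketch runs without the real data. */
data work.sample;
    input Dmage Dbirwt Gestat;
    datalines;
17 2300 35
24 3400 39
31 2500 37
42 4100 41
;
run;

data work.recoded;
    set work.sample;

    /* Declare lengths first so character values are not truncated. */
    length MotherAG $5 LBW FullTerm $3;

    /* Mother's age in 4 groups (cut points are an assumption). */
    if Dmage < 20 then MotherAG = "<20";
    else if Dmage < 30 then MotherAG = "20-29";
    else if Dmage < 40 then MotherAG = "30-39";
    else MotherAG = "40+";

    /* Low birth weight: under 2500 g = yes, 2500 g or more = no. */
    if Dbirwt < 2500 then LBW = "yes";
    else LBW = "no";

    /* Full term: 37 or more weeks = yes, otherwise no. */
    if Gestat >= 37 then FullTerm = "yes";
    else FullTerm = "no";
run;

proc print data=work.recoded;
run;
```

SAS treats a missing numeric value as smaller than any number, so a missing `Dmage` would fall into the "<20" group and a missing `Dbirwt` would be coded as low birth weight. This sketch therefore assumes the missing records were already removed in step 2.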
4. Add labels of your choice to all variables. Add comments to describe your work.
5. Add formats of your choice to all variables. Add comments to describe your work.
6. Remove unnecessary variables.
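A sketch of steps 4 and 5 on a made-up sample. The format names `csexf` and `yn`, the label text, and the dataset names are all hypothetical. The source only states that 1 = female and 1 = yes, so the meaning assigned to code 2 in each format is an assumption to check against the codebook.

```sas
/* User-defined formats for two of the coded variables. */
proc format;
    value csexf 1 = "Female"
                2 = "Male";   /* assumption: only 1 = female is given */
    value yn    1 = "Yes"
                2 = "No";     /* assumption: only 1 = yes is given */
run;

/* Tiny made-up sample so the sketch runs without the real data. */
data work.sample;
    input Dmage Csex Tobacco;
    datalines;
24 1 2
31 2 1
;
run;

data work.final;
    set work.sample;

    /* Step 4: descriptive labels. */
    label Dmage   = "Mother's age (years)"
          Csex    = "Child's sex"
          Tobacco = "Tobacco use";

    /* Step 5: attach the user-defined formats. */
    format Csex csexf. Tobacco yn.;
run;

/* The LABEL option prints labels in place of variable names. */
proc print data=work.final label;
run;
```

For step 6, a `drop` statement naming the variables that are no longer needed can go in the same data step, so everything stays in one step.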
*my 10 variables are;
*dmage - mothers age;
*dfage - fathers age;
*dbirwt - childs weight;
*meduc - mothers education;
*gestat - gestational age;
*Nprevist - number of prenatal visits;
*Csex - Childs sex 1 = female;
*Tobacco - tobacco use 1 = yes;
*Wtgain - weight gain;
*Fmaps - 5 minutes Apgar score;
/* Step 1: Reduce dataset to 10 variables */
data project.one;
set project.lbid99c;
keep Dmage Dfage Dbirwt Meduc Gestat Nprevist Csex Tobacco Wtgain Fmaps;
run;
/* Step 2: Remove records with missing Dmage (99 indicates missing) */
/* 1000 records removed */
data project.one;
set project.lbid99c;
where Dmage ne 99;
run;
/* Step 2: Remove records with missing Dfage */
/* 800 records removed */
data project.one;
set project.lbid99c;
where Dfage ne 99;
run;
/* Step 2: Remove records with missing Dbirwt */
/* 500 records removed */
data project.one;
set project.lbid99c;
where Dbirwt ne 99;
run;
This SAS code is what I have started, but I am having a rough time keeping it going.
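One thing worth checking in the starter code: every data step reads `project.lbid99c` again and overwrites `project.one`, so only the last filter would survive and the `keep` from Step 1 would be lost. A hedged sketch of one combined data step follows. It assumes 99 is the missing code for all three variables, which may not hold for birth weight in grams, so confirm it first.

```sas
/* Single data step: reduce to 10 variables and filter missing values.
   Assumption: 99 marks missing for Dmage, Dfage, and Dbirwt. */
data project.one;
    set project.lbid99c;

    /* One WHERE statement applies all three filters as the data are read. */
    where Dmage ne 99 and Dfage ne 99 and Dbirwt ne 99;

    keep Dmage Dfage Dbirwt Meduc Gestat Nprevist Csex Tobacco Wtgain Fmaps;
run;
```

A combined `where` only shows the total number of records kept in the log. To get the count removed for each variable, run the filters one at a time while developing and note the log counts in the comments.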