

Question: Given the two National Health Interview Survey datasets, the data dictionaries, and the scoring document, achieve the following:

1. Clean the data in R:
   a. In the child dataset, identify and print the duplicate records with the same ID value. Eliminate the duplicate record for each instance so there is only one of each ID.
   b. In the child dataset, identify and print records with values that are out of range for the following variables (consult data dictionary). Determine how best to deal with these values. Document your decisions in the dataset cover sheet (see #3 below).
      i. BSCNWPPL_C
      ii. BSCNWPLCS_C
      iii. BSCCHG_C
      iv. BSCHLOPPL_C
      v. BSCCRYALT_C
      vi. BSCCLMDWN_C
      vii. BSCFUSSY_C
      viii. BSCSTHE_C
      ix. BSCSCHD_C
      x. BSCPTSLP_C
      xi. BSCSTYSLP_C
      xii. BSCPRLKSL_C
   c. In the child dataset, convert the above listed variables into factors and define the levels according to the data dictionary.
   d. In the child dataset:
      i. Create a new variable AGECAT grouping the age of the child (AGEP_C) into the following categories:
         1. 0 - 7.99
         2. 8 - 12.99
         3. 13 - 18
      ii. Generate frequencies for the new age groups.
      iii. Generate a list of all records who have either had borderline diabetes or prediabetes. Save these records to a new dataset called DIABETES.
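
The cleaning tasks in part 1 could be approached along the following lines in base R. This is only a sketch on a small made-up data frame, not the expected solution: the column names `ID` and `PREDIB_C`, the set of valid codes, and the factor labels are all assumptions, since neither the datasets nor the data dictionary are reproduced here. Setting out-of-range values to `NA` is one possible decision among several and would need to be documented in the cover sheet.

```r
# Toy stand-in for the child dataset (made-up values, not NHIS data).
# ID and PREDIB_C are assumed column names; check the data dictionary.
child <- data.frame(
  ID         = c(1, 2, 2, 3, 4, 5),
  BSCNWPPL_C = c(1, 2, 2, 5, 3, 9),
  BSCFUSSY_C = c(3, 1, 1, 2, 0, 1),
  AGEP_C     = c(2, 8, 8, 13, 7, 18),
  PREDIB_C   = c(2, 1, 1, 2, 1, 2)
)
bsc_vars <- c("BSCNWPPL_C", "BSCFUSSY_C")  # extend to all twelve variables

# 1a. Print every record whose ID occurs more than once, then keep the
#     first record for each ID.
dup_ids <- unique(child$ID[duplicated(child$ID)])
print(child[child$ID %in% dup_ids, ])
child <- child[!duplicated(child$ID), ]

# 1b. Print records with codes outside the assumed valid set, then set
#     those codes to NA (one possible way to handle them).
valid_codes <- c(1, 2, 3, 7, 8, 9)  # assumption, not from the dictionary
out_of_range <- !sapply(child[bsc_vars],
                        function(x) is.na(x) | x %in% valid_codes)
print(child[rowSums(out_of_range) > 0, ])
for (v in bsc_vars) {
  child[[v]][out_of_range[, v]] <- NA
}

# 1c. Convert the variables to factors; the labels are placeholders.
bsc_labels <- c("Not at all", "Somewhat", "Very much",
                "Refused", "Not ascertained", "Don't know")
child[bsc_vars] <- lapply(child[bsc_vars], factor,
                          levels = valid_codes, labels = bsc_labels)

# 1d-i. AGECAT: 1 = 0-7.99, 2 = 8-12.99, 3 = 13-18.
#       right = FALSE makes the intervals [0, 8), [8, 13), [13, 18].
child$AGECAT <- cut(child$AGEP_C, breaks = c(0, 8, 13, 18),
                    labels = c("1", "2", "3"),
                    right = FALSE, include.lowest = TRUE)

# 1d-ii. Frequencies for the new age groups.
print(table(child$AGECAT, useNA = "ifany"))

# 1d-iii. Records reporting borderline diabetes or prediabetes,
#         assuming code 1 means "Yes".
DIABETES <- child[which(child$PREDIB_C == 1), ]
print(DIABETES)
```

Using `duplicated()` keeps the first record for each ID; if the duplicates differ in content, a different rule (for example, keeping the most complete record) may be more appropriate.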

2. Generate a permanent cross-sectional analysis data set.
   a. Create a new variable called HHX_NUM that is the HHX variable value without the H0 at the beginning (contains only the last 5 numbers of the variable).
   b. Generate frequencies for sex, age, Hispanic ethnicity, and general health status.
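
Part 2 might be sketched as follows, again on made-up data. The column names `SEX_C`, `HISP_C`, and `PHSTAT_C` are assumptions to be checked against the data dictionary, the output path is hypothetical, and the sketch builds the analysis data set from the child file only, since the question does not say how the two datasets should be combined. "Permanent" is read here as "written to disk".

```r
# Toy stand-in for the child dataset (made-up values, not NHIS data).
# SEX_C, HISP_C and PHSTAT_C are assumed column names.
child <- data.frame(
  HHX      = c("H012345", "H067890", "H011111"),
  SEX_C    = c(1, 2, 2),
  AGEP_C   = c(4, 10, 15),
  HISP_C   = c(2, 1, 2),
  PHSTAT_C = c(1, 3, 1),
  stringsAsFactors = FALSE
)

# 2a. Drop the leading "H0" so only the last five digits remain.
#     Kept as character so that any leading zeros are preserved.
child$HHX_NUM <- sub("^H0", "", child$HHX)

# 2b. One frequency table per variable.
freq_vars <- c("SEX_C", "AGEP_C", "HISP_C", "PHSTAT_C")
freqs <- lapply(child[freq_vars], table, useNA = "ifany")
print(freqs)

# Save the analysis data set to disk. The path is a placeholder;
# point it at a project directory to keep the file between sessions.
out_path <- file.path(tempdir(), "nhis_child_analysis.rds")
saveRDS(child, out_path)
```

If a numeric variable is required, `as.integer(child$HHX_NUM)` converts the digits, at the cost of dropping any leading zeros.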
