I need code for R to do the following for a large dataset.
1. Remove all columns that have more than 2000 NA values in them.
2. Remove all rows that have any NA values in them.
This is my chunk, but the dimensions of my result are way off.
NA.cols <- colSums(is.na(census))
keep.col <- NA.cols < 2000
keep.col
census.2 <- census[, keep.col]
census.2 <- cbind(NA.sum = rowSums(is.na(census)), census)
census.2 <- filter(census.2, NA.sum == NA)
census.2 <- select(census.2, -NA.sum)
dim(census.2)
Step by Step Solution
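Two things in the chunk above probably explain the odd dimensions. The cbind() line starts again from the original census, so the column filter stored in census.2 is thrown away. And NA.sum == NA always evaluates to NA, never TRUE, so filter() drops every row; a count of missing values should be compared with 0 instead. Below is a minimal base R sketch of the two steps. The toy data frame and the small na_threshold are assumptions made only so the example runs on its own; with the real data, census would be the asker's data frame and the threshold would be 2000.

```r
# Hypothetical toy data standing in for the real `census` data frame.
census <- data.frame(
  id             = 1:6,
  income         = c(10, NA, 30, 40, NA, 60),
  mostly_missing = c(NA, NA, NA, NA, 5, NA),
  age            = c(21, 35, NA, 50, 62, 48)
)

# Assumed small threshold for the toy data; use 2000 for the real dataset.
na_threshold <- 3

# Step 1: count NAs per column and keep columns with at most `na_threshold`
# of them (i.e. remove columns with MORE than the threshold).
na_per_col <- colSums(is.na(census))
census.2 <- census[, na_per_col <= na_threshold, drop = FALSE]

# Step 2: starting from the column-filtered data (not the original),
# keep only the rows that have no NA in any remaining column.
census.2 <- census.2[complete.cases(census.2), , drop = FALSE]

dim(census.2)
```

The order matters: dropping the mostly-empty columns first means rows are only discarded for NAs in columns that are actually kept. If the dplyr style is preferred, the row step could instead compute rowSums(is.na(census.2)) and filter on that count being equal to 0.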
