Data Clean-up

The dataset to be cleaned up can be downloaded from Google Drive:

https://drive.google.com/file/d/0B1D66qK8jxd0YTI3cUU4R1VDa0U/view?usp=sharing

1. Create a separate repository and push the attached dataset (dirty_data.csv)

2. Populate the missing values in the Area variable with appropriate values (Birmingham, Coventry, Dudley, Sandwell, Solihull, Walsall or Wolverhampton)

3. Remove special characters and padding (the white space before and after the text) from the Street 1 and Street 2 variables. Make sure the first letters of street names are capitalized and the street denominations follow the same standard (for example, all streets are indicated as str., avenues as ave., etc.)

4. If the value in Street 2 duplicates the value in Street 1, remove the value in Street 2

5. Remove the Strange HTML column
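
One possible approach to step 2, sketched in base R. It assumes (this is not stated in the task) that rows are grouped by area and that only the first row of each group carries the Area value, so a blank can be filled from the nearest non-missing value above it. The helper name `fill_down` is hypothetical. If the real file is not ordered that way, the Area has to be derived differently, for example from a lookup on another column.

```r
# Allowed values, taken from the task description.
valid_areas <- c("Birmingham", "Coventry", "Dudley", "Sandwell",
                 "Solihull", "Walsall", "Wolverhampton")

# Hypothetical helper: carry the last non-missing value forward.
# ASSUMPTION: a missing Area belongs to the same area as the row above it.
fill_down <- function(x) {
  x <- as.character(x)
  x[!is.na(x) & trimws(x) == ""] <- NA   # treat blank strings as missing
  present <- !is.na(x)
  idx <- cumsum(present)                 # position of the last value seen so far
  # idx == 0 means nothing seen yet, so leading gaps stay NA
  c(NA_character_, x[present])[idx + 1]
}

area <- c("Birmingham", "", NA, "Coventry", " ")
filled <- fill_down(area)

# Sanity check: every filled value is one of the allowed areas.
stopifnot(all(filled[!is.na(filled)] %in% valid_areas))
```

On the real data this would be applied as `df$Area <- fill_down(df$Area)`, where `df` is the data frame read from dirty_data.csv.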

Complete the cleanup code and push the changes to the repository.
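
Steps 3 to 5 can be sketched in base R as follows. This is a minimal sketch, not a reference solution. The column names `Street 1`, `Street 2` and `Strange HTML` are assumed to appear exactly like that in the file, the function names are hypothetical, and the suffix table covers only a few denominations. "Special characters" is interpreted here as anything other than ASCII letters, digits, spaces and dots, which also strips apostrophes and accented letters; adjust that rule if the data calls for it.

```r
# Hypothetical helper: normalise one street column (step 3).
clean_street <- function(x) {
  x <- as.character(x)
  x <- gsub("[^A-Za-z0-9[:space:].]", "", x)  # drop special characters
  x <- gsub("[[:space:]]+", " ", x)           # collapse repeated white space
  x <- trimws(x)                              # remove padding
  x <- tolower(x)
  x <- gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)  # capitalise each word

  # ASSUMPTION: the denomination is the last word of the value.
  # Anchoring at the end avoids rewriting names such as "St Johns Road".
  suffixes <- c(
    "street|str|st" = "str.",
    "avenue|ave|av" = "ave.",
    "road|rd"       = "rd.",
    "lane|ln"       = "ln."
  )
  for (p in names(suffixes)) {
    x <- sub(paste0("\\b(", p, ")\\.?$"), suffixes[[p]], x,
             ignore.case = TRUE, perl = TRUE)
  }
  x
}

# Hypothetical wrapper covering steps 3, 4 and 5.
clean_addresses <- function(df, street1 = "Street 1", street2 = "Street 2",
                            html = "Strange HTML") {
  df[[street1]] <- clean_street(df[[street1]])
  df[[street2]] <- clean_street(df[[street2]])

  # Step 4: blank out Street 2 where it repeats Street 1.
  dup <- !is.na(df[[street1]]) & !is.na(df[[street2]]) &
    df[[street1]] == df[[street2]]
  df[[street2]][dup] <- NA

  # Step 5: drop the HTML column.
  df[[html]] <- NULL
  df
}

toy <- data.frame(
  `Street 1`     = c("  #kings   ROAD$ ", "High Street", "Park Ave"),
  `Street 2`     = c("Kings Rd.", NA, "Flat 2"),
  `Strange HTML` = c("<br>", "<p>", ""),
  check.names = FALSE, stringsAsFactors = FALSE
)
out <- clean_addresses(toy)
```

Cleaning both columns before comparing them means that values differing only in padding, case or abbreviation (as in the first toy row) are still recognised as duplicates.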

Submit a link to the repository. The repository will contain:

Combined code (.r or .rmd)

Original (dirty) dataset

New (clean) dataset
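
A possible skeleton for the combined script that reads the original file and writes the clean one. The input name dirty_data.csv comes from the task; the output name clean_data.csv and the function name `run_cleanup` are assumptions made for illustration. The cleaning logic itself is passed in as a function, so the skeleton does not depend on how the individual steps are implemented.

```r
# Hypothetical driver: read the dirty file, apply a cleaning function, write the result.
run_cleanup <- function(clean_fun,
                        in_path  = "dirty_data.csv",
                        out_path = "clean_data.csv") {  # output name is an assumption
  dirty <- read.csv(in_path,
                    check.names = FALSE,        # keep names such as "Street 1" intact
                    stringsAsFactors = FALSE,
                    na.strings = c("", "NA"))   # read empty fields as missing
  clean <- clean_fun(dirty)
  write.csv(clean, out_path, row.names = FALSE, na = "")
  invisible(clean)
}
```

Calling `run_cleanup(identity)` would simply copy the data through unchanged; replace `identity` with the function that performs the cleaning steps.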
