Question:

Part I: Creating project
1. Create a self-contained project in RStudio (refer to the lecture Data, project, and native R plotting).
2. Create a new script under the src sub-directory in the project folder, and write all commands you use for this assignment in this script.

Part II: Data preparation
3. Download the dataset HK_properties.xlsx from Moodle, and save it into the sub-directory named data in the project folder.
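The folder layout that steps 1 to 3 (and the later output and img steps) rely on could be set up roughly as follows. This is only a sketch: the project root used here (a temporary folder named hk_properties_project) and the script name assignment.R are hypothetical, and in practice RStudio's New Project dialog creates the root folder and the .Rproj file.

```r
# Hypothetical project root; in RStudio this would be the folder that holds the .Rproj file.
project.root <- file.path(tempdir(), "hk_properties_project")

# Sub-directories named in the assignment: src, data, output, img.
sub.dirs <- c("src", "data", "output", "img")
for (d in sub.dirs) {
  dir.create(file.path(project.root, d), recursive = TRUE, showWarnings = FALSE)
}

# Empty script under src; the file name is an assumption.
script.path <- file.path(project.root, "src", "assignment.R")
if (!file.exists(script.path)) {
  file.create(script.path)
}

print(list.files(project.root))
```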

HK_properties.xlsx:

4. Use an R statement to import the data into a data frame called hkp. Do NOT import character fields as factors.
   o You may use the parameter stringsAsFactors=FALSE.
5. Use an R statement to check the first 6 lines of hkp to make sure it was imported correctly.

Part III: Data manipulation & analytics using R commands
6. Add a new row to hkp: id = '6 Kowloon Tong','Block 3', block = 'Middle Floor', direction = 'South', hkd.m = 10.2, year = 2007, room = 3, gross.area = 716; and set the rest of the columns to NA.
   o Hint: a viable approach is to use the rbind() function.
7. To check that your new row was added, print the last 10 rows of hkp.
8. Check the data types of the columns. If some columns have the wrong data type, correct them.
   o Hint: columns 5~11 are supposed to be numeric; convert them using the as.numeric() function. Otherwise, calculations cannot be carried out!
9. Make a new data frame called hkp1996 that contains all columns of hkp, but only for years from 1996 forward (including 1996). You are doing this because only in 1996 and after do all of the floor types have data.
10. Put 4 columns of hkp1996 (hkd.m, gross.area, price.ft.sq, and efficiency.ratio) into a new data frame named hkp1996.value.
11. Find the medians of hkd.m, gross.area, price.ft.sq, and efficiency.ratio in hkp1996.value using one single line of R code.
   o Hint: you may use apply() with the median() function; remember to use the parameter na.rm = T to cope with NA.
12. Calculate the mean of price.ft.sq for every floor type in hkp1996. Put it into a data frame called hkp1996mean.
   o Hint: you may use the aggregate() function; watch out for NA values.
13. Repeat the previous step for hkp, and place the result in an object called hkpMean.
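One possible way to work through steps 4 to 13 is sketched below on a small made-up data frame, since HK_properties.xlsx itself is not reproduced here. The toy values, the column name floor, and the reduced set of columns are assumptions; the real file has more columns (the hint mentions columns 5~11), so the numeric conversion there would be done by position or by the real column names. The commented import line assumes the xlsx package, which is only one of several readers that could be used.

```r
# Toy stand-in for the imported data (all values invented for illustration).
# With the real workbook the import might look like this (assumes the xlsx package):
# hkp <- xlsx::read.xlsx("data/HK_properties.xlsx", sheetIndex = 1, stringsAsFactors = FALSE)
hkp <- data.frame(
  floor            = c("Low Floor", "Middle Floor", "High Floor",
                       "Low Floor", "Middle Floor", "High Floor"),
  year             = c("1994", "1996", "1997", "1998", "2001", "2005"),
  hkd.m            = c("2.1", "3.5", "4.0", "3.2", NA, "6.8"),
  gross.area       = c("500", "620", "710", "480", "655", "800"),
  price.ft.sq      = c("4000", "5000", "6000", "7000", "7000", "8000"),
  efficiency.ratio = c("0.80", "0.78", NA, "0.82", "0.79", "0.75"),
  stringsAsFactors = FALSE
)

# Step 5: inspect the first 6 rows.
print(head(hkp))

# Step 6: build a one-row data frame with the same columns, all NA,
# fill in the known values, then append it with rbind().
new.row <- hkp[1, ]
new.row[1, ] <- NA
new.row$floor <- "Middle Floor"
new.row$year <- 2007
new.row$hkd.m <- 10.2
new.row$gross.area <- 716
hkp <- rbind(hkp, new.row)
rownames(hkp) <- NULL

# Step 7: the new row should appear at the bottom.
print(tail(hkp, 10))

# Step 8: check the column types, then convert the numeric-looking columns.
str(hkp)
num.cols <- c("year", "hkd.m", "gross.area", "price.ft.sq", "efficiency.ratio")
hkp[num.cols] <- lapply(hkp[num.cols], as.numeric)

# Step 9: keep 1996 and later; which() drops rows whose year is NA.
hkp1996 <- hkp[which(hkp$year >= 1996), ]

# Step 10: the four value columns.
hkp1996.value <- hkp1996[, c("hkd.m", "gross.area", "price.ft.sq", "efficiency.ratio")]

# Step 11: column medians in one line, ignoring NA.
hkp1996.median <- apply(hkp1996.value, 2, median, na.rm = TRUE)
print(hkp1996.median)

# Steps 12 and 13: mean price per floor type, ignoring NA.
hkp1996mean <- aggregate(hkp1996$price.ft.sq, by = list(hkp1996$floor),
                         FUN = mean, na.rm = TRUE)
hkpMean <- aggregate(hkp$price.ft.sq, by = list(hkp$floor),
                     FUN = mean, na.rm = TRUE)
print(hkp1996mean)
print(hkpMean)
```

With the by = list(...) form, aggregate() names the result columns Group.1 and x, which is why a later step asks for them to be renamed.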

14. Rename the two columns of hkp1996mean to floor and price.1996.mean; rename the two columns of hkpMean to floor and price.mean.
15. Then, add a column to hkpMean called price.1996.mean that contains the means calculated in hkp1996mean.
16. In hkpMean, create another column called mean.diff that is the difference between price.1996.mean and price.mean (i.e., price.1996.mean - price.mean).
17. Write hkpMean to an Excel workbook file under the output sub-folder.
   o Hint: the Excel file looks like this
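Steps 14 to 17 might look something like the sketch below. The two starting data frames are made-up stand-ins for the aggregate() results (with the default Group.1 and x column names), the output file name hkpMean.xlsx is hypothetical, and the write step assumes the openxlsx package; if that package is not installed, the sketch only prints a message instead of writing a workbook.

```r
# Made-up stand-ins for the aggregate() results from the previous steps.
hkp1996mean <- data.frame(Group.1 = c("High Floor", "Low Floor", "Middle Floor"),
                          x = c(7000, 7000, 6000),
                          stringsAsFactors = FALSE)
hkpMean <- data.frame(Group.1 = c("High Floor", "Low Floor", "Middle Floor"),
                      x = c(7000, 5500, 6000),
                      stringsAsFactors = FALSE)

# Step 14: rename the columns.
names(hkp1996mean) <- c("floor", "price.1996.mean")
names(hkpMean) <- c("floor", "price.mean")

# Step 15: bring over the 1996+ means, matched by floor type rather than by row order.
idx <- match(hkpMean$floor, hkp1996mean$floor)
hkpMean$price.1996.mean <- hkp1996mean$price.1996.mean[idx]

# Step 16: difference between the two means.
hkpMean$mean.diff <- hkpMean$price.1996.mean - hkpMean$price.mean
print(hkpMean)

# Step 17: write to an Excel workbook under output (file name is an assumption).
out.dir <- "output"
dir.create(out.dir, showWarnings = FALSE)
out.file <- file.path(out.dir, "hkpMean.xlsx")
if (requireNamespace("openxlsx", quietly = TRUE)) {
  openxlsx::write.xlsx(hkpMean, file = out.file)
} else {
  message("openxlsx is not installed; skipping the Excel export.")
}
```

Matching on floor is a small safety measure: it keeps the means aligned even if the two data frames list the floor types in a different order.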


Part IV: Simple Chart
18. Make a scatter plot of price.ft.sq versus gross.area for year 1996 and later. Save the chart as an image into a sub-directory named img.
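A minimal sketch of step 18 with native R plotting is shown below. The data frame is a made-up stand-in for hkp1996, and the image file name, size, axis labels, and title are assumptions; price.ft.sq goes on the y-axis and gross.area on the x-axis, following the "versus" wording.

```r
# Made-up stand-in for hkp1996 (rows from 1996 onward).
hkp1996 <- data.frame(
  gross.area  = c(620, 710, 480, 655, 800, 716),
  price.ft.sq = c(5000, 6000, 7000, 7000, 8000, NA)
)

# Create the img sub-directory if it does not exist yet.
dir.create("img", showWarnings = FALSE)
img.file <- file.path("img", "price_vs_area_1996.png")

# Open a PNG device, draw the scatter plot, then close the device to write the file.
png(img.file, width = 800, height = 600)
plot(hkp1996$gross.area, hkp1996$price.ft.sq,
     xlab = "gross.area",
     ylab = "price.ft.sq",
     main = "price.ft.sq vs gross.area, 1996 and later")
dev.off()
```

Rows with NA in either column are simply not drawn by plot(), so no extra filtering is needed for the chart.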
