2. Data Frame Management:
(a) Download and import the diamond data set from Blackboard.
(b) Remove the depth column and rename the columns x, y and z as length, width and depth.
(c) The missing values are encoded in different ways in the data set, i.e. missing, MISSING and NA. Count the occurrences of each encoding.
(d) Assign NA to all missing values found above.
(e) Find the proportion of missing values per column.
(f) Remove all missing values from the data set and create a new data set data.complete.
(g) Check the structure of the data set and convert the columns to the appropriate data types.
(h) For the numeric columns, replace the missing values with the column mean.
(i) Make the cut, color and clarity columns into factors.
(j) Reorder the data frame by the carat variable in descending order and output the first 6 rows. (Hint: ?order)
(k) Take a subset of the data frame so that diamonds with at least 0.2 carat, I color or above (D is the highest colorless diamond grade), VVS1 or VVS2 clarity, and a price between $330 and $400 are kept. How many diamonds in the data set satisfy this condition?
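
One possible way to approach parts (b) to (k) in base R is sketched below. The Blackboard file is not available here, so the sketch runs on a small made-up data frame; the object name diamonds, the file name in the commented-out read.csv call, and all values are assumptions for illustration only. The price bounds are treated as inclusive, which is also an assumption.

```r
# (a) With the real file, something like this would be used instead
#     (hypothetical file name):
# diamonds <- read.csv("diamonds.csv", stringsAsFactors = FALSE)

# Toy stand-in for the real data; all values are invented.
diamonds <- data.frame(
  carat   = c("0.23", "0.31", "missing", "0.24", "0.30", "0.22"),
  cut     = c("Ideal", "Good", "Premium", "MISSING", "Very Good", "Ideal"),
  color   = c("E", "J", "I", "D", "H", "F"),
  clarity = c("VVS1", "SI2", "VVS2", "VVS1", NA, "VVS2"),
  depth   = c(61.5, 63.3, 62.4, 62.8, 62.7, 61.0),
  price   = c("340", "335", "380", NA, "351", "404"),
  x       = c(3.95, 4.34, 4.20, 3.94, 4.21, 3.90),
  y       = c(3.98, 4.35, 4.23, 3.96, 4.27, 3.92),
  z       = c(2.43, 2.75, 2.63, 2.48, 2.66, 2.40),
  stringsAsFactors = FALSE
)

# (b) Drop the original depth column, then rename x, y, z.
diamonds$depth <- NULL
names(diamonds)[match(c("x", "y", "z"), names(diamonds))] <-
  c("length", "width", "depth")

# (c) Count each encoding of a missing value.
n_lower <- sum(diamonds == "missing", na.rm = TRUE)
n_upper <- sum(diamonds == "MISSING", na.rm = TRUE)
n_na    <- sum(is.na(diamonds))

# (d) Recode the text encodings as NA, column by column.
diamonds[] <- lapply(diamonds, function(col) {
  col[col %in% c("missing", "MISSING")] <- NA
  col
})

# (e) Proportion of missing values per column.
prop_missing <- colMeans(is.na(diamonds))

# (f) Keep only the rows without any missing value.
data.complete <- na.omit(diamonds)

# (g) Inspect the structure, then convert text columns that hold numbers.
str(diamonds)
diamonds$carat <- as.numeric(diamonds$carat)
diamonds$price <- as.numeric(diamonds$price)

# (h) Replace NA in each numeric column with that column's mean.
num_cols <- sapply(diamonds, is.numeric)
diamonds[num_cols] <- lapply(diamonds[num_cols], function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

# (i) Convert the categorical columns to factors.
diamonds$cut     <- factor(diamonds$cut)
diamonds$color   <- factor(diamonds$color)
diamonds$clarity <- factor(diamonds$clarity)

# (j) Sort by carat, largest first, and show the first 6 rows.
sorted <- diamonds[order(diamonds$carat, decreasing = TRUE), ]
head(sorted, 6)

# (k) Colors D to I count as "I or above"; %in% is FALSE for NA,
#     so rows with a missing color or clarity are dropped.
keep <- with(diamonds,
  carat >= 0.2 &
  color %in% c("D", "E", "F", "G", "H", "I") &
  clarity %in% c("VVS1", "VVS2") &
  price >= 330 & price <= 400
)
subset_k <- diamonds[keep, ]
nrow(subset_k)
```

Note that (f) and (h) handle missing values in two different ways: data.complete drops incomplete rows, while diamonds keeps every row and fills numeric gaps with the column mean. The count in (k) therefore depends on which of the two data frames the filter is applied to.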
