Open notes, open WWW, work on your own. You should use R and load the following packages: rpart and rpart.plot.
(Place final answers in provided cells adjacent to questions.)
(Round answers to this question to 2 decimal places)
The data were recorded in comma-separated values (CSV) form in the file "whitewines.csv". Let's import the "whitewines.csv" data
into R.
You might remember the lab you did on wine. We are going to do a similar activity as that lab with a new wine data set.
Remember that the winemaking industry, over the years, has invested heavily in data collection and machine learning
methods that may assist in creating high-quality wine. A review written by a critic often determines whether a bottle ends up
on the top or bottom shelf.
We want to create a systematic way of mimicking 'expert' wine ratings. This could help winemakers identify key factors that
contribute to better-rated wines. Such a system would not suffer from the subjectivity inherent in human tastings, such as
mood and/or palate fatigue. A machine learning algorithm may result in better quality wine as well as more objective,
consistent, and fair ratings.
The white wine data includes values of 11 chemical properties of a large sample of white wines. For each wine, a laboratory
analysis measured characteristics such as the acidity, sugar content, chloride, sulfur, alcohol, pH, density, and more. The
samples were then rated in a blind tasting by panels of no fewer than three judges on a quality scale ranging from 0 (very bad)
to 10 (excellent). In the case that judges disagreed on the rating, the median value was used.
a) What type of feature is your target?
A
b) How many examples and features do we have in the data set? (Use a
comma between the two answers, no space)
A
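Parts (a) and (b) can be checked directly in R once the file is loaded. The following is a minimal sketch, assuming the file sits in the working directory and that the rating column is named quality (the question does not state the column name, so that is an assumption). describe_wines is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: summarise a wine data frame.
# Assumption: the target column is named "quality" (not stated in the question).
describe_wines <- function(ww, target = "quality") {
  list(
    target_class = class(ww[[target]]),  # storage type of the target column
    n_examples   = nrow(ww),             # rows = examples
    n_columns    = ncol(ww)              # all columns, including the target
  )
}

# Only runs when the file is actually present in the working directory.
if (file.exists("whitewines.csv")) {
  ww <- read.csv("whitewines.csv")
  str(ww)                      # column names and types
  print(describe_wines(ww))
}
```

Note that ncol() counts every column, including the target, so whether the target is counted as a feature depends on the convention used in the course.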
c) First, randomize (i.e. shuffle) the examples in the data set. To do this, set the seed to 123 just before you call runif()
to get the same results as the professor. Let's assume you imported the "whitewines.csv" data as an object called
"ww". Once you have reshuffled, take a look at the first few rows of the data frame. What is the value for "fixed.acidity" on
the very first observation?
A
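One common idiom for the shuffle in part (c) is to draw one uniform random number per row and sort the rows by those draws. The sketch below assumes that idiom; the professor's exact call is not shown in the question, so a different expression could produce a different row order. shuffle_rows is a hypothetical helper name.

```r
# Hypothetical helper: shuffle the rows of a data frame reproducibly.
shuffle_rows <- function(df, seed = 123) {
  set.seed(seed)                  # seed is set immediately before runif()
  df[order(runif(nrow(df))), ]    # sort rows by one random draw per row
}

# Only runs when the file is actually present in the working directory.
if (file.exists("whitewines.csv")) {
  ww <- read.csv("whitewines.csv")
  ww <- shuffle_rows(ww)
  print(head(ww))                 # first few rows after shuffling
  print(ww$fixed.acidity[1])      # value asked for in part (c)
}
```

Setting the seed inside the helper keeps the seed and the runif() call adjacent, which is what the question asks for.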
d) Let's split the data into a training set and a test set. Take the first
4400 examples for training and the rest for testing. Create two objects here:
A training data set that includes the target feature; name it whatever makes
sense to you.
A test data set that includes the target feature; name it whatever makes sense
to you.
What is the number of observations and variables in your test data set? (Use a comma between the two answers, no
space)
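The split in part (d) is a plain row-index slice of the shuffled data frame. A minimal sketch follows; split_first_n, ww_train, and ww_test are hypothetical names chosen for illustration, and the shuffle line repeats the idiom assumed for part (c).

```r
# Hypothetical helper: first n_train rows for training, the remainder for testing.
split_first_n <- function(df, n_train) {
  stopifnot(n_train < nrow(df))           # there must be rows left for testing
  list(
    train = df[seq_len(n_train), ],
    test  = df[(n_train + 1):nrow(df), ]
  )
}

# Only runs when the file is actually present in the working directory.
if (file.exists("whitewines.csv")) {
  ww <- read.csv("whitewines.csv")
  set.seed(123)
  ww <- ww[order(runif(nrow(ww))), ]      # shuffle as in part (c)
  parts    <- split_first_n(ww, 4400)
  ww_train <- parts$train
  ww_test  <- parts$test
  print(dim(ww_test))                     # observations, then variables
}
```

Both pieces keep every column, so the target feature stays in the training and the test data.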
e) Let's create a regression tree model. Load "rpart" and
"rpart.plot". Use the library() function to load the packages. How many leaf nodes do we have in this tree?
A
f) Which feature was the most predictive feature of the target?
A
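For parts (e) and (f), a tree can be fitted with rpart() and then inspected. The sketch below assumes the target column is named quality and that all other columns are used as predictors with rpart's default settings; neither is stated in the question. count_leaves and m_rpart are hypothetical names. The plotting step is guarded so the sketch still runs where rpart.plot is not installed.

```r
library(rpart)

# Hypothetical helper: count terminal (leaf) nodes of a fitted rpart tree.
count_leaves <- function(fit) sum(fit$frame$var == "<leaf>")

# Only runs when the file is actually present in the working directory.
if (file.exists("whitewines.csv")) {
  ww <- read.csv("whitewines.csv")
  set.seed(123)
  ww <- ww[order(runif(nrow(ww))), ]              # shuffle as in part (c)
  ww_train <- ww[1:4400, ]

  # Assumption: target is "quality"; a numeric target gives a regression tree.
  m_rpart <- rpart(quality ~ ., data = ww_train)
  print(m_rpart)                                  # leaves are marked with *
  print(count_leaves(m_rpart))                    # part (e)
  print(as.character(m_rpart$frame$var[1]))       # variable used at the root split
  print(m_rpart$variable.importance)              # larger value = more important

  if (requireNamespace("rpart.plot", quietly = TRUE)) {
    library(rpart.plot)
    rpart.plot(m_rpart, digits = 3)               # draw the tree
  }
}
```

The root split and the top entry of variable.importance are two common ways to read off the most predictive feature; which one the course expects is not specified here.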
g) Now let's make some predictions and find out how accurate those
predictions are by calculating the Mean Absolute Error (MAE). Create an object called p1 and assign all the predictions to it.
How many predictions do you see when you execute p1 in your R console?
A
h) What is the MAE?
A
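The MAE in parts (g) and (h) is the average of the absolute differences between the actual and the predicted ratings. A minimal sketch, again assuming a target column named quality and a default rpart() fit; MAE, ww_train, ww_test, and m_rpart are hypothetical names, while p1 is the name the question asks for.

```r
library(rpart)

# Mean absolute error between actual and predicted values.
MAE <- function(actual, predicted) mean(abs(actual - predicted))

# Only runs when the file is actually present in the working directory.
if (file.exists("whitewines.csv")) {
  ww <- read.csv("whitewines.csv")
  set.seed(123)
  ww <- ww[order(runif(nrow(ww))), ]              # shuffle as in part (c)
  ww_train <- ww[1:4400, ]
  ww_test  <- ww[4401:nrow(ww), ]

  m_rpart <- rpart(quality ~ ., data = ww_train)  # assumption: target is "quality"
  p1 <- predict(m_rpart, ww_test)                 # one prediction per test row
  print(length(p1))                               # part (g)
  print(round(MAE(ww_test$quality, p1), 2))       # part (h), 2 decimal places
}
```

As a quick check of the helper, MAE(c(1, 2, 3), c(2, 2, 5)) averages the absolute differences 1, 0, and 2 and returns 1.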