

Question: Open notes, open WWW, work on your own. You should use R and load the following packages: rpart and rpart.plot. (Place final answers in provided cells adjacent to questions.) (Round answers to this question to 2 decimal places.)

The data file was recorded in comma-separated-variable form in the file "whitewines". Let's import the "whitewines.csv" data into R.

You might remember the lab you did on wine. We are going to do a similar activity to that lab with a new wine data set.

Remember that the winemaking industry, over the years, has invested heavily in data collection and machine learning methods that may assist in creating high quality wine. A review written by a critic often determines whether a bottle ends up on the top or bottom shelf.

We want to create a systematic way of mimicking 'expert' wine ratings. This could help winemakers identify key factors that contribute to better-rated wines. This system will not suffer from the subjectivity that is inherent in human tastings, such as mood and/or palate fatigue. A machine learning algorithm may result in better quality wine as well as more objective, consistent, and fair ratings.

The white wine data includes values of 11 chemical properties of a large sample of white wines. For each wine, a laboratory analysis measured characteristics such as the acidity, sugar content, chloride, sulfur, alcohol, pH, density, and more. The samples were then rated in a blind tasting by panels of no fewer than three judges on a quality scale ranging from 0 (very bad) to 10 (excellent). In the case that judges disagreed on the rating, the median value was used.

a) What type of feature is your target?

b) How many examples and features do we have in the data set? (Use a comma between the two answers, no space.)
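The import and the first look at the data could be sketched roughly as below. This is only an illustration under assumptions: the tiny CSV written to a temporary file is a made-up stand-in so the snippet runs anywhere, and its three columns are not the real file's layout. In the exercise you would pass the path of "whitewines.csv" to read.csv() instead.

```r
# Stand-in file (hypothetical values): a tiny CSV written to a temporary path.
# In the exercise, read "whitewines.csv" instead of csv_path.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "fixed.acidity,alcohol,quality",
  "7.0,8.8,6",
  "6.3,9.5,6",
  "8.1,10.1,5"
), csv_path)

ww <- read.csv(csv_path)

str(ww)   # storage type of each column, including the target `quality`
dim(ww)   # number of rows (examples), then number of columns
```

Note that dim() counts every column, so whether the target counts as a "feature" depends on how the question is meant.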

c) First, randomize (i.e., shuffle) the examples in the data set. To do this, set the seed to 123 just before you call runif() to get the same results as the professor. Let's assume you imported the "whitewines.csv" data as an object/vector called "ww". Once you have reshuffled, take a look at the first few lines of the data frame. What is the value for "fixed.acidity" on the very first observation?
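One common way to do the shuffle described above is to order the rows by a vector of uniform random numbers. The sketch below assumes that approach; the five-row data frame is a hypothetical stand-in for the imported data, and ww_shuffled is a name chosen here for illustration.

```r
# Hypothetical stand-in for the imported data frame `ww`.
ww <- data.frame(
  fixed.acidity = c(7.0, 6.3, 8.1, 7.2, 6.2),
  quality       = c(6, 6, 5, 7, 6)
)

set.seed(123)                                  # fix the random stream right before runif()
ww_shuffled <- ww[order(runif(nrow(ww))), ]    # reorder rows by random keys

head(ww_shuffled)                # first few rows after shuffling
ww_shuffled$fixed.acidity[1]     # value on the very first observation
```

Calling set.seed() immediately before runif() matters: any other random draw in between would change the resulting order.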

d) Let's split the data into a Training set and a Test set. Take the first 4400 examples for Training and the rest for Testing. Create 2 assigned objects here:

Training data set with the target feature; you can name it whatever makes sense to you.

Test data set with the target feature; you can name it whatever makes sense to you.

What is the number of observations and variables for your testing data set? (Use a comma between the two answers, no space.)
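Because the rows are already shuffled, the split can be done by row position. The following is a minimal sketch with a hypothetical 10-row stand-in and a cut at row 8; in the exercise the cut would be at row 4400. The names wine_train and wine_test are illustrative choices.

```r
# Hypothetical stand-in for the shuffled data (10 rows instead of the full file).
ww_shuffled <- data.frame(
  alcohol = seq(9, 13.5, by = 0.5),
  quality = c(5, 6, 5, 6, 7, 6, 7, 6, 7, 8)
)
n_train <- 8   # would be 4400 in the exercise

wine_train <- ww_shuffled[1:n_train, ]                          # first n_train rows
wine_test  <- ww_shuffled[(n_train + 1):nrow(ww_shuffled), ]    # the remaining rows

dim(wine_test)   # observations, then variables, in the test set
```

Both objects keep the target column, as the question asks.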

e) Let's create a regression tree model. Load "rpart" and "rpart.plot". Use the library() function to load the packages. How many leaf nodes do we have in this tree?

f) Which feature was the most predictive feature of the target?
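Fitting the tree, counting its leaves, and ranking the features might look like the sketch below. The training data here is simulated (a hypothetical stand-in in which quality depends mainly on alcohol), so its output says nothing about the real wine data. The plot is optional and only runs if rpart.plot is installed.

```r
library(rpart)

# Hypothetical simulated training set: quality is driven mostly by alcohol.
set.seed(1)
n <- 200
wine_train <- data.frame(alcohol = runif(n, 8, 14), pH = runif(n, 2.8, 3.8))
wine_train$quality <- ifelse(wine_train$alcohol > 11, 7, 5) + rnorm(n, sd = 0.3)

# A numeric target makes rpart() fit a regression tree (method "anova").
m_rpart <- rpart(quality ~ ., data = wine_train)

m_rpart                                           # text view; terminal nodes end with *
n_leaves <- sum(m_rpart$frame$var == "<leaf>")    # count the terminal (leaf) nodes
n_leaves
m_rpart$variable.importance                       # sorted, largest = most predictive

# Optional diagram of the tree.
if (requireNamespace("rpart.plot", quietly = TRUE)) {
  rpart.plot::rpart.plot(m_rpart, digits = 3)
}
```

The feature used for the root split is usually also the top entry in variable.importance, but the two can differ, so it is worth checking both.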

g) Now let's make some predictions and find out how accurate those predictions are by calculating the Mean Absolute Error (MAE). Create an object called p1 and assign all the predictions to it. How many predictions do you see when you execute p1 in your R console?

h) What is the MAE?
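The prediction and error steps could be sketched as follows. The MAE is the mean of the absolute differences between actual and predicted values, MAE = mean(|actual - predicted|). The data is again a hypothetical simulated stand-in, and MAE() is a small helper defined here, not a function from rpart.

```r
library(rpart)

# Hypothetical simulated data, split by position into training and test rows.
set.seed(1)
n <- 300
wine <- data.frame(alcohol = runif(n, 8, 14))
wine$quality <- ifelse(wine$alcohol > 11, 7, 5) + rnorm(n, sd = 0.3)
wine_train <- wine[1:200, ]
wine_test  <- wine[201:n, ]

m_rpart <- rpart(quality ~ ., data = wine_train)

p1 <- predict(m_rpart, wine_test)   # one prediction per test row
length(p1)                          # number of predictions

# Helper: mean absolute error between actual and predicted values.
MAE <- function(actual, predicted) mean(abs(actual - predicted))

round(MAE(wine_test$quality, p1), 2)   # rounded to 2 decimal places
```

The number of predictions should match the number of rows in the test set, since predict() returns one value per row.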
