Question: Use the train.csv file to complete this case study. The file contains home sales data for Ames, Iowa in 2009. You will be using this dataset as you learn multiple regression (and later, regularization) to predict home sales prices from home features (predictors) in a competition against other MGSC 291 students. To find the best model for prediction, you need to explore and understand the data available to you. The purpose of this case is to begin this exploration in preparation for using the data later for predictions.
Data set file: https://drive.google.com/file/d/1NjWzEM-vURReFKqQ22cS5SSYailgaR00/view?usp=sharing
# Q0 # Read in the train.csv file and name your object train.
###### Don't forget to use the stringsAsFactors = TRUE argument in read.csv().
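A minimal sketch of Q0 follows. It writes a tiny made-up CSV to a temporary file first, so it runs without the real train.csv. The column names SalePrice and KitchenQual and all values are assumptions for illustration; check names(train) on the real file.

```r
# Stand-in for train.csv: a tiny made-up file so the sketch runs anywhere.
# Column names and values are assumptions, not the real data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("SalePrice,KitchenQual",
             "181000,Gd",
             "360000,Ex",
             "150000,TA"), csv_path)

# stringsAsFactors = TRUE converts text columns to factors on import.
# With the real file this would be:
#   train <- read.csv("train.csv", stringsAsFactors = TRUE)
train <- read.csv(csv_path, stringsAsFactors = TRUE)

str(train)  # lists each column with its type
```

Text columns come back as factors, while numeric columns stay numeric. That distinction matters for the later question about the month column.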
# Q1 # Use the nrow() function to find
###### how many homes are available in this dataset.
# Q2 # Use the ncol() function to find
###### how many variables were recorded for each home.
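Q1 and Q2 can be sketched on a small made-up data frame standing in for train. The column names and values below are illustrative assumptions, so the counts are not the answers for the real file.

```r
# Made-up stand-in for train; names and values are illustrative only.
train <- data.frame(
  SalePrice   = c(181000, 360000, 150000, 181000),
  MoSold      = c(2, 5, 9, 12),
  KitchenQual = factor(c("Gd", "Ex", "TA", "Ex"))
)

n_homes <- nrow(train)  # one row per home
n_vars  <- ncol(train)  # one column per recorded variable

n_homes
n_vars
dim(train)  # both counts at once: rows first, then columns
```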
# Q3 # Use the mean() function to find the average sales price
###### for homes in this data. Hint: Use names(train)
###### to get the exact spelling of the columns in this dataset.
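One way to approach Q3 is to list the column names and then average the price column. The name SalePrice is an assumption here, and the values are made up.

```r
# Made-up stand-in for train; the column name SalePrice is an assumption.
train <- data.frame(
  SalePrice = c(181000, 360000, 150000, 181000),
  MoSold    = c(2, 5, 9, 12)
)

names(train)  # confirm the exact spelling before using $

# Average sales price across all homes.
# If the column had missing values, mean(..., na.rm = TRUE) would skip them.
avg_price <- mean(train$SalePrice)
avg_price
```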
# Q4 # Later, you will model sales price by features
###### and if you model the sales price by month sold,
###### month should be a factor (a categorical variable).
###### Use the is.factor() function on the month sold column
###### to determine if month is a factor.
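The Q4 check can be sketched as below. This assumes the month column is named MoSold and stores months as numbers. If so, read.csv() keeps it as an integer column even with stringsAsFactors = TRUE, because that argument only affects text columns.

```r
# Made-up stand-in for train; MoSold as a numeric month is an assumption.
train <- data.frame(
  SalePrice = c(181000, 360000, 150000, 181000),
  MoSold    = c(2L, 5L, 9L, 12L)
)

# A numeric month column is not a factor.
was_factor <- is.factor(train$MoSold)
was_factor

# Convert it so a regression treats each month as its own category.
train$MoSold <- factor(train$MoSold)
is_factor_now <- is.factor(train$MoSold)
is_factor_now
levels(train$MoSold)
```

Left numeric, month would enter a regression as a single slope. As a factor, each month gets its own effect.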
# Q5 # Use the sum() function on a logical vector
###### to count how many homes have a sales price of $181,000.
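The counting idiom in Q5 relies on R treating TRUE as 1 and FALSE as 0 when summing. Here is a small sketch on made-up prices, where the column name SalePrice is an assumption.

```r
# Made-up stand-in for train; values are illustrative only.
train <- data.frame(SalePrice = c(181000, 360000, 150000, 181000))

# Comparing a column to a value gives one TRUE/FALSE per home.
at_181k <- train$SalePrice == 181000
at_181k

# sum() counts the TRUE entries.
n_at_181k <- sum(at_181k)
n_at_181k
```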
# Q6 # There are 4 rating categories for kitchen quality:
###### fair, average, good, and excellent.
###### Use the sum() function on a logical vector to count
###### how many homes have kitchen quality rated as "Excellent".
###### Hint: use the summary() function on the kitchen quality column
###### to see how this quality category is spelled.
# Q7 # Use the sum() function on a logical vector to count
###### how many homes sold for more than $350,000
###### and have "Excellent" as their kitchen quality.
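Q6 and Q7 might be approached as in the sketch below. It assumes the columns are named KitchenQual and SalePrice and that "Excellent" is coded as "Ex". Those are assumptions to verify with summary() on the real column, and the data is made up.

```r
# Made-up stand-in for train; the names and the "Ex" code are assumptions.
train <- data.frame(
  SalePrice   = c(181000, 360000, 150000, 400000, 190000),
  KitchenQual = factor(c("Gd", "Ex", "TA", "Ex", "Ex"))
)

summary(train$KitchenQual)  # count per level; shows how each level is spelled

# Q6: homes with an excellent kitchen.
n_ex <- sum(train$KitchenQual == "Ex")

# Q7: & keeps only homes where both conditions hold.
n_ex_over_350k <- sum(train$SalePrice > 350000 & train$KitchenQual == "Ex")

n_ex
n_ex_over_350k
```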
# Q8 # Use the mean() function to find the average sales price
###### for a home with kitchen quality rating of "Excellent".
# Q9 # Use the mean() function to find the average sales price
###### for a home with kitchen quality rating of "Excellent"
###### or with a garage for more than 2 cars.
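Q8 and Q9 combine logical indexing with mean(). The sketch below uses made-up data and assumes the column names KitchenQual and GarageCars and the "Ex" code for "Excellent".

```r
# Made-up stand-in for train; names and the "Ex" code are assumptions.
train <- data.frame(
  SalePrice   = c(181000, 360000, 150000, 400000, 190000),
  KitchenQual = factor(c("Gd", "Ex", "TA", "Ex", "Ex")),
  GarageCars  = c(3, 2, 1, 3, 1)
)

is_ex      <- train$KitchenQual == "Ex"
big_garage <- train$GarageCars > 2  # strictly more than 2 cars

# Q8: average price over homes with an excellent kitchen.
avg_ex <- mean(train$SalePrice[is_ex])

# Q9: | keeps homes meeting either condition (or both).
avg_ex_or_garage <- mean(train$SalePrice[is_ex | big_garage])

avg_ex
avg_ex_or_garage
```

The | operator is an inclusive "or", so a home that satisfies both conditions is counted once, not twice.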
# Q10 # Use the which() function on a logical vector to find
###### the observation number(s) for any home with a
###### kitchen quality rating of "Excellent"
###### and a sales price of at most $195,000.
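For Q10, which() returns the positions of the TRUE entries in a logical vector, and those positions are the row numbers of the matching homes. The sketch below uses made-up data, with the column names and the "Ex" code again being assumptions.

```r
# Made-up stand-in for train; names and the "Ex" code are assumptions.
train <- data.frame(
  SalePrice   = c(181000, 360000, 150000, 400000, 190000),
  KitchenQual = factor(c("Gd", "Ex", "TA", "Ex", "Ex"))
)

# "At most $195,000" means <=, so a price of exactly 195000 also qualifies.
matches <- which(train$KitchenQual == "Ex" & train$SalePrice <= 195000)
matches

train[matches, ]  # inspect the matching rows
```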
