Question: Questions 11-20 are based on the files HW7_training.csv and HW7_scoring.csv. Your goal is to predict the demand for product consumption. You will use linear regression to predict annual heating oil consumption and help a company secure a certain amount of oil based on your predictions. You will build a model on the training dataset and apply the results of your model to the scoring dataset.
Question
Upload the Training and Scoring datasets to RapidMiner. Note: Insulation (InsulationRating) is measured on a scale where ten is the highest; OutdoorTemp is the outdoor temperature measured in °F; NumOccupants is the number of total residents per home; HomeAge is the age in years; HomeSize is the size in sq ft; HeatingOilUsed is the number of units of oil purchased in a recent month.
Run the process, and explore the results. In the Statistics view, check the ranges for each attribute in both training and scoring datasets. Note that the ranges for HomeSize are different in the training and scoring datasets. Make the ranges identical by applying a filter in the Scoring dataset. Use HomeSize and HomeSize
Run the results. How many records are now in the Scoring dataset?
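The range-matching step can also be sketched outside RapidMiner. The following is a minimal Python illustration, assuming each record is a dictionary. The helper name `filter_to_training_range` and all HomeSize values are hypothetical, and the bounds are derived from the training rows rather than taken from the assignment.

```python
def filter_to_training_range(training, scoring, attribute="HomeSize"):
    """Keep only scoring rows whose attribute lies within the training range."""
    low = min(row[attribute] for row in training)
    high = max(row[attribute] for row in training)
    return [row for row in scoring if low <= row[attribute] <= high]


# Hypothetical values for illustration; not taken from the homework files.
training_rows = [{"HomeSize": 1000}, {"HomeSize": 2500}, {"HomeSize": 5000}]
scoring_rows = [{"HomeSize": 800}, {"HomeSize": 1500},
                {"HomeSize": 5000}, {"HomeSize": 6200}]

kept = filter_to_training_range(training_rows, scoring_rows)
print(len(kept))  # number of scoring records left after the filter
```

Filtering matters because a linear model is only supported by the data it was fitted on; scoring records outside the training range would force the model to extrapolate.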
Question
In the Training dataset, add operator Set Role, and assign the role of label to attribute HeatingOilUsed. How many regular attributes are now in the dataset?
Question
In the Training dataset, add operator Linear Regression, keep the defaults, and run the process. Explore the Linear Regression Coefficients. Which attribute has the heaviest weight (highest coefficient)?
InsulationRating
OutdoorTemp
HomeAge
HomeSize
NumOccupants
Question
In the Linear Regression Coefficients table, explore the significance of attributes. Which attribute is not significant and has been automatically removed from the model?
InsulationRating
OutdoorTemp
HomeAge
HomeSize
NumOccupants
Question
In the Linear Regression Coefficients table, explore the Intercept. What is the coefficient of the Intercept?
Question
Based on the coefficients in the Linear Regression Coefficients table, create a Regression formula and calculate oil consumption for the house with the following attributes: InsulationRating, OutdoorTemp, HomeAge, HomeSize, NumOccupants. What is the result of your calculation? Round up the number if necessary.
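A regression formula of this kind is the intercept plus the sum of each coefficient multiplied by the matching attribute value. The sketch below shows the arithmetic in Python. The intercept, the coefficients, and the house values are all placeholders chosen for illustration; the real numbers must be read from the Linear Regression Coefficients table and from the question.

```python
import math


def predict(intercept, coefficients, house):
    """Linear regression formula: intercept + sum(coefficient * value)."""
    return intercept + sum(coef * house[name]
                           for name, coef in coefficients.items())


# Placeholder numbers for illustration only; they are NOT the model's
# actual coefficients or the house described in the question.
intercept = 100.0
coefficients = {
    "InsulationRating": 3.0,
    "OutdoorTemp": -1.0,
    "HomeAge": 2.0,
    "HomeSize": 0.125,
    "NumOccupants": 0.5,
}
house = {
    "InsulationRating": 5,
    "OutdoorTemp": 40,
    "HomeAge": 10,
    "HomeSize": 2000,
    "NumOccupants": 3,
}

result = predict(intercept, coefficients, house)
rounded_up = math.ceil(result)  # "round up" read as rounding to the next whole unit
print(result, rounded_up)
```

An attribute that the operator removed from the model contributes no term, so it can simply be left out of the `coefficients` dictionary.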
Question
Based on the Regression formula created in the previous question, calculate oil consumption for the house with the following attributes: InsulationRating, OutdoorTemp, HomeAge, HomeSize, NumOccupants. What is the result of your calculation? Round up the number if necessary.
Question
Apply the Linear Regression model to the Scoring dataset. Run the process and explore the results. What is the average predicted oil consumption in the scoring dataset? Round up the answer if necessary.
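Applying the model means evaluating the regression formula for every scoring record and then averaging the predictions. A minimal Python sketch follows, assuming a two-attribute model with made-up coefficients and made-up scoring rows; the function name `apply_model` is hypothetical and only mirrors the role of the Apply Model operator.

```python
from statistics import mean


def apply_model(intercept, coefficients, rows):
    """Return one prediction per row using the linear regression formula."""
    return [
        intercept + sum(coef * row[name] for name, coef in coefficients.items())
        for row in rows
    ]


# Made-up model and scoring rows, for illustration only.
intercept = 10.0
coefficients = {"OutdoorTemp": -0.5, "HomeAge": 2.0}
scoring_rows = [
    {"OutdoorTemp": 40, "HomeAge": 10},
    {"OutdoorTemp": 60, "HomeAge": 20},
    {"OutdoorTemp": 20, "HomeAge": 15},
]

predictions = apply_model(intercept, coefficients, scoring_rows)
average = mean(predictions)
print(predictions, average)
```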
Question
Add operator Aggregate. Note: connect the lab port of Apply Model to the exa port of Aggregate. In the Aggregate parameters, edit the list of aggregation attributes, and calculate the sum and median of PredictionHeatingOilUsed. Based on your calculations, how many units of oil will be required to satisfy the demand? Round up the answer if necessary.
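The two aggregations can be checked by hand: the sum of the predictions gives the total demand, and the median gives the middle predicted value. The Python sketch below uses a short list of hypothetical predictions, not values from the scoring dataset.

```python
import math
from statistics import median

# Hypothetical predicted values, for illustration only.
predictions = [10.5, 20.25, 30.0, 41.5]

total_demand = sum(predictions)            # total units predicted
median_prediction = median(predictions)    # middle value (mean of the two central values here)
units_to_secure = math.ceil(total_demand)  # round up so the demand is fully covered

print(total_demand, median_prediction, units_to_secure)
```

Rounding the total up rather than to the nearest integer reflects the goal of securing enough oil: a fractional unit of demand still requires a whole unit.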
Question
Based on the aggregate results, what is the median value of PredictionHeatingOilUsed? Round up the answer if necessary.
