Question: ASSESSMENT TASK 2 (PROBLEM SOLVING) in 2023 T3
Using aggregation functions for data analysis
The provided zip file contains the data file ENB.txt and the R code AggWaFit.R to use with the following tasks. Include both in your R working directory.
Total Marks Weighting
Energy Appliances Dataset
The dataset for this assignment is a modified version of a subset of the data used in Candanedo et al.
The experimental data have been used to create models of the energy use of appliances in a low-energy house. The modified dataset provides the energy use of appliances, denoted as Y.
The dataset comprises five feature variables, denoted as X1, X2, X3, X4 and X5.
The details about these variables are given below:
X1: Temperature in living room area (degrees Celsius)
X2: Humidity in living room area (percentage)
X3: Temperature in office room (degrees Celsius)
X4: Humidity in office room (percentage)
X5: Pressure (millimetres of mercury)
Y: Appliances energy consumption (Wh)
For more information about the variables, see Candanedo et al.
Assignment tasks
T1. Understand the data
(i) Download the txt file ENB.txt from CloudDeakin and save it to your R working directory.
(ii) Assign the data to a matrix, e.g. using:
the.data <- as.matrix(read.table("ENB.txt"))
(iii) The variable of interest is Y. To investigate Y, generate a subset of numrow rows of numerical data (use the same setting for the following tasks as well), e.g. using:
mydata <- the.data[sample(1:numsamples, numrow), c(1:numcol)]
This gives you a new dataset with numrow rows and numcol columns. The values of numsamples and numcol have to be determined from the data provided.
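As a minimal sketch of the sampling step, the snippet below reads numsamples and numcol off the loaded matrix and draws a random subset of rows. The matrix here is a synthetic stand-in so the snippet runs on its own, and the value of numrow is a placeholder, not the value set in the brief.

```r
# Synthetic stand-in for the matrix returned by as.matrix(read.table(...)).
# Replace it with the real data; these values are placeholders only.
set.seed(1)
the.data <- matrix(runif(600), nrow = 100, ncol = 6)

numsamples <- nrow(the.data)  # number of rows available in the data
numcol     <- ncol(the.data)  # number of columns available in the data
numrow     <- 30              # placeholder subset size (assumption)

# Draw numrow distinct row indices and keep all columns.
mydata <- the.data[sample(1:numsamples, numrow), c(1:numcol)]
dim(mydata)
```

Calling set.seed() before sample() makes the subset reproducible, which helps when the same rows have to be reused in the later tasks.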
(iv) Use scatter plots and histograms to understand the relationship between each of the variables X1, X2, X3, X4, X5 and your variable of interest Y, i.e. scatter plots of (X1, Y), (X2, Y), ..., (X5, Y) and histograms of X1, X2, X3, X4, X5 and Y.
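One way to produce these plots with base R graphics is sketched below. It assumes the columns of mydata are ordered X1 to X5 followed by Y, which should be checked against the actual file; the data here are synthetic placeholders.

```r
# Synthetic stand-in for the sampled subset; the column order is an assumption.
set.seed(1)
mydata <- matrix(runif(180), nrow = 30, ncol = 6)
colnames(mydata) <- c("X1", "X2", "X3", "X4", "X5", "Y")

# When run as a script, use a null device so no Rplots.pdf is written.
if (!interactive()) pdf(NULL)

# Scatter plots of each X against Y.
par(mfrow = c(2, 3))
for (j in 1:5) {
  plot(mydata[, j], mydata[, "Y"],
       xlab = colnames(mydata)[j], ylab = "Y",
       main = paste(colnames(mydata)[j], "vs Y"))
}

# Histograms of every variable, including Y.
par(mfrow = c(2, 3))
for (j in 1:6) {
  hist(mydata[, j], xlab = colnames(mydata)[j],
       main = paste("Histogram of", colnames(mydata)[j]))
}

# Pearson correlation of each X with Y, to read alongside the scatter plots.
cors <- cor(mydata[, 1:5], mydata[, "Y"])
cors
```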
T2. Transform the data
Choose any FOUR variables from X1, X2, X3, X4, X5.
Make appropriate transformations so that the values can be aggregated in order to predict the variable of interest Y.
Assign your transformed data, along with your transformed variable of interest, to an array (it should have numrow rows and five columns). Save it to a txt file titled "nametransformed.txt":
write.table(yourdata, "nametransformed.txt")
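A possible transformation pipeline is sketched below: min-max scaling to [0, 1], followed by negation of any variable that is negatively correlated with Y so that all inputs point in the same direction. The choice of variables, the helper name minmax and the synthetic data are assumptions for illustration; skewed variables may additionally need a log or power transformation before scaling.

```r
# Synthetic stand-in for the sampled subset; the column order is an assumption.
set.seed(1)
mydata <- matrix(runif(180, 1, 10), nrow = 30, ncol = 6)
colnames(mydata) <- c("X1", "X2", "X3", "X4", "X5", "Y")

# Linear feature scaling to the unit interval (hypothetical helper).
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

chosen <- c("X1", "X2", "X3", "X5")  # hypothetical choice of four variables
y <- minmax(mydata[, "Y"])

# Scale each chosen variable; flip it if it moves against Y.
X <- sapply(chosen, function(v) {
  z <- minmax(mydata[, v])
  if (cor(z, y) < 0) 1 - z else z
})

yourdata <- cbind(X, Y = y)  # numrow rows, five columns

# tempdir() only keeps this sketch free of side effects; in the assignment
# the file goes in the working directory.
out_file <- file.path(tempdir(), "nametransformed.txt")
write.table(yourdata, out_file)
```

Keep the minimum and maximum of each raw variable, since the same scaling has to be reapplied to new inputs at prediction time.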
The following tasks are based on the saved transformed data.
T3. Build models and investigate the importance of each variable.
(i) Download the AggWaFit.R file to your working directory and load it into the R workspace using:
source("AggWaFit.R")
(ii) Use the fitting functions to learn the parameters for:
(a) A weighted arithmetic mean (WAM)
(b) Weighted power means (WPM) with p
(c) Weighted power means (WPM) with p
(d) An ordered weighted averaging function (OWA)
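To make the three model families concrete, the sketch below defines them in base R and fits their weights by least squares. This is not the routine in AggWaFit.R, whose function names and fitting criterion are not given in this brief; the names wam, wpm, owa and fit_weights are hypothetical, p = 2 is a placeholder exponent, and the data are synthetic.

```r
# Aggregation functions for one input vector x (values in [0, 1]) and
# weights w (non-negative, summing to 1).
wam <- function(x, w) sum(w * x)
wpm <- function(x, w, p) sum(w * x^p)^(1 / p)  # valid for p != 0
owa <- function(x, w) sum(w * sort(x))         # inputs sorted ascending here;
                                               # some texts sort descending

# Least-squares weight fit. A softmax parametrisation keeps the weights
# non-negative and summing to 1 without explicit constraints.
fit_weights <- function(X, y, agg) {
  to_w <- function(theta) exp(theta) / sum(exp(theta))
  loss <- function(theta) {
    pred <- apply(X, 1, agg, w = to_w(theta))
    sum((pred - y)^2)
  }
  opt <- optim(rep(0, ncol(X)), loss, method = "BFGS")
  w <- to_w(opt$par)
  list(weights = w, fitted = apply(X, 1, agg, w = w))
}

# Synthetic stand-in for the transformed data (not the assignment data):
# y is an exact weighted mean, so the WAM fit should recover the weights.
set.seed(42)
X <- matrix(runif(200), nrow = 50, ncol = 4)
y <- as.numeric(X %*% c(0.1, 0.2, 0.3, 0.4))

fit_wam <- fit_weights(X, y, wam)
fit_wpm <- fit_weights(X, y, function(x, w) wpm(x, w, p = 2))  # placeholder p
fit_owa <- fit_weights(X, y, owa)

round(fit_wam$weights, 3)
```

For the WAM and WPM, a larger fitted weight indicates a more influential variable. For the OWA, the weights attach to rank positions (smallest to largest input) rather than to particular variables, so they describe whether the model leans towards low or high inputs.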
T4. Use your model for prediction.
Using your best-fitting model from T3 (i.e. the WAM, one of the two WPMs, or the OWA), predict Y (Appliances) for the following inputs:
X X X X X
You should use the same preprocessing as in Task 2.
Compare your prediction with the measured Y
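The prediction step can be sketched as follows: scale the new inputs with the ranges learned from the training subset, aggregate with the fitted weights, then map the result back to the original units of Y. Every number below (ranges, weights, inputs) is a hypothetical placeholder, and a WAM is assumed to be the best-fitting model purely for illustration.

```r
# Forward and inverse min-max scaling using stored training ranges.
scale_with   <- function(x, lo, hi) (x - lo) / (hi - lo)
unscale_with <- function(z, lo, hi) z * (hi - lo) + lo

# Placeholder training ranges for the four chosen variables and for Y.
x_lo <- c(15, 30, 16, 30)
x_hi <- c(25, 50, 26, 60)
y_lo <- 10
y_hi <- 210

w     <- c(0.4, 0.3, 0.2, 0.1)  # placeholder fitted WAM weights
new_x <- c(20, 35, 21, 45)      # placeholder inputs, in original units

# Same preprocessing as for the training data; clip to [0, 1] in case a
# new input falls outside the training range.
z <- pmin(pmax(scale_with(new_x, x_lo, x_hi), 0), 1)

pred_scaled <- sum(w * z)                            # WAM on the scaled inputs
pred_y      <- unscale_with(pred_scaled, y_lo, y_hi) # back to Wh
pred_y
```

If a variable was negated or otherwise transformed in Task 2, the same step has to be applied to the new input before aggregating, and any transformation of Y has to be inverted in reverse order.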
T5. Summarise your data analysis in up to slides for a minute presentation
The slides should include the following content:
Correlations between the variables;
What kinds of data distributions you have identified in the raw data (use the histograms you have produced);
List and explain the transformations applied to the selected four variables and the variable of interest;
Explain the importance of the variables you have selected;
The best-fitting model on your selected data; include two tables: one with the error measures and correlation coefficients, and one summarising the weights/parameters and any other useful information learned from your data;
Your prediction result, with a comment on whether you think it is reasonable;
Discuss the best conditions, in terms of your chosen variables, under which a low energy use of appliances will occur;
Comment on the implications and limitations of the fitting model you used for prediction.
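For the table of error measures and correlation coefficients, a small helper such as the one below may be useful. The brief does not say which measures are expected, so RMSE, average absolute error, and Pearson and Spearman correlation are assumptions; the function name and the example vectors are hypothetical.

```r
# Error measures and correlations between predictions and observations.
error_measures <- function(pred, obs) {
  c(RMSE     = sqrt(mean((pred - obs)^2)),
    AvAbsErr = mean(abs(pred - obs)),
    Pearson  = cor(pred, obs, method = "pearson"),
    Spearman = cor(pred, obs, method = "spearman"))
}

# Placeholder vectors standing in for fitted and observed (scaled) Y.
obs  <- c(0.2, 0.4, 0.6, 0.8)
pred <- c(0.1, 0.5, 0.5, 0.9)

measures <- error_measures(pred, obs)
round(measures, 3)
```

Calling the helper once per model and binding the results with rbind() gives one row per model, which maps directly onto the first table.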
The slides should contain all the information needed to support your findings. All the bold terms above must appear in slide titles. For the minute presentation, you may provide a link to YouTube or upload an mp4 video. Any content beyond minutes will not be graded.
SUBMISSION:
Submit to the SIT CloudDeakin Dropbox.
Your final submission must include the following TWO files:
The presentation slides with video: "nameslides" (pdf), covering all of the items above, where name is replaced with your name (you can use your surname or first name), together with a link to YouTube or an uploaded mp4 file.
The R code file that you have written to produce your results, named "namecode.R", where name is replaced with your surname or first name; an RMD file is not allowed.
Your assignment will not be assessed if the code is missing, or the outputs of the code are
