Question: DATA_514_Unit_3_F24 Task 3:

DATA_514_Unit_3_F24
Task 3:
Based on the information from the required reading assignments from Kuhn & Johnson, as well as based on what you've learned from Task 2, perform the following:
a) Prepare Grant data for Tasks 3b) and 3c) experiments:
To prepare the Grant data for these experiments, you need the file unimelb_training.csv and the script CreateGrantData.R:
save unimelb_training.csv in your RStudio working directory (I have uploaded the file to Blackboard, but you can also find it on GitHub).
install the AppliedPredictiveModeling package and run the scriptLocation() command to locate CreateGrantData.R; then update (see below) and run this script in your RStudio environment. [11]
Since the script is old and the R environment has changed, you need to make the following updates to the script:
add stringsAsFactors = TRUE to the read.csv() command,
add options(expressions = 15000) near the beginning of the script. [12]
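The two script updates can be illustrated on a toy file. This is only a sketch: the temporary CSV and its column names are hypothetical stand-ins for unimelb_training.csv, and the comments assume R 4.0.0 or later, where read.csv() no longer converts character columns to factors by default.

```r
# Raise the limit on nested expression evaluation (R's default is 5000).
old_opts <- options(expressions = 15000)

# Hypothetical toy file standing in for unimelb_training.csv.
toy_path <- tempfile(fileext = ".csv")
writeLines(c("Grant.Status,Sponsor.Code,Contract.Value.Band",
             "1,21A,A",
             "0,4D,B",
             "1,21A,A"), toy_path)

# Under R >= 4.0.0 the default leaves text columns as character vectors.
raw_default <- read.csv(toy_path)

# Updated call: text columns become factors, as older scripts expect.
raw_factor <- read.csv(toy_path, stringsAsFactors = TRUE)

class(raw_default$Sponsor.Code)
class(raw_factor$Sponsor.Code)
levels(raw_factor$Sponsor.Code)
```

Comparing the two class() results shows what the extra argument changes; code written for older R versions often calls levels() or relies on factor behavior for such columns.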
[11] In my experiments with this script, parallel processing did not significantly decrease the running time, so to turn it off, you may change one line at the beginning of the script: from "cores <- 3" to "cores <- 1". However, if you want to run it in parallel, you need to use the doParallel package instead of doMC.
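A minimal sketch of how the core setting and a doParallel backend might be wired together is shown here. It assumes the doParallel package is installed when cores is greater than 1; the variable name cl is hypothetical, and the actual layout of CreateGrantData.R may differ.

```r
# Number of cores to use; 1 turns parallel processing off.
cores <- 1

if (cores > 1) {
  # Only needed for parallel runs; assumes doParallel is installed.
  library(doParallel)
  cl <- makeCluster(cores)   # start worker processes
  registerDoParallel(cl)     # register them as the foreach backend
}

# ... data processing would run here ...

if (cores > 1) {
  stopCluster(cl)            # release the workers when finished
}
```

With cores set to 1, neither block runs, so the sketch works without the doParallel package.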
[12] Extra credit is available for a good explanation of why these options are needed to run the script under R 4.x.x.
b) Using the Grant data, perform LDA experiments: build and test an LDA classification model.
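As a rough illustration of the build-and-test workflow, the sketch below fits an LDA model with lda() from the MASS package. The data are synthetic: the class labels, the predictors x1 and x2, and the 75/25 split are all assumptions made for the example and are not taken from the Grant data or the course materials.

```r
library(MASS)  # provides lda()

set.seed(100)
n <- 200
# Hypothetical two-class data standing in for the Grant predictors.
cls <- factor(rep(c("successful", "unsuccessful"), each = n / 2))
x1 <- rnorm(n, mean = ifelse(cls == "successful", 1, -1))
x2 <- rnorm(n, mean = ifelse(cls == "successful", 0.5, -0.5))
toy <- data.frame(Class = cls, x1 = x1, x2 = x2)

# Hold out 25% of the rows for testing.
set.seed(100)
test_idx <- sample(n, size = n / 4)
train_set <- toy[-test_idx, ]
test_set <- toy[test_idx, ]

# Fit on the training rows, predict classes for the held-out rows.
lda_fit <- lda(Class ~ x1 + x2, data = train_set)
lda_pred <- predict(lda_fit, newdata = test_set)$class

# Confusion table and the three basic metrics, computed by hand.
conf_tab <- table(Predicted = lda_pred, Observed = test_set$Class)
accuracy <- sum(diag(conf_tab)) / sum(conf_tab)
sensitivity <- conf_tab["successful", "successful"] /
  sum(conf_tab[, "successful"])
specificity <- conf_tab["unsuccessful", "unsuccessful"] /
  sum(conf_tab[, "unsuccessful"])

conf_tab
c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity)
```

Here "successful" is treated as the positive class, so sensitivity is the share of observed successful cases that were predicted as successful.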
c) Using the Grant data, perform Partial Least Squares Discriminant Analysis (PLS-DA) experiments: using the function identify a PLS-DA model with the optimal number of PLS components, and then test this classification model.
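One possible way to pick the number of PLS components is sketched below. It assumes caret's train() with method = "pls", which fits a PLS-DA model when the outcome is a factor, and it requires the caret and pls packages. The synthetic data, the candidate grid of 1 to 5 components, and the 5-fold cross-validation are illustrative choices, not requirements from the assignment.

```r
library(caret)  # assumes the caret and pls packages are installed

set.seed(100)
n <- 200
# Hypothetical two-class data: the first three predictors carry signal.
cls <- factor(rep(c("successful", "unsuccessful"), each = n / 2))
shift <- ifelse(cls == "successful", 0.8, -0.8)
x <- matrix(rnorm(n * 6), ncol = 6)
x[, 1:3] <- x[, 1:3] + shift
colnames(x) <- paste0("x", 1:6)
toy <- data.frame(Class = cls, x)

# Stratified 75/25 split into training and test sets.
set.seed(100)
in_train <- createDataPartition(toy$Class, p = 0.75, list = FALSE)
train_set <- toy[in_train, ]
test_set <- toy[-in_train, ]

# Tune the number of components by 5-fold cross-validated accuracy.
set.seed(100)
pls_fit <- train(Class ~ ., data = train_set,
                 method = "pls",
                 tuneGrid = expand.grid(ncomp = 1:5),
                 preProcess = c("center", "scale"),
                 trControl = trainControl(method = "cv", number = 5))

pls_fit$bestTune                                  # selected ncomp
pls_pred <- predict(pls_fit, newdata = test_set)  # test-set classes
mean(pls_pred == test_set$Class)                  # test-set accuracy
```

The resampling results for every candidate value are stored in pls_fit$results, which can be useful when reporting how the selection was made.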
Important instructions for your report:
For Tasks 2 and 3 experiments, provide a description of all steps of your experiments, their results (including the confusion matrix and at least the basic performance metrics: accuracy, sensitivity, and specificity; use the confusionMatrix() function from the caret package), and a discussion of the results. Include, within your narrative for each step, the R code used at this step, as well as printouts of its most important results. The R code included at each step must be complete; that is, when copied from your report and executed, it has to work.
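The required metrics can be read off a confusionMatrix() result as in this small sketch. The observed and predicted labels are made up for illustration, and treating "successful" as the positive class is an assumption.

```r
library(caret)  # provides confusionMatrix()

lv <- c("successful", "unsuccessful")

# Hypothetical observed and predicted labels for eight cases.
obs <- factor(c("successful", "successful", "successful", "unsuccessful",
                "unsuccessful", "unsuccessful", "unsuccessful", "successful"),
              levels = lv)
pred <- factor(c("successful", "successful", "unsuccessful", "unsuccessful",
                 "unsuccessful", "successful", "unsuccessful", "successful"),
               levels = lv)

cm <- confusionMatrix(data = pred, reference = obs, positive = "successful")

cm$table                                      # the confusion matrix itself
cm$overall["Accuracy"]                        # 6 of 8 correct: 0.75
cm$byClass[c("Sensitivity", "Specificity")]   # 3 of 4 in each class: 0.75
```

Printing cm directly gives the full summary, which is usually the most convenient printout to paste into a report.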
Note: Set seed to 100 before any command that uses the random number generator (RNG), so your results are the same as expected.
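The effect of resetting the seed can be seen in a short base R sketch; the sample() calls are only placeholders for any command that uses the RNG.

```r
# Resetting the seed before each RNG call reproduces the same draw.
set.seed(100)
first_draw <- sample(1:1000, 5)

set.seed(100)
second_draw <- sample(1:1000, 5)

identical(first_draw, second_draw)  # TRUE

# Without resetting, the generator continues from its current state.
third_draw <- sample(1:1000, 5)
```

This is why the seed should be set immediately before each data split or model-tuning call rather than once at the top of the script.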
Page limit: max 12 pages (plus references).
Very important: Make sure that you follow all report requirements as specified in Syllabus.