
Bagging Algorithms

The base bagging-type machine learning algorithms that will be examined in this assignment are:

Bagged CART,
Random Forest

Stacking Algorithms

The base stacking-type machine learning algorithms that will be examined in this assignment are:

Classification and Regression Trees (CART),
K-Nearest Neighbors (KNN),
Naive Bayes (NB)

Main Question: How will you know how good your ensemble classifier is? Under which conditions is ensemble learning useful?

1st Task: Data Set Selection and Visualisation

You need to select a data set of your own choice (i.e. you may use a data set already used in the lab, or one from the literature review) for the purposes of building, training and validating the above types of classifiers (Bagging, Stacking). With the aid of an R package, visualise and justify the properties of the selected data set.
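A first look at a candidate data set could be sketched as follows. This is only an illustration: the built-in `iris` data set is an assumed stand-in for whatever data set is actually selected, and base R graphics are used in place of any particular visualisation package.

```r
# Assumption: iris stands in for the data set chosen for the assignment.
data(iris)

# Structure and per-column summary statistics
str(iris)
summary(iris)

# Class balance: an imbalanced target would affect how accuracy is interpreted
class_counts <- table(iris$Species)
print(class_counts)

# Pairwise scatter plots of the numeric predictors, coloured by class
pairs(iris[, 1:4], col = as.integer(iris$Species),
      main = "Predictors coloured by class")

# Distribution of one predictor per class
boxplot(Petal.Length ~ Species, data = iris,
        main = "Petal length by species")
```

Plots like these help justify the choice: they show whether the classes are balanced and whether the predictors separate the classes at all.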

2nd Task: Formation of Training and Test Sets

Assuming we have collected one large data set of already-classified instances, you need to look into methods of forming training and test sets from this single data set in R, as described below.

Repeated k-fold Cross-Validation

The process of splitting the data into k folds can be repeated a number of times; this is called Repeated k-fold Cross-Validation (repeatedcv). The final model accuracy is taken as the mean over all repeats.
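The name `repeatedcv` matches the resampling method of the `caret` package, so a minimal sketch under that assumption might look like this. The data set (`iris`), the base learner (`rpart`), and the values of `number` and `repeats` are illustrative choices, not part of the assignment text.

```r
library(caret)  # assumed package; also needs rpart installed

set.seed(42)

# 10 folds, repeated 3 times: 30 resamples in total
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

# iris and a single CART are stand-ins for the real data and model
fit <- train(Species ~ ., data = iris, method = "rpart", trControl = control)

# One accuracy value per resample for the selected tuning setting
n_resamples <- nrow(fit$resample)
print(n_resamples)

# The reported accuracy is the mean over all folds and repeats
mean_accuracy <- mean(fit$resample$Accuracy)
print(mean_accuracy)
```

The same `control` object can be passed to every model trained later, so that all classifiers are evaluated with the same resampling scheme.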

3rd Task: Build, Train and Test a Bagging-Type Classifier

You need to construct, train and test a Bagging type classifier in R, based on Bagged CART and Random Forest base classifiers. Train and test the Bagging classifier using the training and test sets generated based on the method tried as part of the 2^nd Task.
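One possible way to set this up, assuming the `caret` package is used, is sketched below. The method names `"treebag"` and `"rf"` are caret's identifiers for Bagged CART and Random Forest; they rely on the `ipred`, `plyr`, `e1071` and `randomForest` packages being installed. The `iris` data set is again only a stand-in.

```r
library(caret)  # assumed package

# Repeated 10-fold cross-validation, as in the 2nd Task
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

# Bagged CART; the seed is reset so both models see the same folds
set.seed(42)
fit_treebag <- train(Species ~ ., data = iris, method = "treebag",
                     metric = "Accuracy", trControl = control)

# Random Forest
set.seed(42)
fit_rf <- train(Species ~ ., data = iris, method = "rf",
                metric = "Accuracy", trControl = control)

# Collect the resampled accuracies of both models side by side
results <- resamples(list(treebag = fit_treebag, rf = fit_rf))
summary(results)
```

Comparing the two through `resamples()` shows the spread of accuracy across folds as well as the mean, which is more informative than a single number.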

4th Task: Build, Train and Test a Stacking-Type Classifier

You need to construct, train and test a Stacking type classifier in R, based on (CART, KNN, NB). Train and test your Stacking classifier using the training and test sets generated based on the method tried as part of the 2^nd Task.
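A stacked ensemble of CART, KNN and NB could be sketched with the `caretEnsemble` package, which is an assumption here and not named in the assignment. The sketch uses a two-class subset of `iris` as a stand-in, a logistic regression (`"glm"`) as the assumed meta-learner, and requires `rpart` and `klaR` for the base models. Details of the `caretEnsemble` interface vary between versions, so treat this as a starting point only.

```r
library(caret)          # assumed package
library(caretEnsemble)  # assumed package for stacking

set.seed(42)

# Stand-in binary problem: two of the three iris species
df <- droplevels(subset(iris, Species != "setosa"))

# Out-of-fold predictions and class probabilities must be kept for stacking
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3,
                        savePredictions = "final", classProbs = TRUE)

# Base learners: CART (rpart), KNN (knn), Naive Bayes (nb)
base_models <- caretList(Species ~ ., data = df, trControl = control,
                         methodList = c("rpart", "knn", "nb"))

# Meta-learner trained on the base models' out-of-fold predictions
stack_control <- trainControl(method = "repeatedcv", number = 10, repeats = 3,
                              savePredictions = "final", classProbs = TRUE)
stack_fit <- caretStack(base_models, method = "glm", metric = "Accuracy",
                        trControl = stack_control)
print(stack_fit)
```

Stacking tends to help most when the base models make different kinds of errors, so it is worth checking how correlated their predictions are before combining them.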

5th Task: Measure Performance

For each type of ensemble classifier, calculate and display the following performance-related metrics in R. Critically comment on the importance of each metric for each type of ensemble classifier. Use the ROCR library, loaded with library(ROCR).

Confusion matrix
Precision vs. Recall
Accuracy
ROC (receiver operating characteristic) curve
AUC (area under the ROC curve)
Training time
Testing time

Based on the above metrics, briefly discuss how the reliability and consistency of the data classification task at hand can be increased.
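The metrics listed above can be computed along the following lines. This is a sketch under several assumptions: a single `rpart` tree on a two-class subset of `iris` stands in for the fitted ensemble to keep dependencies small, the 70/30 hold-out split and the 0.5 threshold are arbitrary choices, and any model that returns class probabilities could be substituted.

```r
library(rpart)  # stand-in classifier; replace with the fitted ensemble
library(ROCR)

set.seed(42)

# Stand-in binary problem and a simple hold-out split
df <- droplevels(subset(iris, Species != "setosa"))
idx <- sample(nrow(df), size = 0.7 * nrow(df))
train_set <- df[idx, ]
test_set <- df[-idx, ]

# Training time (elapsed seconds)
train_time <- system.time(
  fit <- rpart(Species ~ ., data = train_set, method = "class")
)["elapsed"]

# Testing time (elapsed seconds); keep the probability of the positive class
test_time <- system.time(
  prob <- predict(fit, newdata = test_set, type = "prob")[, "virginica"]
)["elapsed"]

# Confusion matrix and accuracy at a 0.5 threshold
pred_class <- factor(ifelse(prob > 0.5, "virginica", "versicolor"),
                     levels = levels(df$Species))
cm <- table(Predicted = pred_class, Actual = test_set$Species)
accuracy <- sum(diag(cm)) / sum(cm)
print(cm)
print(accuracy)

# ROCR: ROC curve, precision vs. recall, and AUC
rocr_pred <- prediction(prob, test_set$Species)
roc <- performance(rocr_pred, "tpr", "fpr")
pr <- performance(rocr_pred, "prec", "rec")
auc <- performance(rocr_pred, "auc")@y.values[[1]]

plot(roc, main = "ROC curve")
plot(pr, main = "Precision vs. recall")
print(auc)
print(c(train_time, test_time))
```

Timings on a data set this small are dominated by noise, so for a fair comparison between ensembles the timing should be repeated several times and averaged.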
