Question: Code in Python. Do not use any built-in library; implement everything from scratch. Note: you are not allowed to use libraries which can take data, fit the model, predict the labels and give final evaluation metrics.

Use the MNIST dataset for this question (you can get it from Google) and select two digits, 0 and 1. Label them as -1 and 1. In this exercise you will be implementing AdaBoost.M1. Perform the following tasks.
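
As a starting point, the class selection and relabeling could look like the sketch below. It assumes the images are already loaded as a list of flat pixel lists with a parallel list of integer digit labels; the function name `select_two_digits` and its parameters are hypothetical, not part of the assignment.

```python
def select_two_digits(images, labels, neg_digit=0, pos_digit=1):
    """Keep only two digit classes and relabel them as -1 and +1.

    Assumption: `images` is a list of flat pixel lists and `labels` is a
    parallel list of integer digits (how MNIST is loaded is left open).
    """
    X, y = [], []
    for img, lab in zip(images, labels):
        if lab == neg_digit:
            X.append(img)
            y.append(-1)
        elif lab == pos_digit:
            X.append(img)
            y.append(1)
        # every other digit is dropped
    return X, y
```

The same function would be applied to both the train and the test portion of the dataset so that the labels stay consistent.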

Divide the train set into train and val sets. Keep 1000 samples from each class for val. Note that val should be used to evaluate the performance of the classifier; it must not be used in obtaining the PCA matrix.
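
One way to hold out a fixed number of samples per class is sketched here. The random selection and the `seed` parameter are assumptions (the task does not say which 1000 samples to keep), and the name `split_train_val` is hypothetical.

```python
import random


def split_train_val(X, y, n_val_per_class=1000, seed=0):
    """Hold out n_val_per_class samples of every class for validation."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)

    val_idx = set()
    for label in sorted(by_class):
        idxs = by_class[label]
        if len(idxs) <= n_val_per_class:
            raise ValueError("class %r has too few samples" % (label,))
        rng.shuffle(idxs)  # assumption: the held-out samples are picked at random
        val_idx.update(idxs[:n_val_per_class])

    X_tr = [X[i] for i in range(len(y)) if i not in val_idx]
    y_tr = [y[i] for i in range(len(y)) if i not in val_idx]
    X_val = [X[i] for i in range(len(y)) if i in val_idx]
    y_val = [y[i] for i in range(len(y)) if i in val_idx]
    return (X_tr, y_tr), (X_val, y_val)
```

Only the returned train portion would then be passed on to the PCA step, which keeps the val set out of the PCA matrix as required.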

Apply PCA and reduce the dimension to p = 5. You can use the train set of the two classes to obtain the PCA matrix. For the remaining parts, use the reduced-dimension dataset.
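
The task does not say how the PCA matrix should be computed without a library. The sketch below uses power iteration with deflation on the covariance matrix, which is one possible choice and an assumption here; the names `pca_fit` and `pca_transform` are hypothetical. It also assumes the data has at least p directions of nonzero variance. Pure-Python loops like these illustrate the idea but would be very slow on 784-dimensional images.

```python
def pca_fit(X, p, n_iter=200):
    """Return (mean, components) with the top-p principal directions.

    Sketch only: power iteration with deflation, a swapped-in technique
    chosen because no eigen-solver is specified in the task.
    """
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - mean[j] for j in range(d)] for row in X]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
           for a in range(d)]

    components = []
    for _ in range(p):
        # Deterministic, normalized starting vector.
        v = [1.0 / (k + 1) for k in range(d)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm < 1e-12:
                break  # no variance left in the deflated matrix
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                  for a in range(d))
        components.append(v)
        # Deflation: remove the found direction from the covariance matrix.
        for a in range(d):
            for b in range(d):
                cov[a][b] -= lam * v[a] * v[b]
    return mean, components


def pca_transform(X, mean, components):
    """Project rows of X onto the stored components after centering."""
    return [[sum((row[j] - mean[j]) * c[j] for j in range(len(mean)))
             for c in components] for row in X]
```

The mean and components would be fitted on the train set only and then reused unchanged to transform the val and test sets.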

Now learn a decision tree using the train set. You need to grow a decision stump. For each dimension, find the unique values and sort them in ascending order. The splits to be evaluated will be the midpoints of two consecutive unique values. Find the best split by minimizing the weighted misclassification error. Denote this as h_1(x). Note that as we are dealing with real numbers, each value may be unique, so just sorting them and taking the midpoints of consecutive values may also result in a similar tree.
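
The split search described above can be sketched as follows. The running-error sweep and the choice to try both leaf labelings (the `polarity`) at every threshold are assumptions about how the stump assigns labels to its two leaves; `fit_stump` and `stump_predict` are hypothetical names.

```python
def fit_stump(X, y, w):
    """Find the stump with the lowest weighted misclassification error.

    X: list of feature rows, y: labels in {-1, +1}, w: non-negative weights.
    Returns (error, dim, threshold, polarity), or None if no split exists.
    The stump predicts `polarity` when x[dim] > threshold, else `-polarity`.
    """
    n, d = len(X), len(X[0])
    total = sum(w)
    best = None
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        # Error of "predict +1 above the threshold" when the threshold sits
        # below every sample: all -1 samples are misclassified.
        err = sum(w[i] for i in range(n) if y[i] == -1)
        for k in range(n - 1):
            i = order[k]
            # Sample i moves to the "below" side, where the prediction is -1.
            err += w[i] if y[i] == 1 else -w[i]
            lo, hi = X[i][j], X[order[k + 1]][j]
            if lo == hi:
                continue  # only split between distinct values
            thr = (lo + hi) / 2.0
            # Flipping the polarity turns an error of e into total - e.
            for e, pol in ((err, 1), (total - err, -1)):
                if best is None or e < best[0]:
                    best = (e, j, thr, pol)
    if best is None:
        return None
    return (best[0] / total, best[1], best[2], best[3])


def stump_predict(stump, x):
    """Label a single sample with a stump returned by fit_stump."""
    _, j, thr, pol = stump
    return pol if x[j] > thr else -pol
```

Sorting once per dimension and updating the error incrementally avoids rescanning all samples for every candidate threshold.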

Compute alpha_1 and update the weights.
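
The task does not spell out the formulas. A common statement of AdaBoost.M1 uses alpha = ln((1 - err) / err) and multiplies the weight of each misclassified sample by exp(alpha); the sketch below assumes that form. The clipping constant `eps`, the final renormalization, and the name `alpha_and_reweight` are illustrative assumptions.

```python
import math


def alpha_and_reweight(w, y, pred, eps=1e-10):
    """Compute alpha for one weak learner and return the updated weights.

    Assumed form: alpha = ln((1 - err) / err), with the weights of
    misclassified samples multiplied by exp(alpha), then renormalized.
    """
    total = sum(w)
    err = sum(wi for wi, yi, pi in zip(w, y, pred) if yi != pi) / total
    # Clip so that a perfect (or perfectly wrong) learner gives a finite alpha.
    err = min(max(err, eps), 1.0 - eps)
    alpha = math.log((1.0 - err) / err)
    new_w = [wi * math.exp(alpha) if yi != pi else wi
             for wi, yi, pi in zip(w, y, pred)]
    s = sum(new_w)
    return alpha, [wi / s for wi in new_w]
```

For example, with four equally weighted samples and one mistake, err is 0.25, alpha is ln 3, and the misclassified sample ends up carrying half of the total weight.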

Now build another tree h_2(x) using the train set but with the updated weights. Compute alpha_2 and update the weights. Similarly, grow 300 such stumps.
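
Putting the rounds together, a boosting loop might look like the sketch below. It repeats a stump search so that the block runs on its own, and it assumes the alpha = ln((1 - err) / err) form of AdaBoost.M1 with uniform initial weights. The names `adaboost_fit` and `adaboost_predict` and the tuple layout of the ensemble are hypothetical.

```python
import math


def fit_stump(X, y, w):
    """Best (error, dim, threshold, polarity) over midpoint splits, or None."""
    n, d = len(X), len(X[0])
    total = sum(w)
    best = None
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        err = sum(w[i] for i in range(n) if y[i] == -1)
        for k in range(n - 1):
            i = order[k]
            err += w[i] if y[i] == 1 else -w[i]
            lo, hi = X[i][j], X[order[k + 1]][j]
            if lo == hi:
                continue
            thr = (lo + hi) / 2.0
            for e, pol in ((err, 1), (total - err, -1)):
                if best is None or e < best[0]:
                    best = (e, j, thr, pol)
    if best is None:
        return None
    return (best[0] / total, best[1], best[2], best[3])


def adaboost_fit(X, y, n_rounds=300, eps=1e-10):
    """Grow n_rounds stumps; returns a list of (alpha, dim, thr, polarity)."""
    n = len(X)
    w = [1.0 / n] * n  # assumption: uniform initial weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        if stump is None:
            break  # no valid split in any dimension
        err, j, thr, pol = stump
        err = min(max(err, eps), 1.0 - eps)
        alpha = math.log((1.0 - err) / err)
        ensemble.append((alpha, j, thr, pol))
        # Up-weight the samples this stump gets wrong, then renormalize.
        w = [wi * math.exp(alpha) if (pol if x[j] > thr else -pol) != yi else wi
             for wi, x, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble


def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps (ties go to +1)."""
    score = sum(alpha * (pol if x[j] > thr else -pol)
                for alpha, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1
```

Storing each stump as a plain tuple keeps the ensemble easy to truncate, which is convenient when comparing different numbers of trees later.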

After every iteration, find the accuracy on the val set and report it. You should show a plot of accuracy on the val set vs. number of trees. Use the tree that gives the highest accuracy and evaluate that tree on the test set. Report the test accuracy.

[2]

