Question: Section B: Programming Assignment Please solve the following problem by coding Python programs Data: You will be working on MNIST data, a dataset of thousands

Section B: Programming Assignment

Please solve the following problem by coding Python programs.

Data: You will be working on MNIST data, a dataset of thousands of images of handwritten digits. You can download the dataset here - https://www.kaggle.com/c/digit-recognizer/data.

"The data files train.csv and test.csv contain gray-scale images of hand-drawn digits, from zero through nine. Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels in total. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255, inclusive. The training data set, (train.csv), has 785 columns. The first column, called "label", is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image."

You only need to download the train.csv file; test.csv is not required. The train.csv file contains 42k samples of images. To reduce the running time of the program, you will work with only 1,000 randomly selected samples out of these, but make sure you have an equal number of samples belonging to each label (i.e., 100 samples of label '0', 100 samples of label '1', and so on).

Problem Statement:
1. Perform Naive Bayes (NB) classification and K Nearest Neighbor (KNN) classification on the above data;
2. Calculate the accuracy obtained using the NB and KNN methods;
3. Perform cross-validation using the NB and KNN methods, and compare the results with problem 2.

Task 1. You can use the methods from sklearn to implement classification. You will need to sample 100 instances out of each label (1,000 instances in total).

Task 2. Apply the NB and KNN methods on the whole dataset (described in Task 1) as the training dataset, and calculate the training accuracy for each method.

Task 3. Apply k-fold cross-validation using the sampled dataset generated in Task 1. Perform the k-fold cross-validation experiment using the following values: k = 2 and k = 4.

Task 4. Compared with the results you got in homework 2 (Decision Tree and Multiple Layer Perceptron), what is your observation regarding their results?

Deliverables:
1. Python source codes in a zipped file;
2. Brief report including all your results and observations.
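One possible way to approach Tasks 1 to 3 is sketched below. It is only a starting point under stated assumptions, not the expected answer: the assignment does not say which Naive Bayes variant or how many neighbours to use, so `GaussianNB` and `n_neighbors=3` are illustrative choices, and the helper names (`sample_balanced`, `split_features`, `make_models`, `training_accuracy`, `cv_accuracy`) are hypothetical. The sketch assumes train.csv has been downloaded and has the "label" column described in the question.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def sample_balanced(df, per_label=100, seed=0):
    """Task 1: draw `per_label` random rows for every digit label."""
    sampled = df.groupby("label").sample(n=per_label, random_state=seed)
    return sampled.reset_index(drop=True)


def split_features(sample):
    """Separate the pixel columns (X) from the label column (y)."""
    X = sample.drop(columns="label").to_numpy()
    y = sample["label"].to_numpy()
    return X, y


def make_models():
    # Assumed model settings; the assignment leaves these open.
    return {
        "NB": GaussianNB(),
        "KNN": KNeighborsClassifier(n_neighbors=3),
    }


def training_accuracy(X, y):
    """Task 2: fit on all sampled rows and score on those same rows."""
    return {name: model.fit(X, y).score(X, y)
            for name, model in make_models().items()}


def cv_accuracy(X, y, folds=(2, 4), seed=0):
    """Task 3: mean stratified k-fold accuracy for each value of k."""
    results = {}
    for k in folds:
        splitter = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
        for name, model in make_models().items():
            scores = cross_val_score(model, X, y, cv=splitter)
            results[(name, k)] = scores.mean()
    return results
```

To run it on the real data, load the file with `pd.read_csv("train.csv")`, pass the result through `sample_balanced` and `split_features`, and then print the dictionaries returned by `training_accuracy(X, y)` and `cv_accuracy(X, y)`. Training accuracy is measured on the same rows the model was fitted on, so it will usually look better than the cross-validated figures; that gap is the comparison item 3 of the problem statement asks for.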
