DATA ANALYTICS

Choose one dataset from the classification category of the UCI Machine Learning Repository (there are 262 such datasets in total). If the data comes as a single set, partition it into training and testing sets (e.g., 33% for testing; note that in our class example, the mushroom training and testing datasets were created by partitioning the original mushroom dataset into a training set of 7,423 records and a testing set of 701 records), and follow the class example to convert the data to .csv and .arff file formats. Then use Weka to measure the performance of (1) a Decision Tree and (2) K-NN. Use the training set to train each classifier and the test set to evaluate its accuracy. Use screenshots of Weka to report your results, and write a summary describing how you configured each method (e.g., for K-NN, what value of K did you use, and was distance weighting considered? For the decision tree, did you allow pruning? What is the minimum number of instances per leaf node? Which algorithm is faster, and why?). For each learning algorithm, copy the performance data from Weka into your summary, and use tables to compare the results (e.g., which one gives better accuracy). Report any problems you encountered.
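
As a starting point for the partitioning step, here is a minimal sketch (not part of the original assignment) of how one might split a single UCI classification dataset into training and testing sets and write them out as .csv files before converting them to .arff in Weka. The file names, the position of the class column, and the 33% test fraction are assumptions for illustration; scikit-learn's train_test_split is used for the split.

```python
# Minimal sketch: split one UCI classification CSV into train/test sets (33% test).
# Assumptions: the dataset has been downloaded as "dataset.csv" with a header row
# and the class label in the last column. All file names here are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv("dataset.csv")

# Stratify on the class column so both partitions keep the original class proportions.
train, test = train_test_split(
    data,
    test_size=0.33,            # e.g., 33% held out for testing, as in the assignment
    stratify=data.iloc[:, -1],
    random_state=42,           # fixed seed so the split is reproducible
)

# Save the partitions; these .csv files can then be opened in Weka's Explorer
# and re-saved in .arff format.
train.to_csv("train.csv", index=False)
test.to_csv("test.csv", index=False)
print(f"Training records: {len(train)}, testing records: {len(test)}")
```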
