Question: Classification Algorithms

Implement in Python two of the following four classifiers (your choice): Decision Tree, kNN, SVM, Backpropagation NN, using the Heart Disease data set from the University of California Irvine Machine Learning Data Repository at archive.ics.uci.edu/ml .

Data set: This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to this date. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0).
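The presence/absence reduction described above can be sketched as follows. This is a minimal illustration, not part of the assignment text: the function names `binarize_goal` and `parse_row` are hypothetical, and the sketch assumes each record is one comma-separated line of 14 fields with the "goal" field last and missing values written as `?`.

```python
def binarize_goal(goal):
    """Collapse the 0-4 "goal" value to 0 (absence) or 1 (presence)."""
    if goal not in (0, 1, 2, 3, 4):
        raise ValueError("goal must be an integer from 0 to 4")
    return 0 if goal == 0 else 1


def parse_row(line, missing="?"):
    """Parse one record into (13 inputs, binary label).

    Returns None when the record does not have 14 fields or contains the
    missing-value marker (assumed here to be "?").
    """
    fields = [f.strip() for f in line.strip().split(",")]
    if len(fields) != 14 or missing in fields:
        return None
    inputs = [float(f) for f in fields[:13]]
    # The goal may be written as "2" or "2.0", so go through float first.
    label = binarize_goal(int(float(fields[13])))
    return inputs, label
```

Dropping incomplete records is only one possible choice; imputing the missing values would be another.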

The value functions for each attribute are described in the ML Data Repository. There are 13 input attributes and one output/decision attribute: heart disease present or absent. Partition the data into training (model learning) and test sets. For the tree classifier, use a top-down greedy algorithm with either the Gini index or Information Gain/Entropy as the node-splitting measure. It would be more elegant (but is not required) to reduce model overfitting by using the pessimistic error formula to decide whether or not to prune leaf nodes.
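The node-splitting measures named above can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete tree: the function names are hypothetical, every attribute is treated as numeric, and candidate splits are binary tests of the form `x[f] <= t` with thresholds at midpoints between consecutive distinct values.

```python
import math
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(rows, labels, impurity=gini):
    """Greedy search for the binary split with the largest impurity reduction.

    Returns (feature_index, threshold, gain), or None if no split exists.
    """
    n = len(labels)
    parent = impurity(labels)
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            weighted = (len(left) * impurity(left)
                        + len(right) * impurity(right)) / n
            gain = parent - weighted
            if best is None or gain > best[2]:
                best = (f, t, gain)
    return best
```

A top-down tree builder would call `best_split` on the training rows, partition them by the returned test, and recurse on each side until a node is pure or no split improves the impurity. Passing `impurity=entropy` switches the criterion to information gain.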

For SVM you can use either a linear SVM (risking that both the training and the generalization classification errors will be large) or, preferably, a nonlinear SVM using, e.g., a polynomial, Gaussian radial, or sigmoid kernel. Of course, your output class attribute should be modified: instead of 1 for the disease class use +1, and instead of 0 for the non-disease class use -1.

You can be inspired by existing code, but you are not allowed to use it; in other words, you write your own programs. You can use standard or other language libraries, including libraries for linear algebra, matrices, and Lagrangian nonlinear optimization with constraints (excluding libraries/software packages for data mining or machine learning with complete implemented algorithms). Please include both the source code and sample output from running your programs.

Compare the performance of both classifiers, i.e., it is sufficient to provide both the training accuracy and the test/generalization accuracy for both of your programs (of course, using the same training and test data). Based on that, state which classifier seems to perform better for your programs and data.

Comment: a more elegant approach would be to test, e.g., the confidence interval for the true accuracy (based on the test accuracy) at the (1 - α) confidence level, or the hypothesis that the performance difference for the stochastic variable d = e1 - e2 (where e1 is the misclassification error of the first classifier and e2 is the misclassification error of the second classifier) is statistically significant at the (1 - α) confidence level.
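The confidence interval and the significance test mentioned in the comment can be sketched as follows. This is one possible approach, not a prescribed one: the function names are hypothetical, the accuracy interval is the Wilson score interval under a normal approximation, and the difference test treats the two error estimates as independent, which is only an approximation when both classifiers are evaluated on the same test set.

```python
import math
from statistics import NormalDist


def accuracy_interval(acc, n, alpha=0.05):
    """Wilson score interval for the true accuracy.

    acc is the accuracy measured on n test records; the interval has
    confidence level (1 - alpha).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    denom = 1.0 + z * z / n
    center = (acc + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(acc * (1.0 - acc) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


def error_difference_interval(e1, n1, e2, n2, alpha=0.05):
    """Interval for the true difference d = e1 - e2 at level (1 - alpha).

    Assumes the two error rates come from independent samples of size
    n1 and n2 and that d is approximately normally distributed.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    d = e1 - e2
    sigma = math.sqrt(e1 * (1.0 - e1) / n1 + e2 * (1.0 - e2) / n2)
    return d - z * sigma, d + z * sigma


def is_significant(e1, n1, e2, n2, alpha=0.05):
    """True when the interval for d excludes zero."""
    lo, hi = error_difference_interval(e1, n1, e2, n2, alpha)
    return not (lo <= 0.0 <= hi)
```

For example, `accuracy_interval(0.8, 100)` gives an interval of roughly 0.71 to 0.87 at the 95% level. If the interval for d contains zero, the observed difference between the two classifiers is not statistically significant at that level.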
