Question: Please help! This is a Python programming question.

Please help!

This is a Python programming question.

Please read the instructions carefully and give your own answer. Include a screenshot of the code you wrote. The assignment needs to be understood, so please provide a clear description along with working code.

********************************************************************************

This task will require you to use Python to implement and interpret some classification ideas.

The Breast Cancer Wisconsin dataset will be used in this assignment. See the following webpage for further details on this dataset:

The dataset contains 569 samples from patients with either 'Malignant' or 'Benign' breast lesions. Each sample has 30 features computed from a digitized image of a fine needle aspirate (FNA) of the breast mass.

Your goal is to classify the data in order to predict the diagnosis of each sample from its features (i.e. binary classification). Two well-known classifiers will be evaluated for classification performance: the support vector machine (SVM) and the naive Bayes classifier.

-> First, load the cancer dataset. This can be done with the following code snippet:

    from sklearn.datasets import load_breast_cancer
    data = load_breast_cancer()
    X = data.data    # Input features
    y = data.target  # Class label (0: Malignant, 1: Benign)
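As a quick sanity check after loading, the array shape and class balance can be inspected. The sketch below is only illustrative; the variable name `class_counts` is not part of the assignment.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X = data.data    # Input features
y = data.target  # Class label (0: Malignant, 1: Benign)

# Number of samples and number of features per sample
print("Feature matrix shape:", X.shape)

# target_names is ordered by label value, so index 0 is the name of class 0
print("Class names:", list(data.target_names))

# class_counts (illustrative name) holds the number of samples per label
class_counts = np.bincount(y)
for name, count in zip(data.target_names, class_counts):
    print(f"{name}: {count} samples")
```

The classes are not perfectly balanced, which is one reason to look at ROC/AUC in addition to plain accuracy.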

-> Then, split your data into a 70% training set and a 30% test set using train_test_split:

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split( ??? )
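One possible way to fill in the placeholder is sketched below. The `test_size=0.3` argument follows from the 70/30 requirement; `random_state=42` and `stratify=y` are assumptions added here for reproducibility and to keep the class ratio similar in both sets, and the assignment does not require them.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X, y = data.data, data.target

# 30% of the samples go to the test set, the remaining 70% to training.
# random_state and stratify are optional choices, not assignment requirements.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print("Training set:", X_train.shape)
print("Test set:", X_test.shape)
```

Fixing `random_state` makes the split repeatable, so accuracy differences between parameter settings are not caused by a different split.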

-> The scikit-learn library has built-in classes for naive Bayes and SVM classification:

    from sklearn.naive_bayes import GaussianNB
    from sklearn.svm import SVC
    svm_cl = SVC(C=1, gamma='scale', probability=True)
    bayes_cl = GaussianNB(var_smoothing=1e-7)

-> You need to modify the following piece of code to train and test your classifiers:

    CL.fit(X_train, y_train)       # CL is the classifier model to train
    y_pred = CL.predict(X_test)    # Cancer prediction on the test set
    # Predicted probability of class 1 (Benign in the encoding above),
    # needed for the ROC plot and AUC computation
    y_proba = CL.predict_proba(X_test)[:,1]
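The training and prediction template above could be applied to both classifiers roughly as follows. This is a sketch: the split arguments (`random_state=42`, `stratify`) and the dictionary names `classifiers` and `predictions` are assumptions for illustration only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42, stratify=data.target
)

# The two classifiers with the parameter values given in the assignment text
classifiers = {
    "SVM": SVC(C=1, gamma="scale", probability=True),
    "Bayes": GaussianNB(var_smoothing=1e-7),
}

predictions = {}
for name, CL in classifiers.items():
    CL.fit(X_train, y_train)                  # train on the training set
    y_pred = CL.predict(X_test)               # hard class labels (0 or 1)
    y_proba = CL.predict_proba(X_test)[:, 1]  # probability of CL.classes_[1]
    predictions[name] = (y_pred, y_proba)
    print(name, "classes:", CL.classes_, "first probabilities:", y_proba[:3].round(3))
```

The columns of `predict_proba` follow the order of `CL.classes_`, so column 1 is the probability of label 1 (benign in this dataset's encoding).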

Train your classifiers with different parameter settings:

    SVM: Try at least 3 different values for C
    Bayes: Try at least 3 different values for var_smoothing
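A parameter sweep of this kind might be organized as a simple loop. The specific values tried below (`0.1, 1, 10` for `C` and `1e-9, 1e-7, 1e-5` for `var_smoothing`) are arbitrary examples, not values prescribed by the assignment, and the `results` dictionary is an illustrative name.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42, stratify=data.target
)

# Example parameter grids (assumed values, chosen only for illustration)
svm_C_values = [0.1, 1, 10]
bayes_smoothing_values = [1e-9, 1e-7, 1e-5]

results = {}  # maps (classifier name, parameter value) to test accuracy

for C in svm_C_values:
    CL = SVC(C=C, gamma="scale", probability=True)
    CL.fit(X_train, y_train)
    results[("SVM", C)] = accuracy_score(y_test, CL.predict(X_test))

for vs in bayes_smoothing_values:
    CL = GaussianNB(var_smoothing=vs)
    CL.fit(X_train, y_train)
    results[("Bayes", vs)] = accuracy_score(y_test, CL.predict(X_test))

for (name, value), acc in results.items():
    print(f"{name:5s} parameter={value:<8g} accuracy={acc:.2f}")
```

Keeping the same train/test split for every setting means the accuracies are directly comparable.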

-> Given the trained classifiers, model performance can be tested using the following metrics: accuracy_score, roc_curve, and auc:

    import matplotlib.pyplot as plt
    from sklearn.metrics import accuracy_score, roc_curve, auc
    # The prediction accuracy
    print("Prediction accuracy : %.2f" % accuracy_score( ??? ))
    # Receiver Operating Characteristic (ROC) curve
    fpr, tpr, thr = roc_curve( ??? )
    plt.plot(fpr, tpr)
    # Area under the ROC curve (AUC)
    print("AUC score : %.2f" % auc( ??? ))
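The metric placeholders could be filled in along the following lines, shown here for the Bayes classifier only. This is a sketch under assumptions: the split arguments, the non-interactive `Agg` backend, and the output filename `roc_curve.png` are illustrative choices, not part of the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # assumed non-interactive backend; drop this to use plt.show()
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42, stratify=data.target
)

CL = GaussianNB(var_smoothing=1e-7)
CL.fit(X_train, y_train)
y_pred = CL.predict(X_test)
y_proba = CL.predict_proba(X_test)[:, 1]

# Accuracy compares the true labels with the hard predictions
acc = accuracy_score(y_test, y_pred)
print("Prediction accuracy : %.2f" % acc)

# The ROC curve needs the true labels and the scores for the positive class
fpr, tpr, thr = roc_curve(y_test, y_proba)

# AUC integrates the true positive rate over the false positive rate
roc_auc = auc(fpr, tpr)
print("AUC score : %.2f" % roc_auc)

fig, ax = plt.subplots()
ax.plot(fpr, tpr, label="Bayes (AUC = %.2f)" % roc_auc)
ax.plot([0, 1], [0, 1], linestyle="--", label="Chance level")
ax.set_xlabel("False positive rate")
ax.set_ylabel("True positive rate")
ax.legend()
fig.savefig("roc_curve.png")  # hypothetical output filename
```

To compare settings, the same `ax.plot(fpr, tpr, ...)` call can be repeated for each trained classifier before saving the figure.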

You can use these scores to measure the efficacy of a particular classifier. Your output should contain the following:

    Classification accuracy of the two classifiers (Bayes and SVM) with different parameter settings.
    ROC curves and AUC scores of the two classifiers (Bayes and SVM) with different parameter settings.

-> Given this output, respond to the following questions:

    What is the highest classification accuracy you achieve?
    How do the classifier parameters affect the classification performance?
    Which classifier performs better? Any thoughts why?
    What does the AUC score represent? Why is it an important metric?
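For the AUC question, one common interpretation is that the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below checks this numerically for the Bayes classifier; the split arguments and the names `pos`, `neg`, and `pairwise_auc` are assumptions for illustration, not a full answer to the question.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42, stratify=data.target
)

CL = GaussianNB(var_smoothing=1e-7)
CL.fit(X_train, y_train)
y_proba = CL.predict_proba(X_test)[:, 1]

# AUC from the ROC curve (trapezoidal rule)
fpr, tpr, thr = roc_curve(y_test, y_proba)
roc_auc = auc(fpr, tpr)

# AUC as a ranking probability: compare every (positive, negative) pair
pos = y_proba[y_test == 1]        # scores of class-1 samples
neg = y_proba[y_test == 0]        # scores of class-0 samples
diff = pos[:, None] - neg[None, :]
pairwise_auc = (diff > 0).mean() + 0.5 * (diff == 0).mean()

print("AUC from ROC curve    : %.4f" % roc_auc)
print("AUC from pair ranking : %.4f" % pairwise_auc)
```

Because this view depends only on how the scores rank the samples, the AUC does not depend on any single decision threshold, unlike accuracy.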
