

Help with Exercise 2
Exercise 1 for Reference:
Exercise 1: a) Use the Machine Learning algorithms k-NN and Naive Bayes to classify multiphase flow patterns, using the database BDOShohamIML.csv, and evaluate the performance. b) Apply parameter optimization to (a) and evaluate the performance. c) Explain the Confusion Matrix and metrics obtained in (a) and (b), that is, before and after parameter optimization.
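As general context for this kind of exercise, the classify-and-tune workflow in (a)–(b) can be sketched end to end. This is a minimal sketch, not the graded solution: BDOShohamIML.csv is not available here, so synthetic stand-in data is generated, and the assumed layout (feature columns first, flow-pattern label last) is a guess.

```python
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

# Stand-in for the real database; in the exercise you would instead do
#   data = pd.read_csv('BDOShohamIML.csv')
#   X, y = data.iloc[:, :-1].values, data.iloc[:, -1].values   # assumed layout
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))             # synthetic features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # two synthetic "flow patterns"

# Hold out a test set so reported metrics are not training accuracy.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# a) baseline classifiers
knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
nb = GaussianNB().fit(X_tr, y_tr)
print('k-NN test accuracy:', accuracy_score(y_te, knn.predict(X_te)))
print('NB   test accuracy:', accuracy_score(y_te, nb.predict(X_te)))

# b) parameter optimization via cross-validated grid search
grid = GridSearchCV(
    KNeighborsClassifier(),
    {'n_neighbors': range(1, 10), 'weights': ['uniform', 'distance']},
    cv=5).fit(X_tr, y_tr)
print('best k-NN params:', grid.best_params_)
print('tuned test accuracy:', accuracy_score(y_te, grid.predict(X_te)))
print(confusion_matrix(y_te, grid.predict(X_te)))
```

The key design choice versus the worked code below is the held-out test split: scoring on unseen data gives an honest before/after comparison for part (c).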
Code for Exercise 1:



In [14]:

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

data = pd.read_csv('Data_Glioblastoma5Patients_SC.csv')
print('Shape:', data.shape)
data.head()
```

```
Shape: (430, 5949)
Out [14]:
       A2M      AAAS      AAK1      AAMP      AARS    AARSD1     AASDH  AASDHPPT      AA...
0 -3.80147 -3.889900 -3.985616  2.651558  2.170748 -2.550822  4.807330  3.961170  -0.1926
1 -3.80147 -3.889900 -3.158708  2.358992 -6.041792 -0.056092  3.606735 -2.632250   2.2493
2 -3.80147 -3.889900  1.733125 -5.820241 -6.041792 -0.576957 -2.473517 -4.354127   0.063
3 -3.80147 -3.889900 -1.665669  3.514271 -6.041792 -3.699171  4.509461 -4.354127   2.985
4 -3.80147  3.742495 -2.166992 -5.820241  2.094729  4.021873  5.535007  4.019633   2.5603
5 rows x 5949 columns
```

In [22]:

```python
# k-NN code
# a)
clf1 = KNeighborsClassifier(n_neighbors=3).fit(data.iloc[:, :-1].values,
                                               data.iloc[:, -1:].values.ravel())
y_pred = clf1.predict(data.iloc[:, :-1].values)
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
confusion_matrix(data.iloc[:, -1:].values, y_pred)
```

```
Accuracy score: 0.9506607929515418
Out [22]:
array([[ 973,    0,   26,    0,   34,    0],
       [   0,  121,    1,    3,    0,    0],
       [   0,    1,  550,   41,    0,    2],
       [  67,    9,   38, 2768,    4,   19],
       [   0,    0,    0,    2,  136,    2],
       [  20,    2,    1,    0,    8,  847]], dtype=int64)
```

In [15]:

```python
# b)
leaf_size = list(range(1, 10))
n_neighbors = list(range(1, 5))
hyperparameters = dict(leaf_size=leaf_size, n_neighbors=n_neighbors)
clf2 = KNeighborsClassifier()
clf3 = GridSearchCV(clf2, hyperparameters, cv=5)
best_model = clf3.fit(data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
print('Best leaf_size:', best_model.best_estimator_.get_params()['leaf_size'])
print('Best n_neighbors:', best_model.best_estimator_.get_params()['n_neighbors'])
```

In [ ]:

```python
clf4 = KNeighborsClassifier(n_neighbors=3, leaf_size=1).fit(data.iloc[:, :-1].values,
                                                            data.iloc[:, -1:].values.ravel())
y_pred = clf4.predict(data.iloc[:, :-1].values)  # fixed: the original cell predicted
                                                 # with clf1, the unoptimized model
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
print('Confusion Matrix:')
confusion_matrix(data.iloc[:, -1:].values, y_pred)
```

In [26]:

```python
# Naive Bayes code
# a)
clf1 = GaussianNB().fit(data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
y_pred = clf1.predict(data.iloc[:, :-1].values)
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
confusion_matrix(data.iloc[:, -1:].values, y_pred)
```

```
Accuracy score: 0.6754185022026432
Out [26]:
array([[ 879,    0,    0,  143,    1,   10],
       [   0,  121,    0,    4,    0,    0],
       [   1,    3,  471,  115,    4,    0],
       [ 124,   53,  192, 2228,  240,   68],
       [   0,    0,    0,   11,  129,    0],
       [ 307,    0,    9,  488,   69,    5]], dtype=int64)
```

In [27]:

```python
# b)
hyperparameters = {'var_smoothing': np.logspace(0, -9, num=100)}
clf2 = GaussianNB()
clf3 = GridSearchCV(clf2, hyperparameters, cv=5)
best_model = clf3.fit(data.iloc[:, :-1].values, data.iloc[:, -1:].values.ravel())
print('Best var_smoothing:', best_model.best_estimator_.get_params()['var_smoothing'])
```

```
Best var_smoothing: 1.873817422860387e-09
```

In [20]:

```python
clf4 = GaussianNB(var_smoothing=1.2328467394420635e-09).fit(data.iloc[:, :-1].values,
                                                            data.iloc[:, -1:].values.ravel())
y_pred = clf4.predict(data.iloc[:, :-1].values)
print('Accuracy score:', accuracy_score(data.iloc[:, -1:].values, y_pred))
print('Confusion Matrix:')
confusion_matrix(data.iloc[:, -1:].values, y_pred)
```

```
Accuracy score: 0.6755947136563877
Confusion Matrix:
Out [20]:
array([[ 879,    0,    0,  143,    1,   10],
       [   0,  121,    0,    4,    0,    0],
       [   1,    2,  471,  116,    4,    0],
       [ 124,   52,  192, 2229,  240,   68],
       [   0,    0,    0,   11,  129,    0],
       [ 307,    0,    9,  488,   69,    5]], dtype=int64)
```

c) Explain the Confusion Matrix and metrics before and after (a) and (b)

The accuracy score for k-NN was essentially identical before and after parameter optimization, high at about 0.95. The accuracy score for Naive Bayes was also very close before and after parameter optimization, around 0.68 in both cases. These results indicate that k-NN achieves considerably higher classification accuracy than Naive Bayes on this dataset. Note, however, that both models are scored on the same data they were trained on, so these accuracies are optimistic; evaluating on a held-out test set would give a fairer comparison.
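For part (c), accuracy alone hides per-class behavior. Per-class precision, recall, and F1 can be read directly off a confusion matrix: true positives sit on the diagonal, column sums count predictions per class, and row sums count true members per class. A small sketch with a made-up 3-class matrix (not the exercise's data):

```python
import numpy as np

# Made-up 3-class confusion matrix: rows = true class, columns = predicted class.
cm = np.array([[50,  2,  3],
               [ 4, 40,  1],
               [ 6,  0, 44]])

tp = np.diag(cm)                   # true positives per class
precision = tp / cm.sum(axis=0)    # column sums = everything predicted as that class
recall = tp / cm.sum(axis=1)       # row sums = everything truly in that class
f1 = 2 * precision * recall / (precision + recall)
accuracy = tp.sum() / cm.sum()     # diagonal over grand total

print('precision:', np.round(precision, 3))
print('recall:   ', np.round(recall, 3))
print('f1:       ', np.round(f1, 3))
print('accuracy: ', round(accuracy, 3))
```

With real labels and predictions in hand, `sklearn.metrics.classification_report(y_true, y_pred)` prints the same per-class metrics in one call.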
