Write a Python function to implement the k-Nearest Neighbors algorithm using NumPy
Question:
Write a Python function to implement the "k-Nearest Neighbors" algorithm using NumPy

1. Write a Python function that uses the k-Nearest Neighbors (k-NN) algorithm to classify a given set of test data based on a given set of training data and labels. The function should be named knn_classifier and it should take four arguments:

   A. train_data: A 2D NumPy array where each row is a separate data point in the training dataset.
   B. train_labels: A 1D NumPy array where each element is the class label of the corresponding row in train_data.
   C. test_data: A 2D NumPy array where each row is a separate data point in the test dataset. The function should predict and return class labels for these data points.
   D. k: The number of nearest neighbors to be considered.

   The knn_classifier function should return test_labels, a 1D NumPy array where each element is the predicted class label of the corresponding row in test_data.

   Use the Euclidean distance to calculate the distance between data points. The class of each test point is determined by the classes of its k closest training points (in terms of Euclidean distance). If there is a tie in the voting process, break the tie by choosing the class with the closest representative among the k nearest neighbors.

   You may use the k_closest_points and closest_labels functions from the previous question in your implementation. You are not allowed to use any external libraries except for NumPy and Python's standard library.

2. Write a function accuracy_score that calculates and returns the accuracy of the knn_classifier function's predictions. The function should take two parameters:

   true_labels: The actual labels of the test data.
   predicted_labels: The labels predicted by the knn_classifier function.

   The accuracy score is the proportion of correct predictions over total predictions.
import numpy as np

def knn_classifier(train_data, train_labels, test_data, k):
    predicted_labels = []
    for test_point in test_data:
        # Get labels of the k closest points in train_data to the current test point
        # Find the most common label among the k closest ones
    # Return the predicted labels for test data

def accuracy_score(true_labels, predicted_labels):
    return accuracy

# 15 training data points, each with 5 features
train_data = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25],
    [26, 27, 28, 29, 30],
    [31, 32, 33, 34, 35],
    [36, 37, 38, 39, 40],
    [41, 42, 43, 44, 45],
    [46, 47, 48, 49, 50],
    [51, 52, 53, 54, 55],
    [56, 57, 58, 59, 60],
    [61, 62, 63, 64, 65],
    [66, 67, 68, 69, 70],
    [71, 72, 73, 74, 75]
])

# Labels for the training data
train_labels = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])

# 7 test data points, each with 5 features
test_data = np.array([
    [4, 5, 6, 7, 8],
    [9, 10, 11, 12, 13],
    [14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23],
    [24, 25, 26, 27, 28],
    [29, 30, 31, 32, 33],
    [34, 35, 36, 37, 38]
])

# We will consider the 3 nearest neighbors for classification
k = 3

predicted_labels = knn_classifier(train_data, train_labels, test_data, k)
print("Predicted labels: ", predicted_labels)

# Example true labels for the test data
true_labels = np.array([0, 1, 0, 1, 0, 1, 0])
print("Accuracy: ", accuracy_score(true_labels, predicted_labels))
Expert Answer:
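No answer text was captured on this page, so what follows is only a minimal sketch of one way to meet the stated requirements, not the site's official solution. The question refers to k_closest_points and closest_labels from a previous question that is not shown here; the versions below are hypothetical stand-ins with assumed signatures. The sketch sorts neighbors nearest first with a stable sort, counts votes, and resolves ties by picking the first tied label encountered in nearest-first order, which is the class with the closest representative.

```python
import numpy as np
from collections import Counter


def k_closest_points(train_data, test_point, k):
    """Hypothetical helper (assumed signature): indices of the k nearest
    training rows to test_point, ordered nearest first."""
    # Euclidean distance from test_point to every training row
    distances = np.sqrt(np.sum((train_data - test_point) ** 2, axis=1))
    # A stable sort keeps original row order among equal distances
    return np.argsort(distances, kind="stable")[:k]


def closest_labels(train_data, train_labels, test_point, k):
    """Hypothetical helper (assumed signature): labels of the k nearest
    training rows, ordered nearest first."""
    return train_labels[k_closest_points(train_data, test_point, k)]


def knn_classifier(train_data, train_labels, test_data, k):
    predicted_labels = []
    for test_point in test_data:
        # Labels of the k closest training points, nearest first
        labels = closest_labels(train_data, train_labels, test_point, k).tolist()
        # Count votes and collect every label that reaches the top count
        counts = Counter(labels)
        top_count = max(counts.values())
        tied = {label for label, count in counts.items() if count == top_count}
        # Tie-break: the first tied label in nearest-first order belongs to
        # the class with the closest representative among the k neighbors
        winner = next(label for label in labels if label in tied)
        predicted_labels.append(winner)
    return np.array(predicted_labels)


def accuracy_score(true_labels, predicted_labels):
    # Proportion of positions where the prediction matches the true label
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    return float(np.mean(true_labels == predicted_labels))
```

With the sample data from the question and k = 3, this sketch predicts [0, 1, 0, 1, 0, 1, 0], which matches the example true labels and gives an accuracy of 1.0. With k = 2 every vote is a 1-1 tie, so the closest-representative rule alone decides each label.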
Related Book For
Data Mining Concepts And Techniques
ISBN: 9780128117613
4th Edition
Authors: Jiawei Han, Jian Pei, Hanghang Tong