Question: 1. Read the data description available on Kaggle. https://www.kaggle.com/datasets/fedesoriano/heart-failure-prediction 2. Add the following code to load the data. import numpy as np import pandas as
1. Read the data description available on Kaggle.
https://www.kaggle.com/datasets/fedesoriano/heart-failure-prediction

2. Add the following code to load the data.
import numpy as np import pandas as pd from sklearn.model_selection import train_test_split target = ["normal (negative)", "heart disease (positive)"] X = pd.read_csv('heart.csv') X_train, X_test = train_test_split(X, random_state=12, train_size=800, shuffle=True)
Use the training and test sets obtained from this step. Note that X_train and X_test are pandas dataframes.
3. Preprocess the data using the techniques learned in this course. Follow these rules when preprocessing the data.
Any preprocessing made on the training set must be also applied on the test set.
The test set values should not be read. For instance, suppose you want to replace a categorical value with its frequency. You should only use the training set to calculate the frequency of that value. Then make the replacement in the training and test sets.
After all preprocessing is complete, the target (i.e. 'HeartDisease') must be the last column.
4. Once all training and test set values are numerical or boolean, use the following code to convert the pandas dataframes to numpy arrays, and to store the features and targets in separate variables.
X_train = X_train.to_numpy().astype('float') X_test = X_test.to_numpy().astype('float') X_train, y_train = X_train[:, :-1], X_train[:, -1] X_test, y_test = X_test[:, :-1], X_test[:, -1]
5. Train a model using a scikit-learn classifier, then make predictions on the test set.
6. Show the model's performance by displaying a confusion matrix. And calculate the accuracy, precision, and recall of the model.
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
