Question: Create a function, preprocess_data, which performs data preprocessing for a classification task.

Create a function, preprocess_data, which performs data preprocessing for a classification task. The function preprocesses the data by performing the following steps:

1. For categorical variables: replace NaN values with the most frequent value; create dummy variables based on levels (all values present) and drop the first one in alphabetical order. Name the new binary columns using this schema: name of categorical variable +-+ level name.
2. For numerical variables: replace NaN values with the median; standardize values by subtracting the mean and dividing by the standard deviation.
3. For the target variable: convert text values into integers, so that the first text value alphabetically is converted to 0 and so on.

The preprocess_data function accepts one argument: dataframe, a pandas DataFrame where target is the classification label and the other variables are explanatory variables.

The function returns a tuple (X, y), where:
X is a pandas DataFrame obtained after preprocessing the numerical and categorical variables and dropping the target column;
y is a list of values of the target variable after preprocessing.

Example: for data with the columns target, married, degree, salary, and occupation (target values male and female; occupation values such as nurse, policeman, and fireman, with some NaN entries), the preprocess_data function should return the corresponding tuple (X, y). [Example table and expected output not recoverable from the source.]
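The three steps in the question can be sketched in pandas roughly as follows. This is a minimal sketch under stated assumptions, not a verified solution. Columns are treated as numerical or categorical purely by dtype. The separator in the dummy-column names is read as an underscore and exposed as a hypothetical `sep` parameter, since the source prints the schema as `+-+`. The standard deviation is the pandas default (sample, `ddof=1`), and the mean and standard deviation are computed after the median imputation. The `target_col` parameter is also an illustrative addition; the question itself specifies a single argument.

```python
import pandas as pd


def preprocess_data(dataframe, target_col="target", sep="_"):
    """Sketch of the preprocessing steps; target_col and sep are assumptions."""
    df = dataframe.copy()

    # Step 3: map target labels to integers in alphabetical order (first -> 0).
    labels = sorted(df[target_col].dropna().unique())
    mapping = {label: code for code, label in enumerate(labels)}
    y = df[target_col].map(mapping).tolist()
    X = df.drop(columns=[target_col])

    # Assumption: dtype decides whether a column is numerical or categorical.
    num_cols = list(X.select_dtypes(include="number").columns)
    cat_cols = list(X.select_dtypes(exclude="number").columns)

    # Step 2: impute the median, then standardize with mean and std (ddof=1).
    for col in num_cols:
        X[col] = X[col].fillna(X[col].median())
        X[col] = (X[col] - X[col].mean()) / X[col].std()

    # Step 1: impute the most frequent value, then one-hot encode.
    for col in cat_cols:
        X[col] = X[col].fillna(X[col].mode().iloc[0])
    if cat_cols:
        # Levels are sorted, so drop_first removes the alphabetically first one.
        X = pd.get_dummies(
            X, columns=cat_cols, prefix_sep=sep, drop_first=True, dtype=int
        )

    return X, y
```

Calling `X, y = preprocess_data(df)` on a frame with a `target` column returns the feature frame and the label list. If the intended separator is a hyphen rather than an underscore, passing `sep="-"` changes only the dummy-column names.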
