Question: Create a function, preprocess_data, which performs data preprocessing for a classification task. The function preprocesses the data by performing the following steps:

For categorical variables: replace NaN values with the most frequent value; create dummy variables based on the levels of all present values and drop the first one in alphabetical order. Name the new binary columns using this schema: name of the categorical variable followed by the level name.

For numerical variables: replace NaN values with the median; standardize values by subtracting the mean and dividing by the standard deviation.

For the target variable: convert text values into integers, so that the first text value alphabetically is converted to the first integer code, and so on.

The preprocess_data function accepts one argument: dataframe, a pandas DataFrame where target is a classification label and the other variables are explanatory variables.

The function returns a tuple (X, y) where: X is a pandas DataFrame obtained after preprocessing the numerical and categorical variables and after dropping the target column; y is a list of values of the target variable after preprocessing.

Example: For this sort of data (columns: target, married, degree, salary, occupation; surviving row fragments: male nurse | female NaN nurse | female policeman | male fireman | male NaN), the preprocess_data function should return the following tuple: [...]
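The steps described in the question could be sketched roughly as follows. This is only one possible reading, not a verified solution, and it rests on several assumptions the question text does not confirm: the label column is literally named "target", the dummy column names join variable and level with an underscore, integer codes for the target start at 0, every non-numeric column is treated as categorical, and the standard deviation is the sample one (pandas' default, ddof=1).

```python
import pandas as pd


def preprocess_data(dataframe: pd.DataFrame):
    """Sketch of the preprocessing described in the question.

    Returns a tuple (X, y): X is the processed feature DataFrame,
    y is a list of integer codes for the target.
    """
    df = dataframe.copy()

    # Target: map text labels to integers in alphabetical order.
    # Assumption: codes start at 0 (the starting value is missing in the source).
    labels = sorted(df["target"].dropna().unique())
    mapping = {label: code for code, label in enumerate(labels)}
    y = df["target"].map(mapping).tolist()

    # Assumption: the label column is named "target".
    X = df.drop(columns=["target"])

    # Assumption: any column that is not numeric is categorical.
    num_cols = list(X.select_dtypes(include="number").columns)
    cat_cols = [col for col in X.columns if col not in num_cols]

    # Numerical variables: fill NaN with the median, then standardize.
    # Assumption: sample standard deviation (pandas default, ddof=1).
    for col in num_cols:
        filled = X[col].fillna(X[col].median())
        X[col] = (filled - filled.mean()) / filled.std()

    # Categorical variables: fill NaN with the most frequent value.
    # Series.mode() returns sorted modes, so ties resolve alphabetically.
    for col in cat_cols:
        X[col] = X[col].fillna(X[col].mode().iloc[0])

    # Dummy variables: levels are sorted, so drop_first removes the
    # alphabetically first level. Assumption: "_" joins variable and level.
    if cat_cols:
        X = pd.get_dummies(
            X, columns=cat_cols, drop_first=True, prefix_sep="_", dtype=int
        )

    return X, y
```

Calling `X, y = preprocess_data(dataframe)` on a frame shaped like the one in the question would give the feature DataFrame and the list of target codes. A column whose values are all NaN is not handled here, since it has no median or mode to fill with.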
