Question: Create a file called features.py and implement the following functions.

def train_test_split(X, y, test_size, shuffle, random_state=None):

X, y - the features and the target variable.
test_size - a number between 0 and 1: the fraction of rows to allocate to the test set; the rest go to the train set.
shuffle - if True, shuffle the dataset before splitting; otherwise keep the original row order.
random_state - an integer seed; if None, the results differ from run to run, otherwise they are fixed by the given seed.

Example: X_train, X_test, y_train, y_test = train_test_split(feat_df, y, 0.3, True, 12)
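One possible way to write this function is sketched below; it is not the only valid reading of the task. The helper _take is a hypothetical name introduced here. The sketch also assumes two things the task does not specify: the test-set size is rounded to the nearest whole row, and without shuffling the test rows are taken from the end of the data.

```python
import numpy as np


def _take(data, idx):
    """Select rows by position (hypothetical helper, not part of the task)."""
    if hasattr(data, "iloc"):  # pandas DataFrame or Series
        return data.iloc[idx]
    return np.asarray(data)[idx]


def train_test_split(X, y, test_size, shuffle, random_state=None):
    if not 0 < test_size < 1:
        raise ValueError("test_size must be strictly between 0 and 1")
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")

    n = len(X)
    idx = np.arange(n)
    if shuffle:
        # A seeded generator makes the permutation reproducible;
        # random_state=None gives a different permutation on each call.
        rng = np.random.default_rng(random_state)
        idx = rng.permutation(n)

    # Assumption: round to the nearest row; the task does not say how to round.
    n_test = int(round(n * test_size))
    # Assumption: the last n_test positions form the test set.
    train_idx, test_idx = idx[: n - n_test], idx[n - n_test:]

    return (_take(X, train_idx), _take(X, test_idx),
            _take(y, train_idx), _take(y, test_idx))
```

Splitting a single index array and applying it to both X and y keeps each feature row paired with its target value, whether or not the data is shuffled.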

create_categories(df, list_columns)

Converts the values in the columns named in list_columns to numerical values. Follow the same approach: "string" -> category -> code. Replace the values in df in-place.
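A minimal sketch of the "string" -> category -> code approach, assuming df is a pandas DataFrame. Note that pandas assigns the code -1 to missing values; the task does not say how those should be handled.

```python
import pandas as pd


def create_categories(df, list_columns):
    """Replace each listed column with its integer category codes, in-place."""
    for col in list_columns:
        # astype("category") collects the distinct values;
        # .cat.codes maps each value to an integer (missing values become -1).
        df[col] = df[col].astype("category").cat.codes
```

For example, a column holding "x", "y", "x" becomes 0, 1, 0, because pandas orders string categories lexically before numbering them.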

X, y = preprocess_ver_1(csv_df)

Apply the feature transformation steps to the dataframe and return new X and y for the entire dataset. Do not modify the original csv_df. The steps are:
1. Remove all rows with NA values.
2. Convert datetime to a number.
3. Convert all strings to numbers.
4. Split the dataframe into X and y and return these.
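The steps above could be sketched as follows. The task does not name the target or the date column, so the keyword arguments target_col="Price" and date_col="Date" are assumptions based on the Melbourne housing CSV, as is the day-first date format. Encoding the date as days since 1970-01-01 is one choice among several. create_categories is repeated so the block runs on its own.

```python
import pandas as pd


def create_categories(df, list_columns):
    for col in list_columns:
        df[col] = df[col].astype("category").cat.codes


def preprocess_ver_1(csv_df, target_col="Price", date_col="Date"):
    # dropna() returns a new frame, so csv_df itself is left untouched;
    # copy() makes that independence explicit before in-place edits.
    df = csv_df.dropna().copy()

    # Assumption: dates are day-first strings such as "03/09/2016".
    dates = pd.to_datetime(df[date_col], dayfirst=True)
    # Encode each date as the number of days since 1970-01-01.
    df[date_col] = (dates - pd.Timestamp("1970-01-01")).dt.days

    # Every column that is still non-numeric is treated as a string column.
    text_cols = [c for c in df.columns
                 if not pd.api.types.is_numeric_dtype(df[c])]
    create_categories(df, text_cols)

    y = df[target_col]
    X = df.drop(columns=[target_col])
    return X, y
```

A call such as X, y = preprocess_ver_1(pd.read_csv("Melbourne_housing_FULL.csv")) would then feed directly into train_test_split.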

https://www.kaggle.com/anthonypino/melbourne-housing-market Download Melbourne_housing_FULL.csv
