Question: Load the hmeq_small.csv data set as a data frame. Standardize the data set as a new data frame. Normalize the data set as a new data frame.

Load the hmeq_small.csv data set as a data frame.
Standardize the data set as a new data frame.
Normalize the data set as a new data frame.
Print the means and standard deviations of both the standardized and normalized data.
Ex: Using the first 100 rows, found in hmeq_sample.csv, the output is:
The means of df1 are LOAN       1.631348e-16
MORTDUE    1.276118e-18
VALUE     -2.447266e-17
YOJ       -8.732091e-17
CLAGE     -1.036208e-16
CLNO      -5.068409e-17
DEBTINC    9.188053e-17
dtype: float64
The standard deviations of df1 are LOAN       1.005141
MORTDUE    1.005797
VALUE      1.005420
YOJ        1.005666
CLAGE      1.005602
CLNO       1.005479
DEBTINC    1.017700
dtype: float64
The means of df2 are LOAN       0.671006
MORTDUE    0.358735
VALUE      0.299044
YOJ        0.292135
CLAGE      0.448986
CLNO       0.346377
DEBTINC    0.624927
dtype: float64
The standard deviations of df2 are LOAN       0.269531
MORTDUE    0.247183
VALUE      0.187587
YOJ        0.237945
CLAGE      0.226345
CLNO       0.188681
DEBTINC    0.222946
dtype: float64
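import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Read data
hmeq = pd.read_csv('hmeq_small.csv')

# Standardize: zero mean and unit variance per column.
# No imputation step: the scaler skips NaNs when fitting, and the expected
# standard deviations appear consistent with missing values being left in place.
standardized = StandardScaler().fit_transform(hmeq)
df1 = pd.DataFrame(standardized, columns=hmeq.columns)

# Normalize: min-max scale each column to [0, 1].
# Normalizer is not suitable here: it rescales each row to unit norm
# rather than each column to a fixed range.
normalized = MinMaxScaler().fit_transform(hmeq)
df2 = pd.DataFrame(normalized, columns=hmeq.columns)

# Print statistics in the order and with the labels the exercise expects
print("The means of df1 are", df1.mean())
print("The standard deviations of df1 are", df1.std())
print("The means of df2 are", df2.mean())
print("The standard deviations of df2 are", df2.std())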
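
To make the two transformations concrete, here is a minimal sketch that computes them by hand and compares the result with scikit-learn's StandardScaler and MinMaxScaler. It assumes "normalize" in this exercise means min-max scaling to [0, 1], which is what the expected df2 statistics suggest. The `toy` data frame and its values are hypothetical stand-ins for hmeq_small.csv, not data from the file.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical toy data standing in for hmeq_small.csv, with one missing value
toy = pd.DataFrame({
    "LOAN": [1100.0, 1300.0, 1500.0, 1700.0, 2000.0],
    "YOJ": [10.5, 7.0, np.nan, 3.0, 9.0],
})

# Standardize by hand: subtract the mean, divide by the population std (ddof=0)
z_manual = (toy - toy.mean()) / toy.std(ddof=0)

# Normalize by hand (min-max): map each column onto [0, 1]
mm_manual = (toy - toy.min()) / (toy.max() - toy.min())

# The scikit-learn scalers ignore NaNs when fitting and keep them in the output
z_sklearn = pd.DataFrame(StandardScaler().fit_transform(toy), columns=toy.columns)
mm_sklearn = pd.DataFrame(MinMaxScaler().fit_transform(toy), columns=toy.columns)

# Both comparisons should agree up to floating-point error
print(np.allclose(z_manual, z_sklearn, equal_nan=True))
print(np.allclose(mm_manual, mm_sklearn, equal_nan=True))

# pandas .std() defaults to the sample std (ddof=1), so a standardized column
# with n non-missing values reports sqrt(n / (n - 1)) instead of exactly 1
print(z_sklearn.std())
```

This also explains why the expected standard deviations of df1 are slightly above 1: StandardScaler divides by the population standard deviation, while pandas reports the sample standard deviation, and the gap grows for columns with fewer non-missing values.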
