Question: Q1
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.preprocessing import StandardScaler
from sklearn.impute import KNNImputer
from sklearn.neighbors import LocalOutlierFactor
# Load the diabetes dataset
diabetes = fetch_openml('diabetes', version=..., as_frame=True)  # the version number is missing in the source
diabetes_df = diabetes.frame
# Create descriptive statistics for the dataset
print(diabetes_df.describe())
# Normalize the columns 'age' and 'insu'
columns_to_normalize = ['age', 'insu']
scaler = StandardScaler()
diabetes_df[columns_to_normalize] = scaler.fit_transform(diabetes_df[columns_to_normalize])
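What the normalization step computes can be checked on a small example. The sketch below uses a hypothetical toy frame (`toy_df`, with made-up values) rather than the OpenML data, and compares `StandardScaler` against the same z-score written by hand.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame standing in for the 'age' and 'insu' columns
toy_df = pd.DataFrame({
    "age": [21.0, 35.0, 50.0, 64.0],
    "insu": [0.0, 94.0, 168.0, 543.0],
})
columns_to_normalize = ["age", "insu"]

scaled = StandardScaler().fit_transform(toy_df[columns_to_normalize])

# Same computation by hand: subtract the column mean,
# then divide by the population standard deviation (ddof=0)
values = toy_df[columns_to_normalize].to_numpy()
manual = (values - values.mean(axis=0)) / values.std(axis=0)

print(np.allclose(scaled, manual))  # True
```

After scaling, each column has mean 0 and standard deviation 1, so 'age' and 'insu' contribute on a comparable scale to any distance-based step that follows.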
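# Detect outliers in the two columns 'age' and 'insu' using LOF or Isolation Forest
columns_for_outlier_detection = ['age', 'insu']
lof = LocalOutlierFactor()  # no parameters survive in the source; defaults are used here
diabetes_df['outlier'] = lof.fit_predict(diabetes_df[columns_for_outlier_detection])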
# Filter out rows identified as outliers (rows with outliers will be -1)
filtered_df = diabetes_df[diabetes_df['outlier'] != -1]
# Print the length of the dataset with and without outliers
print(f"Length of dataset with outliers: {len(diabetes_df)}")
print(f"Length of dataset without outliers: {len(filtered_df)}")
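The task allows Isolation Forest as an alternative to LOF. A minimal sketch of that variant follows, run on hypothetical synthetic data instead of the diabetes frame. The `contamination` value, the seeds, and all toy names are assumptions chosen for illustration. `IsolationForest.fit_predict` uses the same labels as LOF (-1 for outliers, 1 for inliers), so the filtering step stays the same.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)  # fixed seed so the toy data is reproducible

# Hypothetical toy data: 200 typical points plus 3 obvious extremes at the end
toy_df = pd.DataFrame({
    "age": np.concatenate([rng.normal(0, 1, 200), [8.0, -9.0, 10.0]]),
    "insu": np.concatenate([rng.normal(0, 1, 200), [9.0, 8.5, -10.0]]),
})

# contamination is the assumed share of outliers; 0.02 is an illustrative choice
iso = IsolationForest(contamination=0.02, random_state=0)
toy_df["outlier"] = iso.fit_predict(toy_df[["age", "insu"]])  # -1 = outlier, 1 = inlier

# Keep only the rows that were not flagged
filtered_toy = toy_df[toy_df["outlier"] != -1]

print(f"Length of toy data with outliers: {len(toy_df)}")
print(f"Length of toy data without outliers: {len(filtered_toy)}")
```

Unlike LOF, which compares each point's local density with that of its neighbors, Isolation Forest scores points by how few random splits are needed to isolate them, so the two methods can flag different rows on the same data.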
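# Introduce missing values in the 'mass' and 'insu' columns and print the missing indices in those columns
np.random.seed(...)  # the seed value is missing in the source
diabetes_df['mass_with_missing'] = diabetes_df['mass'].copy()
diabetes_df.loc[np.random.choice(diabetes_df.index, size=...), 'mass_with_missing'] = np.nan  # the size value is missing in the source
diabetes_df['insu_with_missing'] = diabetes_df['insu'].copy()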
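The script breaks off before the missing indices are printed. The sketch below shows one way that step could look on a hypothetical toy frame. The seed, the number of blanked rows, and the final `KNNImputer` step are assumptions: the imputer is imported at the top of the script, but how it is used is not shown in the question.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)  # illustrative seed, not the one from the question

# Hypothetical toy frame standing in for the 'mass' and 'insu' columns
toy_df = pd.DataFrame({
    "mass": rng.normal(32.0, 7.0, 50),
    "insu": rng.normal(80.0, 30.0, 50),
})

# Blank out 5 distinct rows per column (replace=False avoids repeated picks)
for col in ["mass", "insu"]:
    missing_col = f"{col}_with_missing"
    toy_df[missing_col] = toy_df[col].copy()
    picked = rng.choice(toy_df.index.to_numpy(), size=5, replace=False)
    toy_df.loc[picked, missing_col] = np.nan
    # Report which row indices are now missing in this column
    print(col, toy_df.index[toy_df[missing_col].isna()].tolist())

# Fill each gap from the 2 nearest rows that have the feature observed
cols = ["mass_with_missing", "insu_with_missing"]
imputer = KNNImputer(n_neighbors=2)
imputed = pd.DataFrame(imputer.fit_transform(toy_df[cols]), columns=cols, index=toy_df.index)

print(imputed.isna().sum().to_dict())
```

Assigning through `toy_df.loc[rows, column]` writes to the frame directly, which avoids the chained-assignment pitfalls of calling `.loc` on a selected column.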
