Question: Describe your data preparation steps

Describe your data preparation steps, such as:

  1. Handling missing values, duplicates, or outliers
  2. Feature scaling or normalization (especially important for PCA and clustering)
  3. Label encoding for categorical variables
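As an illustration of item 1, here is a minimal sketch of one common way to handle duplicates, missing values, and outliers with pandas. The toy frame, its values, and the helper name `clean` are made up for this example and are not taken from the student data. Median/mode imputation and the 1.5 × IQR rule are assumed choices, not the only reasonable ones.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values: one exact duplicate row (S2),
# one missing number, one missing category, one extreme value (40).
toy = pd.DataFrame({
    "studentid": ["S1", "S2", "S2", "S3", "S4", "S5"],
    "siblings": [1, 2, 2, np.nan, 3, 40],
    "gender": ["Male", "Female", "Female", None, "Male", "Male"],
})


def clean(frame: pd.DataFrame, num_col: str, cat_col: str) -> pd.DataFrame:
    """Hypothetical helper: drop duplicates, impute, and flag IQR outliers."""
    # 1. Remove rows that are exact duplicates of an earlier row.
    out = frame.drop_duplicates().copy()

    # 2. Impute: median for the numeric column, most frequent value
    #    (mode) for the categorical column.
    out[num_col] = out[num_col].fillna(out[num_col].median())
    out[cat_col] = out[cat_col].fillna(out[cat_col].mode().iloc[0])

    # 3. Flag (rather than delete) values outside 1.5 * IQR of the quartiles.
    q1, q3 = out[num_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    inside = out[num_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    out["is_outlier"] = ~inside

    return out.reset_index(drop=True)


cleaned = clean(toy, "siblings", "gender")
print(cleaned)
```

Flagging outliers instead of dropping them keeps the decision reversible, which matters when the dataset is small.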
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.read_csv('student_prediction.csv')
df.head()
Output of df.head() (5 rows × 33 columns; only the columns whose five values are fully legible are shown):

   studentid  hs_type  scholarship  prep_study    prep_exam                      notes      listens    likes_discuss  classroom
0  STUDENT1   Other    50%          Alone         Closest date to the exam       Always     Sometimes  Never          Useful
1  STUDENT2   Other    50%          Alone         Closest date to the exam       Always     Sometimes  Always         Useful
2  STUDENT3   State    50%          Alone         Closest date to the exam       Sometimes  Sometimes  Never          Not useful
3  STUDENT4   Private  50%          Alone         Regularly during the semester  Always     Sometimes  Sometimes      Not useful
4  STUDENT5   Private  30%          With friends  Closest date to the exam       Sometimes  Sometimes  Sometimes      Not useful

df.info()

RangeIndex: 145 entries, 0 to 144
Data columns (total 33 columns):
 #   Column         Non-Null Count  Dtype
 0   studentid      145 non-null    object
 1   age            145 non-null    object
 2   gender         145 non-null    object
 3   hs_type        145 non-null    object
 4   scholarship    145 non-null    object
 5   work           145 non-null    object
 6   activity       145 non-null    object
 7   partner        145 non-null    object
 8   salary         145 non-null    object
 9   transport      145 non-null    object
 10  living         145 non-null    object
 11  mother_edu     145 non-null    object
 12  father_edu     145 non-null    object
 13  # siblings     145 non-null    int64
 14  kids           145 non-null    object
 15  mother_job     145 non-null    object
 16  father_job     145 non-null    object
 17  study_hrs      145 non-null    object
 18  read_freq      145 non-null    object
 19  read_freq_sci  145 non-null    object
 20  attend_dept    145 non-null    object
 21  impact         145 non-null    object
 22  attend         145 non-null    object
 23  prep_study     145 non-null    object
 24  prep_exam      145 non-null    object
 25  notes          145 non-null    object
 26  listens        145 non-null    object
 27  likes_discuss  145 non-null    object
 28  classroom      145 non-null    object
 29  cuml_gpa       145 non-null    object
 30  exp_gpa        145 non-null    object
 31  course id      145 non-null    int64
 32  grade          145 non-null    int64
dtypes: int64(3), object(30)
memory usage: 37.5+ KB
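The df.info() listing shows 145 non-null entries in every column and 30 of the 33 columns stored as object, so for this dataset encoding and scaling are likely to matter more than imputation. Below is a minimal sketch of items 2 and 3 using only pandas. The toy rows are made up, although the column names and category labels are borrowed from the output above. It also makes three assumptions that are not stated in the source: the category orderings in `ORDERED`, treating `hs_type` as unordered, and treating `grade` as the prediction target. The helper name `encode_and_scale` is hypothetical.

```python
import pandas as pd

# Toy rows (made-up) reusing column names and labels seen in the output above.
toy = pd.DataFrame({
    "age": ["22-25", "18-21", "22-25", "18-21"],
    "notes": ["Always", "Sometimes", "Never", "Always"],
    "hs_type": ["Other", "State", "Private", "State"],
    "grade": [1, 3, 5, 7],
})

# ASSUMPTION: these orderings are guesses for illustration only.
ORDERED = {
    "age": ["18-21", "22-25"],
    "notes": ["Never", "Sometimes", "Always"],
}


def encode_and_scale(frame, ordered, nominal, target):
    """Hypothetical helper: encode categoricals, then standardize features."""
    X = frame.drop(columns=[target]).copy()

    # Ordered categories -> integer codes that respect the given order.
    # Labels missing from the list would become -1, so check for those.
    for col, levels in ordered.items():
        X[col] = pd.Categorical(X[col], categories=levels, ordered=True).codes

    # Unordered categories -> one indicator column per level (one-hot).
    X = pd.get_dummies(X, columns=nominal, dtype=float).astype(float)

    # Standardize to zero mean and unit variance (population std, ddof=0).
    # Constant columns get a divisor of 1 to avoid division by zero.
    std = X.std(ddof=0).replace(0, 1.0)
    X_scaled = (X - X.mean()) / std

    return X_scaled, frame[target]


X_scaled, y = encode_and_scale(toy, ORDERED, nominal=["hs_type"], target="grade")
print(X_scaled.round(2))
```

Standardizing after encoding puts every feature on the same scale, which is why the question singles it out for PCA and clustering: both depend on variances or distances, so an unscaled column with a large range would dominate the result.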
