Question:
Part 1: Data Preparation and Feature Engineering
Load the California Housing dataset.
Handle missing data: use an appropriate imputation technique for any missing values.
Feature encoding: for any ordinal categorical variables, apply Label Encoding; for nominal categorical variables, use One-Hot Encoding.
Normalize or standardize the numerical features.
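The Part 1 steps could be sketched roughly as follows. This is only one possible approach, assuming pandas and scikit-learn are available. The helper names `load_split` and `build_preprocessor` are hypothetical, and the target column name `median_house_value` is an assumption about the CSV files, not something stated in the question. Median imputation and most-frequent imputation are illustrative choices; ordinal columns, if there are any, would need an `OrdinalEncoder` with an explicit category order, which is not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def load_split(path, target="median_house_value"):
    """Read one CSV and split it into features and target.

    The target column name is an assumption about the file layout.
    """
    df = pd.read_csv(path)
    return df.drop(columns=[target]), df[target]


def build_preprocessor():
    """Impute and standardize numeric columns; impute and one-hot encode the rest."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # illustrative choice
        ("scale", StandardScaler()),                   # zero mean, unit variance
    ])
    nominal = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer(
        [
            ("num", numeric, make_column_selector(dtype_include=np.number)),
            ("nom", nominal, make_column_selector(dtype_exclude=np.number)),
        ],
        sparse_threshold=0.0,  # always return a dense array
    )
```

A typical use would be to call `fit_transform` on the training features and only `transform` on the test features, so the imputation values and scaling statistics come from the training file alone.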
Part 2: Dimensionality Reduction with PCA
Calculate the covariance matrix of the scaled data and find its eigenvalues and eigenvectors.
Use the eigenvalues and eigenvectors to compute the principal components.
Select the number of principal components that preserve at least of the variance.
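A minimal NumPy sketch of the Part 2 steps is given below, assuming the input is the already scaled feature matrix. The function name `pca_by_eig` is hypothetical. The variance percentage is missing from the question text, so it is left as a required parameter (`variance_threshold`, a fraction between 0 and 1) rather than guessed.

```python
import numpy as np


def pca_by_eig(X_scaled, variance_threshold):
    """PCA via eigendecomposition of the covariance matrix.

    Returns (projected data, component matrix, explained variance ratios, k).
    """
    X = np.asarray(X_scaled, dtype=float)
    X_centered = X - X.mean(axis=0)

    # Covariance matrix of the features (columns are variables).
    cov = np.cov(X_centered, rowvar=False)

    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Fraction of total variance carried by each component.
    explained = eigvals / eigvals.sum()
    cumulative = np.cumsum(explained)

    # Smallest k whose cumulative explained variance reaches the threshold.
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    k = min(k, len(eigvals))

    components = eigvecs[:, :k]  # top-k eigenvectors as columns
    return X_centered @ components, components, explained, k
```

The eigenvectors give the directions of the principal components, the eigenvalues give the variance along each direction, and projecting the centered data onto the top `k` eigenvectors yields the reduced dataset.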
Part 3: Model Training and Evaluation
Train a simple classification model, such as logistic regression or a decision tree, on the original dataset before applying PCA.
Train the same classification model on the dataset after applying PCA.
Compare the performance of the baseline model and the PCA-transformed model using accuracy, precision, recall, and F1-score.
The provided files are californiahousingtest.csv and californiahousingtrain.csv.
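The Part 3 comparison might look something like the sketch below, assuming scikit-learn is available. Two points here are assumptions rather than part of the question: the housing target is continuous, so the hypothetical helper `binarize_target` turns it into a class label by thresholding at the training median, and scikit-learn's `PCA` is used as a stand-in for a manual eigendecomposition. The names `evaluate` and `compare` are also hypothetical, and the variance threshold is left as a parameter because the question does not state it.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def binarize_target(y_train, y_test):
    """Assumption: class 1 means 'above the training-set median value'."""
    threshold = np.median(y_train)
    return (y_train > threshold).astype(int), (y_test > threshold).astype(int)


def evaluate(X_train, y_train, X_test, y_test):
    """Fit logistic regression and report the four requested metrics."""
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }


def compare(X_train, y_train, X_test, y_test, variance_threshold):
    """Score the same model on the scaled features and on their PCA projection."""
    # A float n_components keeps enough components to explain that fraction
    # of the variance; PCA is fitted on the training features only.
    pca = PCA(n_components=variance_threshold, svd_solver="full").fit(X_train)
    return {
        "baseline": evaluate(X_train, y_train, X_test, y_test),
        "pca": evaluate(pca.transform(X_train), y_train, pca.transform(X_test), y_test),
        "n_components": int(pca.n_components_),
    }
```

The inputs are expected to be the preprocessed (imputed, encoded, scaled) train and test features, so that both models differ only in whether PCA was applied.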
