Question:

Phase: Data Preparation
  Show your device directory
  Import Required Packages
  Load Data and show the first Records
  Show the dimension of the Data Frame
  Show the Column name of the Data Frame
  Modify Variable Names (replace any white spaces with underscore and special characters with nothing)

Phase: Pivot Table
  Create pivot tables for the mean of the binary outcome (Competitive) as a function of the various categorical variables, as requested below (use the original variables, not the dummies).
  Create a Pivot Table of Competitive and Category
    Combine "BusinessIndustrials" and "Computers" Categories
    Combine "AntiqueArtCraft" and "Collectibles" Categories
    Create a Pivot Table of Competitive and Combined Categories
  Create a Pivot Table of Competitive and Currency
  Create a Pivot Table of endDay and Competitive
    Combine ending days "Sunday" and "Friday"
    Create a Pivot Table of Combined endDay and Competitive

Phase: Logistic Regression, Full Model (Model with all Predictors)
  Check Variables' Data Types and modify Data Types as required
  Convert the format of Categorical variables to Category
    Convert Variables 'Category', 'currency', & 'endDay' from Object to Category
    Check Variables' Data Types again to make sure the conversions are applied
  Define Dummy Variables for Categorical Variables
    Show the Column name of the Data Frame
  Define and print Predictors & Outcome Variable
  Partition Data into the Training & Validation Sets
  Fit a Logistic Regression Model
    Initiate Logistic Regression (set penalty and C to avoid regularization)
    Fit Logistic Regression to the Training Set
    Get Intercept, Coefficients, & AIC Score
    Show the output
  Build a Logistic Regression Equation based on the Predictors' Coefficients
    Write the equation and then export it in the code
  Predict Outcome Variable "Competitive"
    Predict Outcome Variable "Competitive" on Training Set
    Predict Outcome Variable "Competitive" on Validation Set
  Compute the performance of the Logistic Regression Model for Training & Validation sets.
    Create Class Names
    Create the Confusion Matrix for Training set
    Show the output
    Create the Confusion Matrix for Validation set
    Show the output
  Compare Training & Validation sets performance. What is your interpretation?
    Compare the outputs of Performance Metrics on Training & Validation Sets and interpret how well the MLR Model with all predictors performs.
  Keep track of Accuracy Values for different Models

Phase: Reduced Model (Model without "ClosePrice" Variable)
  At the beginning of an Auction, if we want to predict whether the Auction will be "Competitive", we cannot use the information on the closing price. Run a Logistic Model with all predictors but Close Price. Explain how this Model compares to the Full Model with respect to predictive accuracy.
  Drop Predictor 'ClosePrice' from Training & Validation Sets
  Initiate Logistic Regression for "Reduced" Logistic Regression Model
  Fit Logistic Regression to the "Reduced" Training Set
  Get Intercept, Coefficients, & AIC Score for "Reduced" Data Frame
  Interpret the meaning of the coefficient for Closing Price. Does the Closing Price have a practical significance? Is it statistically significant for predicting the competitiveness of auctions?
  Predict the Outcome Variable "Competitive" on the Validation Set for the "Reduced" Model
  Compute performance of the "Reduced" Logistic Regression Model for Training & Validation sets.
    Create the Confusion Matrix of the "Reduced" Logistic Regression Model for the Training set
    Show the output
    Create the Confusion Matrix of the "Reduced" Logistic Regression Model for the Validation set
    Show the output
  Explain how the "Reduced" Model compares to the "Full" Model with respect to predictive accuracy
  Keep track of Accuracy Values for different Models ("Full" & "Reduced" Models)
  Interpret the output

Phase: Find the Best Fitting Model
  Use Stepwise Regression to find the Model with the best fit to the Training Set (highest Accuracy). Which Predictors are used?
  Compute performance of the "Best Fitting" Logistic Regression Model for Training & Validation sets.
    Create the Confusion Matrix of the "Best Fitting" Logistic Regression Model for the Training set
    Show the output
    Create the Confusion Matrix of the "Best Fitting" Logistic Regression Model for the Validation set
    Show the output
  Keep track of Accuracy Values for different Models ("Full", "Reduced" & "Best Fitting" Models)
  Interpret the output

Phase: Find the Best Predictive Model
  Use Stepwise Regression to find the Model with the highest accuracy on the Validation Set. Which Predictors are used? Meaning: based on the Stepwise Regression Method, which predictors would you suggest for the Classification of a New Record?
  Compute the performance of the "Best Predictive" Logistic Regression Model for Training & Validation sets.
    Create the Confusion Matrix of the "Best Predictive" Logistic Regression Model for the Training set
    Show the output
    Create the Confusion Matrix of the "Best Predictive" Logistic Regression Model for the Validation set
    Show the output
  Keep track of Accuracy Values for different Models ("Full", "Reduced", "Best Fitting" & "Best Predictive" Models)
  Interpret the output
  Build a Logistic Regression Equation based on the Predictors' Coefficients for the "Best Predictive" Logistic Regression Model
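
The Data Preparation and Pivot Table phases could be sketched roughly as follows. This is only an illustration under stated assumptions: the assignment's data file is not shown in the question, so the records below are hypothetical toy rows, "Open Price" is an invented column used to demonstrate the renaming, "device directory" is read as the current working directory, and the combined labels such as "BusinessIndustrials_Computers" are made up for the example.

```python
import os
import re

import pandas as pd

# Assumption: "device directory" means the current working directory.
print(os.getcwd())

# Hypothetical toy records standing in for the real (unseen) auction data.
df = pd.DataFrame({
    "Category": ["Computers", "BusinessIndustrials", "Collectibles",
                 "AntiqueArtCraft", "Computers", "Collectibles"],
    "currency": ["US", "EUR", "US", "GBP", "EUR", "US"],
    "endDay": ["Sunday", "Friday", "Monday", "Sunday", "Friday", "Monday"],
    "Open Price": [5.0, 20.0, 1.0, 30.0, 10.0, 7.0],  # invented column name
    "Competitive?": [1, 0, 1, 1, 0, 0],
})

print(df.head())     # first records
print(df.shape)      # dimension of the data frame
print(df.columns)    # column names


def clean_name(name):
    """Replace white space with '_' and drop any other special character."""
    return re.sub(r"[^0-9A-Za-z_]", "", name.strip().replace(" ", "_"))


df.columns = [clean_name(c) for c in df.columns]

# Mean of the binary outcome per level of each original categorical variable.
by_category = pd.pivot_table(df, index="Category", values="Competitive", aggfunc="mean")
by_currency = pd.pivot_table(df, index="currency", values="Competitive", aggfunc="mean")
by_end_day = pd.pivot_table(df, index="endDay", values="Competitive", aggfunc="mean")

# Combine the categories named in the assignment (the new labels are invented).
category_map = {
    "BusinessIndustrials": "BusinessIndustrials_Computers",
    "Computers": "BusinessIndustrials_Computers",
    "AntiqueArtCraft": "AntiqueArtCraft_Collectibles",
    "Collectibles": "AntiqueArtCraft_Collectibles",
}
df["Category_combined"] = df["Category"].replace(category_map)
by_combined_category = pd.pivot_table(
    df, index="Category_combined", values="Competitive", aggfunc="mean")

# Combine the ending days "Sunday" and "Friday".
df["endDay_combined"] = df["endDay"].replace(
    {"Sunday": "Sunday_Friday", "Friday": "Sunday_Friday"})
by_combined_end_day = pd.pivot_table(
    df, index="endDay_combined", values="Competitive", aggfunc="mean")

print(by_combined_category)
print(by_combined_end_day)
```

Each pivot table holds the share of competitive auctions per level, so levels with similar means are natural candidates for merging.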
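
One possible shape for the "Full" and "Reduced" logistic regression steps, using scikit-learn, is sketched here. Everything about the data is an assumption: the rows are synthetic, "OpenPrice" and the category level "Books" are invented, and the 60/40 split and random seed are arbitrary. The AIC is computed from the binomial log-likelihood as 2k - 2 log L; a course-provided helper may define it differently. A very large C with an L2 penalty is used as one common way to make the regularization negligible.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the real auction file is not available.
rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "Category": rng.choice(["Computers", "Collectibles", "Books"], size=n),
    "currency": rng.choice(["US", "EUR", "GBP"], size=n),
    "endDay": rng.choice(["Monday", "Friday", "Sunday"], size=n),
    "OpenPrice": rng.uniform(1, 50, size=n),
})
df["ClosePrice"] = df["OpenPrice"] + rng.exponential(10, size=n)
gain = df["ClosePrice"] - df["OpenPrice"] + rng.normal(0, 3, size=n)
df["Competitive"] = (gain > 6).astype(int)

# Convert object columns to category, then build dummy variables.
for col in ["Category", "currency", "endDay"]:
    df[col] = df[col].astype("category")
print(df.dtypes)

X = pd.get_dummies(df.drop(columns="Competitive"), drop_first=True, dtype=float)
y = df["Competitive"]
train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1)


def fit_and_report(train_X, valid_X):
    """Fit an (effectively) unregularized logistic model and collect its metrics."""
    model = LogisticRegression(penalty="l2", C=1e42, solver="liblinear")
    model.fit(train_X, train_y)
    k = train_X.shape[1] + 1  # coefficients plus intercept
    log_lik = -log_loss(train_y, model.predict_proba(train_X), normalize=False)
    train_pred = model.predict(train_X)
    valid_pred = model.predict(valid_X)
    return {
        "intercept": model.intercept_[0],
        "coef": pd.Series(model.coef_[0], index=train_X.columns),
        "aic": 2 * k - 2 * log_lik,
        "train_cm": confusion_matrix(train_y, train_pred, labels=[0, 1]),
        "valid_cm": confusion_matrix(valid_y, valid_pred, labels=[0, 1]),
        "train_acc": accuracy_score(train_y, train_pred),
        "valid_acc": accuracy_score(valid_y, valid_pred),
    }


full = fit_and_report(train_X, valid_X)
reduced = fit_and_report(train_X.drop(columns="ClosePrice"),
                         valid_X.drop(columns="ClosePrice"))

# Logistic regression equation built from the fitted coefficients.
terms = " ".join(f"{b:+.4f}*{name}" for name, b in full["coef"].items())
equation = f"logit(Competitive) = {full['intercept']:.4f} {terms}"
print(equation)
print(full["train_cm"], full["valid_cm"], sep="\n")

# Keep track of accuracy values for the different models.
accuracy = pd.DataFrame(
    {"Training": [full["train_acc"], reduced["train_acc"]],
     "Validation": [full["valid_acc"], reduced["valid_acc"]]},
    index=["Full", "Reduced"])
print(accuracy)
```

Rows of each confusion matrix are actual classes and columns are predicted classes, in the order given by `labels`. A large gap between training and validation accuracy would suggest overfitting.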
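
For the two stepwise phases, a simple greedy forward selection scored by accuracy is one way to illustrate the idea. It is a simplified stand-in for whatever stepwise routine the course supplies, not that routine itself. The helper names `make_model` and `forward_select`, the synthetic data, and the columns "OpenPrice", "noise_a" and "noise_b" are all hypothetical. Scoring on the training set corresponds to the "Best Fitting" model, and scoring on the validation set to the "Best Predictive" model.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def make_model():
    # Very large C so that the L2 penalty is negligible.
    return LogisticRegression(penalty="l2", C=1e42, solver="liblinear")


def forward_select(X_fit, y_fit, X_score, y_score):
    """Greedily add the predictor that most improves accuracy on the scoring set."""
    selected, remaining = [], list(X_fit.columns)
    # Baseline: always predict the majority class of the fitting data.
    majority = int(y_fit.mean() >= 0.5)
    best_acc = accuracy_score(y_score, np.full(len(y_score), majority))
    while remaining:
        trials = []
        for col in remaining:
            cols = selected + [col]
            model = make_model().fit(X_fit[cols], y_fit)
            acc = accuracy_score(y_score, model.predict(X_score[cols]))
            trials.append((acc, col))
        acc, col = max(trials)
        if acc <= best_acc:  # stop when no candidate improves accuracy
            break
        best_acc = acc
        selected.append(col)
        remaining.remove(col)
    return selected, best_acc


# Synthetic stand-in data (assumption): two informative and two noise predictors.
rng = np.random.default_rng(1)
n = 400
X = pd.DataFrame({
    "OpenPrice": rng.uniform(1, 50, size=n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
})
X["ClosePrice"] = X["OpenPrice"] + rng.exponential(10, size=n)
y = ((X["ClosePrice"] - X["OpenPrice"] + rng.normal(0, 3, size=n)) > 6).astype(int)

train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1)

# "Best Fitting": predictors chosen by accuracy on the training set.
fit_vars, fit_acc = forward_select(train_X, train_y, train_X, train_y)
# "Best Predictive": predictors chosen by accuracy on the validation set.
pred_vars, pred_acc = forward_select(train_X, train_y, valid_X, valid_y)

print("Best fitting predictors:", fit_vars, "training accuracy:", fit_acc)
print("Best predictive predictors:", pred_vars, "validation accuracy:", pred_acc)
```

Because the validation set is used to choose the predictors in the second call, the reported validation accuracy is likely to be somewhat optimistic for genuinely new records.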
