Question: Competitive Auctions on eBay.com

USING PYTHON PLEASE:
Competitive Auctions on eBay.com. The file eBayAuctions.csv contains information on 1972 auctions transacted on eBay.com during May-June 2004. The goal is to use these data to build a model that will distinguish competitive auctions from noncompetitive ones. A competitive auction is defined as an auction with at least two bids placed on the item being auctioned. The data include variables that describe the item (auction category), the seller (his or her eBay rating), and the auction terms that the seller selected (auction duration, opening price, currency, day of week of auction close). In addition, we have the price at which the auction closed. The goal is to predict whether or not an auction of interest will be competitive.
Data preprocessing. Create dummy variables for the categorical predictors. These include Category (18 categories), Currency (USD, GBP, Euro), EndDay (Monday-Sunday), and Duration (1, 3, 5, 7, or 10 days).
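A minimal sketch of the dummy-variable step, assuming pandas. The rows below are made up for illustration and stand in for eBayAuctions.csv; the column names (`currency`, `endDay`, `OpenPrice`, `Competitive?`) are assumptions about the CSV header and may need adjusting.

```python
import pandas as pd

# Made-up rows standing in for eBayAuctions.csv (illustrative values only).
# Column names are assumptions about the real file's header.
auctions = pd.DataFrame({
    "Category": ["Books", "Automotive", "Books", "Computer"],
    "currency": ["USD", "GBP", "Euro", "USD"],
    "Duration": [5, 7, 10, 1],
    "endDay": ["Mon", "Sun", "Wed", "Fri"],
    "OpenPrice": [0.01, 9.99, 4.50, 20.00],
    "Competitive?": [1, 0, 1, 0],
})

# Duration is stored as a number but is treated as categorical in this exercise.
auctions["Duration"] = auctions["Duration"].astype("category")

# drop_first=True drops one reference level per variable so the dummies
# are not perfectly collinear with the model intercept.
predictors = pd.get_dummies(
    auctions.drop(columns=["Competitive?"]),
    columns=["Category", "currency", "Duration", "endDay"],
    drop_first=True,
)
print(predictors.columns.tolist())
```

With the real file, the same call would presumably follow `pd.read_csv("eBayAuctions.csv")`.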
(a) Create pivot tables for the mean of the binary outcome (Competitive?) as a function of the various categorical variables (use the original variables, not the dummies). Use the information in the tables to reduce the number of dummies that will be used in the model. For example, categories that appear most similar with respect to the distribution of competitive auctions could be combined. After you create pivot tables, combine the following categories to reduce the number of dummy variables for logistic regression:
"Sun", "Wed", and "Fri" for "endDay".
"Business/Industrial", "Computer", and "Home/Garden" for "Category".
"Antique/Art/Craft" and "Collectibles" for "Category".
"Automotive" and "Pottery/Glass" for "Category".
"Books" and "Clothing/Accessories" for "Category".
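One way part (a) could be sketched with pandas is shown here. The rows are made up, the column names are assumptions about the CSV header, and the merged labels (for example `Sun_Wed_Fri`) are hypothetical names chosen only for this illustration.

```python
import pandas as pd

# Made-up rows standing in for eBayAuctions.csv (illustrative values only).
auctions = pd.DataFrame({
    "Category": ["Books", "Clothing/Accessories", "Automotive", "Pottery/Glass",
                 "Computer", "Home/Garden", "Business/Industrial",
                 "Collectibles", "Antique/Art/Craft", "Books"],
    "endDay": ["Mon", "Sun", "Wed", "Fri", "Tue", "Sat", "Thu", "Sun", "Wed", "Mon"],
    "Competitive?": [1, 0, 1, 0, 1, 1, 0, 1, 0, 0],
})

# The mean of a 0/1 outcome per level is the share of competitive auctions.
category_rates = auctions.pivot_table(index="Category", values="Competitive?", aggfunc="mean")
day_rates = auctions.pivot_table(index="endDay", values="Competitive?", aggfunc="mean")
print(category_rates)
print(day_rates)

# Merge the levels listed in the question; the new labels are hypothetical.
category_groups = {
    "Business/Industrial": "Business_Computer_Home",
    "Computer": "Business_Computer_Home",
    "Home/Garden": "Business_Computer_Home",
    "Antique/Art/Craft": "Antique_Collectibles",
    "Collectibles": "Antique_Collectibles",
    "Automotive": "Automotive_Pottery",
    "Pottery/Glass": "Automotive_Pottery",
    "Books": "Books_Clothing",
    "Clothing/Accessories": "Books_Clothing",
}
day_groups = {"Sun": "Sun_Wed_Fri", "Wed": "Sun_Wed_Fri", "Fri": "Sun_Wed_Fri"}

auctions["Category"] = auctions["Category"].replace(category_groups)
auctions["endDay"] = auctions["endDay"].replace(day_groups)
print(auctions["Category"].value_counts())
```

Levels not named in a mapping are left unchanged, so the remaining categories and days keep their own dummies.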
(b) Split the data into training (60%) and validation (40%) datasets. Using Statsmodels and LogisticRegression(penalty="l2", C=1e42, solver='liblinear', tol=1e-28), fit a logit model (i.e., logistic regression) with all predictors, and classify auctions using a cutoff probability of 0.5.
(c) Interpret the meaning of the coefficient for closing price and quantify the effect of closing price using odds. Does closing price have practical significance? Is it statistically significant for predicting competitiveness of auctions? (Use a 10% significance level.)
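A rough sketch of the split-and-fit step in part (b), using scikit-learn with the settings quoted in the question. The data here are randomly generated purely so the example runs on its own; the column names, the seed, and the coefficients used to simulate the outcome are all assumptions, not values from eBayAuctions.csv.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed predictors (not real auction data).
rng = np.random.default_rng(1)
n = 500
X = pd.DataFrame({
    "OpenPrice": rng.uniform(1, 50, n),
    "ClosePrice": rng.uniform(1, 100, n),
})
true_prob = 1 / (1 + np.exp(-(-1.0 + 0.05 * X["ClosePrice"] - 0.04 * X["OpenPrice"])))
y = (true_prob > rng.uniform(size=n)).astype(int)

# 60% training, 40% validation.
train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1
)

# A very large C makes the L2 penalty negligible, so this behaves
# like an unregularized logistic regression.
model = LogisticRegression(penalty="l2", C=1e42, solver="liblinear", tol=1e-28)
model.fit(train_X, train_y)

# Classify with an explicit cutoff probability of 0.5.
valid_prob = model.predict_proba(valid_X)[:, 1]
valid_pred = (valid_prob >= 0.5).astype(int)
accuracy = accuracy_score(valid_y, valid_pred)
print(f"validation accuracy: {accuracy:.3f}")
```

The very tight `tol` may trigger a convergence warning from liblinear; the fitted model is still usable for this comparison.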
(d) If we want to predict at the start of an auction whether it will be competitive, we cannot use the information on the closing price. Using Statsmodels and LogisticRegression(penalty="l2", C=1e42, solver='liblinear', tol=1e-28), fit a logit model with all predictors as above, excluding closing price. How does this model compare to the full model with respect to predictive accuracy?
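The comparison in part (d) could be set up along these lines: fit the same model twice, once with and once without the closing price, and compare validation accuracy. The data are simulated and the helper `validation_accuracy` is a hypothetical name introduced only for this sketch.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not real auction data).
rng = np.random.default_rng(1)
n = 500
X = pd.DataFrame({
    "OpenPrice": rng.uniform(1, 50, n),
    "ClosePrice": rng.uniform(1, 100, n),
})
true_prob = 1 / (1 + np.exp(-(-1.0 + 0.05 * X["ClosePrice"] - 0.04 * X["OpenPrice"])))
y = (true_prob > rng.uniform(size=n)).astype(int)

train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1
)


def validation_accuracy(columns):
    """Fit on the given training columns and score on the validation set."""
    model = LogisticRegression(penalty="l2", C=1e42, solver="liblinear", tol=1e-28)
    model.fit(train_X[columns], train_y)
    prob = model.predict_proba(valid_X[columns])[:, 1]
    return accuracy_score(valid_y, (prob >= 0.5).astype(int))


full_columns = list(X.columns)
reduced_columns = [c for c in full_columns if c != "ClosePrice"]

full_accuracy = validation_accuracy(full_columns)
reduced_accuracy = validation_accuracy(reduced_columns)
print(f"with ClosePrice:    {full_accuracy:.3f}")
print(f"without ClosePrice: {reduced_accuracy:.3f}")
```

Both models are scored on the same validation rows, so any gap in accuracy reflects the information carried by the closing price rather than a different split.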
(e) Fit a regularized logit model with L1 penalty on the training data using the sklearn function LogisticRegressionCV(). Compare its selected predictors and classification performance to the model in (d).
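A possible sketch for part (e), assuming scikit-learn's `LogisticRegressionCV` with the liblinear solver, which supports the L1 penalty. The data are simulated, the `Noise` column is a deliberately irrelevant predictor added for illustration, and the 5-fold setting is an arbitrary choice. Predictors whose coefficient is shrunk to exactly zero are treated as dropped.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data without closing price, as in part (d).
rng = np.random.default_rng(1)
n = 500
X = pd.DataFrame({
    "OpenPrice": rng.uniform(1, 50, n),
    "sellerRating": rng.uniform(0, 10, n),
    "Noise": rng.normal(size=n),  # unrelated to the outcome by construction
})
true_prob = 1 / (1 + np.exp(-(1.0 - 0.06 * X["OpenPrice"] + 0.1 * X["sellerRating"])))
y = (true_prob > rng.uniform(size=n)).astype(int)

train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1
)

# Cross-validation picks the regularization strength C on the training data.
l1_model = LogisticRegressionCV(penalty="l1", solver="liblinear", cv=5)
l1_model.fit(train_X, train_y)

# Predictors with a nonzero coefficient are the ones the L1 penalty kept.
coefficients = pd.Series(l1_model.coef_[0], index=X.columns)
selected = coefficients[coefficients != 0].index.tolist()

l1_accuracy = accuracy_score(valid_y, l1_model.predict(valid_X))
print("selected predictors:", selected)
print(f"validation accuracy: {l1_accuracy:.3f}")
```

Because the L1 penalty depends on the scale of each predictor, standardizing the columns before fitting is worth considering on the real data.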
(f) What auction settings set by the seller (duration, opening price, ending day, currency) would you recommend as being most likely to lead to a competitive auction?
