Competitive Auctions on eBay.com

Question:

Competitive Auctions on eBay.com. The file eBayAuctions.csv contains information on 1972 auctions transacted on eBay.com during May–June 2004. The goal is to use these data to build a model that will distinguish competitive auctions from noncompetitive ones. A competitive auction is defined as an auction with at least two bids placed on the item being auctioned. The data include variables that describe the item (auction category), the seller (his or her eBay rating), and the auction terms that the seller selected (auction duration, opening price, currency, day of week of auction close). In addition, we have the price at which the auction closed. The goal is to predict whether or not an auction of interest will be competitive.

Data Preprocessing. Ensure that the categorical predictors are of Polynominal type: Category (18 categories), Currency (USD, GBP, Euro), EndDay (Monday–Sunday), and Duration (1, 3, 5, 7, or 10 days). (Note: You can choose to leave them as nominal attributes, since the Logistic Regression modeling operator in RapidMiner internally converts them into dummies, using the first category that occurs in the data for each categorical attribute as its comparison group.)

a. Create pivot tables for the mean of the binary target attribute (Competitive?) as a function of the various categorical attributes (use the original attributes, not the dummies; also use the target attribute as the original 0/1 numerical attribute). Use the information in the pivot tables to reduce the number of categories that will be used in the model for each categorical attribute. For example, categories that appear most similar with respect to the distribution of competitive auctions could be combined. Remember that new attributes can be created in RapidMiner with the Generate Attributes operator.
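For readers working outside RapidMiner, the pivot-table step in part (a) can be sketched in plain Python. This is only an illustration: the helper names mean_by_category and merge_similar, the 0.05 tolerance, and the toy rows are assumptions made here, not part of the exercise or of eBayAuctions.csv.

```python
from collections import defaultdict

def mean_by_category(rows, category_key, target_key):
    """Return {category: mean of the 0/1 target} for one categorical attribute."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[category_key]] += float(row[target_key])
        counts[row[category_key]] += 1
    return {cat: totals[cat] / counts[cat] for cat in totals}

def merge_similar(means, tolerance):
    """Group categories whose means lie within `tolerance` of the
    lowest mean in the current group (a simple, assumed merging rule)."""
    groups = []
    for cat, mean in sorted(means.items(), key=lambda kv: kv[1]):
        if groups and mean - groups[-1]["start"] <= tolerance:
            groups[-1]["members"].append(cat)
        else:
            groups.append({"start": mean, "members": [cat]})
    return [g["members"] for g in groups]

# Toy rows invented for illustration only (not taken from the data file).
toy_rows = [
    {"EndDay": "Mon", "Competitive?": 1},
    {"EndDay": "Mon", "Competitive?": 1},
    {"EndDay": "Tue", "Competitive?": 1},
    {"EndDay": "Tue", "Competitive?": 0},
    {"EndDay": "Sat", "Competitive?": 0},
    {"EndDay": "Sat", "Competitive?": 1},
]

day_means = mean_by_category(toy_rows, "EndDay", "Competitive?")
day_groups = merge_similar(day_means, tolerance=0.05)
```

Each inner list in day_groups is a candidate merged category. How close two means must be before merging is a judgment call; the tolerance here is arbitrary.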

b. Split the data into training (60%) and holdout (40%) datasets. Run a logistic model with all predictors with a threshold of 0.5.
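The mechanics of part (b) can be sketched without RapidMiner as follows. This is a minimal illustration under stated assumptions: the function names, the seed, the learning rate, and the toy data are all invented here, and the fit uses plain gradient descent, which is not necessarily the optimizer RapidMiner uses.

```python
import math
import random

def split_rows(rows, train_fraction=0.6, seed=1):
    """Shuffle a copy of the rows and cut it into (training, holdout)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (intercept, weights)."""
    n, p = len(X), len(X[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            z = intercept + sum(w * x for w, x in zip(weights, xi))
            err = sigmoid(z) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        intercept -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return intercept, weights

def predict(intercept, weights, X, threshold=0.5):
    """Label a record 1 when its predicted probability reaches the threshold."""
    labels = []
    for xi in X:
        z = intercept + sum(w * x for w, x in zip(weights, xi))
        labels.append(1 if sigmoid(z) >= threshold else 0)
    return labels

# Toy data invented for illustration: one numeric predictor, binary target.
toy_X = [[-2.0], [-1.0], [1.0], [2.0]]
toy_y = [0, 0, 1, 1]
intercept, weights = fit_logistic(toy_X, toy_y)
labels = predict(intercept, weights, [[-1.5], [1.5]])

# A 60/40 split of ten placeholder row ids.
train, holdout = split_rows(list(range(10)))
```

With the real data, the model would be fitted on the training rows only and its predictions compared with the actual labels on the holdout rows.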

c. If we want to predict at the start of an auction whether it will be competitive, we cannot use the information on the closing price. Run a logistic model with all predictors as above, excluding closing price. How does this model compare with the full model with respect to predictive accuracy?

d. Interpret the meaning of the coefficient for closing price. Does closing price have a practical significance? Is it statistically significant for predicting competitiveness of auctions? (Use a 10% significance level.)
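Two standard calculations help with part (d): converting a logistic coefficient into an odds multiplier, and a Wald test for statistical significance. The sketch below uses hypothetical numbers (example_beta and the 10-unit price change are made up) and should not be read as the fitted values for this dataset.

```python
import math

def odds_multiplier(coefficient, delta=1.0):
    """Factor by which the odds of the positive class are multiplied when the
    predictor rises by `delta`, holding the other predictors fixed."""
    return math.exp(coefficient * delta)

def wald_p_value(coefficient, std_error):
    """Two-sided p-value of the Wald z-test for H0: coefficient = 0."""
    z = coefficient / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical coefficient, NOT an estimate from eBayAuctions.csv.
example_beta = 0.1
multiplier = odds_multiplier(example_beta, delta=10.0)

# At the 10% level stated in the question, a predictor is statistically
# significant when its p-value is below 0.10.
example_p = wald_p_value(1.0, 1.0)
is_significant = example_p < 0.10
```

Practical significance is a separate question from the p-value: it depends on whether the odds multiplier over a realistic price change is large enough to matter.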

e. Use regularized logistic regression with L1 penalty on the training data. Compare its selected predictors and classification performance with that of the complete model (excluding closing price).
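To show how an L1 penalty selects predictors in part (e), here is a small sketch using proximal gradient descent (a gradient step followed by soft-thresholding). The algorithm choice, the function names, the penalty values, and the toy data are assumptions made for illustration; RapidMiner's implementation may differ.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, amount):
    """Shrink `value` toward zero by `amount`; small values become exactly 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0

def fit_l1_logistic(X, y, penalty, learning_rate=0.1, epochs=2000):
    """Proximal gradient descent for logistic regression with an L1 penalty
    on the weights (the intercept is left unpenalized)."""
    n, p = len(X), len(X[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            z = intercept + sum(w * x for w, x in zip(weights, xi))
            err = sigmoid(z) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        intercept -= learning_rate * grad_b / n
        weights = [
            soft_threshold(w - learning_rate * g / n, learning_rate * penalty)
            for w, g in zip(weights, grad_w)
        ]
    return intercept, weights

def selected(weights):
    """Indices of predictors whose weight was not shrunk to zero."""
    return [j for j, w in enumerate(weights) if w != 0.0]

# Toy data invented for illustration: column 0 separates the classes,
# column 1 is constructed to carry no information about the target.
toy_X = [[-2.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [2.0, 1.0]]
toy_y = [0, 0, 1, 1]

_, mild = fit_l1_logistic(toy_X, toy_y, penalty=0.1)
_, heavy = fit_l1_logistic(toy_X, toy_y, penalty=1.0)
```

On this toy data the mild penalty keeps only the informative column and the heavy penalty drops every predictor, which is the trade-off to examine when comparing the regularized model with the complete one.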

f. If the major objective is accurate classification, what threshold value should be used?
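One way to explore part (f) empirically is to sweep candidate thresholds and compare accuracy. The sketch below is illustrative only: the probabilities, labels, and candidate grid are invented, and the result on this toy data says nothing about the best threshold for the eBay model.

```python
def accuracy(probabilities, labels, threshold):
    """Share of records whose thresholded prediction matches the actual label."""
    correct = sum(
        (p >= threshold) == (y == 1) for p, y in zip(probabilities, labels)
    )
    return correct / len(labels)

def best_threshold(probabilities, labels, candidates):
    """Candidate threshold with the highest accuracy (first one wins ties)."""
    return max(candidates, key=lambda t: accuracy(probabilities, labels, t))

# Toy predicted probabilities and actual labels, invented for illustration.
toy_probs = [0.10, 0.30, 0.52, 0.64, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 1, 1]
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]

best = best_threshold(toy_probs, toy_labels, candidates)
accuracy_at_default = accuracy(toy_probs, toy_labels, 0.5)
accuracy_at_best = accuracy(toy_probs, toy_labels, best)
```

A threshold tuned this way should be chosen on data other than the holdout set used for the final accuracy estimate, otherwise the reported accuracy will be optimistic.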

g. Based on these data, what auction settings set by the seller (duration, opening price, ending day, currency) would you recommend as being most likely to lead to a competitive auction?


Related Book:

Machine Learning For Business Analytics

ISBN: 9781119828792

1st Edition

Authors: Galit Shmueli, Peter C. Bruce, Amit V. Deokar, Nitin R. Patel
