
Question:

eBay Auctions—Boosting and Bagging. The file eBayAuctions.csv contains information on 1972 auctions that transacted on eBay.com during May–June 2004. The goal is to use these data to build predictive models that will classify auctions as competitive or noncompetitive. A competitive auction is defined as an auction with at least two bids placed on the item auctioned. The data include attributes that describe the item (auction category), the seller (his/her eBay rating), and the auction terms that the seller selected (auction duration, opening price, currency, day of week of auction close). In addition, we have the price at which the auction closed. The task is to predict whether or not the auction will be competitive.

Data Preprocessing. Only those attributes that can be used for predicting the outcome of a new auction should be considered. Convert attribute Duration into a categorical attribute (nominal type). Convert attribute Competitive? to binominal type, and assign it the appropriate role in RapidMiner. Consider mapping class labels to meaningful labels using the Map operator as well. Split the data into training (60%) and validation (40%) datasets.
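Outside RapidMiner, the preprocessing steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the column names (`ClosePrice`, `Duration`, `Competitive?` and the rest) and the five sample rows are made up for the example rather than read from the real eBayAuctions.csv, and the closing price is assumed to be the attribute to exclude because it is known only after an auction ends.

```python
import csv
import io
import random

# Tiny in-memory stand-in for eBayAuctions.csv (values and column names are assumed).
SAMPLE = """Category,currency,sellerRating,Duration,endDay,ClosePrice,OpenPrice,Competitive?
Music/Movie/Game,US,3249,5,Mon,0.01,0.01,0
Automotive,US,37,7,Sat,120.00,99.99,1
Books,GBP,512,10,Sun,7.50,1.00,1
Toys/Hobbies,EUR,88,3,Fri,15.00,15.00,0
Collectibles,US,1450,7,Tue,42.00,9.99,1
"""


def preprocess(rows):
    """Drop the closing price, keep Duration as a label, and relabel the target."""
    cleaned = []
    for row in rows:
        record = dict(row)
        record.pop("ClosePrice", None)  # not available for a new auction
        record["Duration"] = str(record["Duration"])  # nominal category, not a number
        record["Competitive?"] = (
            "competitive" if record["Competitive?"] == "1" else "noncompetitive"
        )
        cleaned.append(record)
    return cleaned


def split(records, train_fraction=0.6, seed=1):
    """Shuffle with a fixed seed, then cut into training and validation parts."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


records = preprocess(csv.DictReader(io.StringIO(SAMPLE)))
train, valid = split(records)
print(len(train), len(valid))  # prints "3 2" for the five sample rows
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle, and the same 60/40 cut would apply to all 1972 rows.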

a. Run a classification tree, using the default settings of the Decision Tree operator. On the validation set, what is the overall accuracy? What is the lift on the first decile?
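The two metrics asked for in each part can be written out explicitly. The sketch below assumes the usual definitions: accuracy is the share of validation records classified correctly, and first-decile lift is the rate of competitive auctions among the 10% of records with the highest predicted probability, divided by the rate over the whole validation set. The labels and scores are invented for illustration, and RapidMiner's own lift chart may bin records slightly differently.

```python
def accuracy(actual, predicted):
    """Share of records whose predicted class matches the actual class."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def first_decile_lift(actual, scores):
    """Positive rate in the top 10% by score, relative to the overall positive rate."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(0.1 * len(ranked)))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    base_rate = sum(actual) / len(actual)
    return top_rate / base_rate


# Invented validation results: 1 = competitive, 0 = noncompetitive.
actual = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.52,
          0.48, 0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption

print(accuracy(actual, predicted), first_decile_lift(actual, scores))  # prints "0.7 2.0"
```

Here half of the 20 records are competitive and both records in the top decile are competitive, so the lift is 1.0 / 0.5 = 2.0.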

b. Run a boosted tree with the same predictors (use the Bayesian Boosting operator with the Decision Tree operator as the base estimator). For the validation set, what is the overall accuracy? What is the lift on the first decile?
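To make the idea of boosting concrete, here is a minimal sketch of AdaBoost with one-split trees (stumps). Note the substitution: this is AdaBoost, not RapidMiner's Bayesian Boosting operator, whose reweighting scheme differs. It is shown only because it is the most familiar boosting scheme. The ten-point data set and all function names are hypothetical.

```python
import math


def stump_predict(stump, row):
    """A stump predicts `sign` at or below the threshold and `-sign` above it."""
    feature, threshold, sign = stump
    return sign if row[feature] <= threshold else -sign


def fit_weighted_stump(rows, labels, weights):
    """Find the stump with the lowest weighted error; labels are +1 or -1."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            for sign in (1, -1):
                stump = (feature, threshold, sign)
                error = sum(w for w, row, y in zip(weights, rows, labels)
                            if stump_predict(stump, row) != y)
                if best is None or error < best[0]:
                    best = (error, stump)
    return best


def adaboost(rows, labels, n_rounds):
    """Fit stumps in sequence, upweighting records the previous stumps got wrong."""
    weights = [1 / len(rows)] * len(rows)
    model = []
    for _ in range(n_rounds):
        error, stump = fit_weighted_stump(rows, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, stump))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, row))
                   for w, row, y in zip(weights, rows, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, row):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


# Toy one-feature data: +1 = competitive, -1 = noncompetitive (invented).
rows = [[x] for x in range(10)]
labels = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(rows, labels, n_rounds=3)
train_accuracy = sum(predict(model, r) == y for r, y in zip(rows, labels)) / len(rows)
print(train_accuracy)
```

On this toy set no single stump classifies more than 7 of the 10 points correctly, which is why a weighted vote of several reweighted stumps can help.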

c. Run a bagged tree with the same predictors (use the Bagging operator with the Decision Tree operator as the base estimator). For the validation set, what is the overall accuracy? What is the lift on the first decile?

d. Run a random forest (use the Random Forest operator). Compare the bagged tree to the random forest in terms of validation accuracy and lift on the first decile. How are the two methods conceptually different?
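One way to see the conceptual difference is a small sketch in which the only thing that changes between the two ensembles is how many features each tree may consider at a split. Both draw a bootstrap sample per tree and combine trees by majority vote. The sketch uses one-split trees (stumps) and synthetic data to stay short. The function names, the `max_features` parameter, and the data are assumptions for illustration, not RapidMiner's implementation.

```python
import random
from collections import Counter


def fit_stump(rows, labels, features):
    """Pick the (feature, threshold) split with the fewest errors among `features`."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def fit_ensemble(rows, labels, n_trees, max_features, rng):
    """Bagging when max_features equals the feature count; forest-style when smaller."""
    n_features = len(rows[0])
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        features = rng.sample(range(n_features), max_features)  # random feature subset
        stumps.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], features))
    return stumps


def predict(stumps, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in stumps)
    return votes.most_common(1)[0][0]


# Synthetic data: three features, with the label driven mostly by the first one.
rng = random.Random(0)
rows = [[rng.random() for _ in range(3)] for _ in range(60)]
labels = [1 if r[0] + 0.3 * r[1] > 0.65 else 0 for r in rows]

bagged = fit_ensemble(rows, labels, n_trees=25, max_features=3, rng=rng)
forest = fit_ensemble(rows, labels, n_trees=25, max_features=1, rng=rng)

for name, model in (("bagged", bagged), ("forest-style", forest)):
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    used = sorted({stump[0] for stump in model})
    print(name, "training accuracy:", hits / len(rows), "features used:", used)
```

Restricting the candidate features makes the individual trees less alike, which is the decorrelation effect a random forest adds on top of bagging. The printed training accuracy is for illustration only and is not a substitute for the validation metrics the question asks for.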
