Question: R code only - The file eBayAuctions.csv contains information on 1972 auctions conducted on eBay.com between May and June 2004

R code only - The file eBayAuctions.csv contains information on 1972 auctions conducted on eBay.com between May and June 2004. The objective is to use this dataset to build a model that distinguishes competitive auctions from non-competitive ones. A competitive auction is defined as an auction where at least two bids are placed on the auctioned item. The dataset includes variables describing the auctioned item (auction category), the seller (their eBay rating), and the seller-selected auction terms (auction duration, opening price, currency, day of week of auction closure). The dataset also includes the closing price of each auction.

Notes: The original variables Category (11 categories), Currency (USD, not USD), and EndDay (Weekend, Weekdays) are categorical, so the dataset includes corresponding dummy variables for each category. There are only 10 dummy variables for Category, to avoid multicollinearity: the 11th category is implied when none of the 10 dummies is set.

1. Import the dataset. Remove the original variables Category, Currency, and EndDay from the imported dataset because we already have their corresponding dummy variables. (1 point)
2. Split the data into training and validation datasets using a 60%-40% ratio. (1 point)
3. Fit a classification tree. Use Competitive as the target variable and the rest of the variables as predictors. (As mentioned in the notes, you don't have to exclude one dummy variable from each dummy group for a categorical variable.) To avoid overfitting, set maxdepth=6.
   a. Report the tree - plot the tree and copy and paste the resulting diagram. You don't have to care too much about the aesthetics of the diagram. (1 point)
   b. List the decision rules. For example, if variable1<0 AND variable2<2, class=0. (0.5 points)
   c. Report the prediction confusion matrix of the validation data. (0.5 points)
   d. List the predictors used by the tree. (0.5 points)
4. Are the rules practical for predicting the outcome of a new auction? Explain why. (Hint: Can you use the rules to classify a new auction before the auction ends? In other words, do you have all the necessary predictor values before the auction ends? Some of them may not be known before the end of the auction. What are those variables?) In short, which variables should NOT be included in the predictor set? (0.5 points) Explain why. (0.5 points)
5. Fit another classification tree using the same settings as in question 3. This time, use only the predictors that are available for predicting the outcome of a new auction before the auction ends.
   a. Report the tree - plot the tree and copy and paste the resulting diagram. You don't have to care too much about the aesthetics of the diagram. (1 point)
   b. List the decision rules. For example, if variable1<0 AND variable2<2, class=0. (0.5 points)
   c. Report the prediction confusion matrix of the validation data. (0.5 points)
   d. List the predictors used by the tree. (0.5 points)
6. Compare the overall performance (e.g., accuracy or error rates) of the two decision trees (from Q3 and Q5). Which model has better predictive performance? (1 point) Explain why.
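
One possible way to approach the steps above in R is sketched below, using the rpart package. This is a rough outline under stated assumptions, not a verified solution: the column names (Category, Currency, EndDay, Competitive, and especially ClosePrice), the random seed, and all helper function names are assumptions for illustration and should be checked against the actual header of eBayAuctions.csv. For question 4, the closing price is the natural candidate to exclude, since it is only known once the auction has ended; whether other columns fall into the same group depends on the file.

```r
# Sketch only. Column names, the seed, and the helper names are assumptions;
# adjust them to match the actual CSV before relying on the output.
library(rpart)  # recommended package, shipped with standard R installations

# Q1: drop the original categorical columns and keep their dummy variables.
drop_originals <- function(df, cols = c("Category", "Currency", "EndDay")) {
  df[, !(names(df) %in% cols), drop = FALSE]
}

# Q2: random 60%/40% split into training and validation sets.
split_data <- function(df, train_frac = 0.6, seed = 1) {
  set.seed(seed)
  idx <- sample(nrow(df), size = floor(train_frac * nrow(df)))
  list(train = df[idx, , drop = FALSE], valid = df[-idx, , drop = FALSE])
}

# Q3 / Q5: classification tree with maxdepth = 6.
# `exclude` names predictors to leave out (e.g. ones unknown before the close).
fit_tree <- function(train, target = "Competitive", exclude = character(0)) {
  train <- train[, !(names(train) %in% exclude), drop = FALSE]
  train[[target]] <- factor(train[[target]])
  f <- reformulate(".", response = target)  # builds: Competitive ~ .
  rpart(f, data = train, method = "class",
        control = rpart.control(maxdepth = 6))
}

# Q3c / Q5c / Q6: validation confusion matrix and overall accuracy.
evaluate_tree <- function(tree, valid, target = "Competitive") {
  pred <- predict(tree, newdata = valid, type = "class")
  list(
    confusion = table(Actual = valid[[target]], Predicted = pred),
    accuracy  = mean(as.character(pred) == as.character(valid[[target]]))
  )
}

# Q3d / Q5d: predictors that actually appear in at least one split.
used_predictors <- function(tree) {
  setdiff(unique(as.character(tree$frame$var)), "<leaf>")
}

# Runs only when the data file is present in the working directory.
if (file.exists("eBayAuctions.csv")) {
  auctions <- drop_originals(read.csv("eBayAuctions.csv"))
  parts <- split_data(auctions)

  tree_all <- fit_tree(parts$train)                          # Q3
  tree_pre <- fit_tree(parts$train, exclude = "ClosePrice")  # Q5 (assumed name)

  for (tree in list(tree_all, tree_pre)) {
    plot(tree, uniform = TRUE, margin = 0.1)                 # (a) tree diagram
    text(tree, use.n = TRUE)
    leaves <- as.numeric(rownames(tree$frame)[tree$frame$var == "<leaf>"])
    path.rpart(tree, nodes = leaves)                         # (b) rule per leaf
    print(evaluate_tree(tree, parts$valid))                  # (c) and Q6
    print(used_predictors(tree))                             # (d)
  }
}
```

Comparing the two `accuracy` values from `evaluate_tree()` gives the overall performance comparison asked for in question 6. A tree that is allowed to use the closing price may well score higher on the validation set, but that by itself would not make it usable for a new auction, because the closing price is not available at prediction time.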
