Question: How can I Use R code to solve this . The Mailing Experiment Tayko has supplied its customer list of 2 0 0 , 0

How can I Use R code to solve this
. The Mailing Experiment
Taykohas supplied its customer list of 200,000 names to the pool, which totals over 5 million names, so it is now entitled to draw 200,000 names for a mailing.Taykowould like to select the names that have the best chance of performing well, so it first conducts atest prior to withdrawing its full allotment of names itdraws 20,000 names from the pool and does a test mailing of the new catalog.
This mailing yielded 1,065 purchasers, a response rate of 0.053, or 5.3%. To optimize the performance of the data mining techniques, it was decided to work with a stratified sample that contained equal numbers of purchasers andnonpurchasers. For ease of presentation, the dataset for this case includes just 1,000 purchasers and 1,000nonpurchasers, an apparent response rate of 0.5, or 50%. Therefore, after using the dataset to predict who will be a purchaser, we must adjust the purchase rate back down by multiplying each case's probability of purchase by 0.053/0.5, or 0.107.
Data
There are two response variables in this case.Purchase indicates whether or not a prospect responded to the test mailing and purchased something.Spendingindicates, for those who made a purchase, how much they spent.
The overall procedure in this case will be to develop two models. The first model will be used to classify records as Purchase or No purchase. This model will be used to identify the most likely purchasers in the consortiums database. The second model will be used for those cases that are classified as having made a purchaseand will predict the amount purchasers spend. By identifying the most likely responders and the highest spenders Tayko hopes to identify names from the consortiums pool that will increase profit above a random draw of names.
Table 21.7 in the textbook provides a description of the variables in the Tayko dataset.
A catalog costs $2 to mail, has response rate of 0.053 or 5.3%, and the average spend you can determine based on purchasers in a test database. How do I estimate the gross profit that a company could expect from the remaining names if selected them randomly from the pool, using R Studio?
Assignment Details
1. Each catalog costs $2 to mail (including printing, postage, and mailing costs).
a. Based on this cost, the response rate from the test, and the average spend you can determine based on the purchasers in the dataset, estimate the gross profit that the firm could expect from the remaining 180,000 names if it selected them randomly from the pool.
2. The first model you will develop is a model for classifying a customer as a purchaser ornonpurchaser. This will identify the most likely responders to a mailing.
a. Partition the data randomly into a training set (800 records),validation set (700 records), andtest set (500 records).
b.Run a logistic regression model on the training set using backward elimination to select the best subset of variables. Exclude Spending as it is a dependent variable.
c. Assess your model results by using confusion matrices for the training and validation data, and the decile lift chart on the validation data. Do you feel this is an adequate model to move forward? Support your assessment with specific accuracy and lift information.
3. The second model you will develop is a model for predicting spending among the purchasers. This will identify the highest spenders.
a. Create a vector of IDs for only the purchaser records (Purchase =1).
b. Partition this dataset into the training and validations records. (Use the same training/validation labels from the earlier partitioning; one way is to use function intersect() to find IDs of purchasers in the original partitions).
c.Develop models for predicting spending, using:
i.Multiple linear regression using backward selection
ii.Regression trees (with default parameters which will prune tree)
d. Choose one model on the basis of its performance on the validation data. What is
your rationale for choosing the model you chose?

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!