Question:

Approximately 50,000 new titles, including new editions, are published each year in the United States, giving rise to a $25 billion industry in 2001. In terms of percentage of sales, this industry may be segmented as follows:

[Table: book industry segments by percentage of sales; not reproduced in this transcription]
Book retailing in the United States in the 1970s was characterized by the growth of bookstore chains located in shopping malls. The 1980s saw increased purchases in bookstores stimulated through the widespread practice of discounting. By the 1990s, the superstore concept of book retailing gained acceptance and contributed to double-digit growth of the book industry. Conveniently situated near large shopping centers, superstores maintain large inventories of 30,000–80,000 titles and employ well-informed sales personnel.
Book retailing changed fundamentally with the arrival of Amazon, which started out as an online bookseller and, as of 2015, was the world’s largest online retailer of any kind. Amazon’s margins were small and the convenience factor high, putting intense competitive pressure on all other book retailers. Borders, one of the two major superstore chains, discontinued operations in 2011.
Subscription-based book clubs offer an alternative model that has persisted, though it too has suffered from the dominance of Amazon.
Historically, book clubs offered their readers different types of membership programs. Two common membership programs are the continuity and negative option programs, which are both extended contractual relationships between the club and its members. Under a continuity program, a reader signs up by accepting an offer of several books for just a few dollars (plus shipping and handling) and an agreement to receive a shipment of one or two books each month thereafter at more-standard pricing. The continuity program is most common in the children’s book market, where parents are willing to delegate the rights to the book club to make a selection, and much of the club’s prestige depends on the quality of its selections.
In a negative option program, readers get to select how many and which additional books they would like to receive. However, the club’s selection of the month is delivered to them automatically unless they specifically mark “no” on their order form by a deadline date. Negative option programs sometimes result in customer dissatisfaction and always give rise to significant mailing and processing costs.
In an attempt to combat these trends, some book clubs have begun to offer books on a positive option basis but only to specific segments of their customer base that are likely to be receptive to specific offers. Rather than expanding the volume and coverage of mailings, some book clubs are beginning to use database-marketing techniques to target customers more accurately. Information contained in their databases is used to identify who is most likely to be interested in a specific offer. This information enables clubs to design special programs carefully tailored to meet their customer segments’ varying needs.

1. What is the response rate for the training data customers taken as a whole? What is the response rate for each of the 4 × 5 × 3 = 60 combinations of RFM categories? Which combinations have response rates in the training data that are above the overall response rate in the training data?

2. Suppose that we decide to send promotional mail only to the “above-average” RFM combinations identified in part 1. Compute the response rate in the validation data using these combinations.

3. Rework parts 1 and 2 with three segments:
   Segment 1: RFM combinations that have response rates that exceed twice the overall response rate
   Segment 2: RFM combinations that exceed the overall response rate but do not exceed twice that rate
   Segment 3: the remaining RFM combinations
   Draw the lift curve (consisting of three points for these three segments) showing the number of customers in the validation dataset on the x-axis and cumulative number of buyers in the validation dataset on the y-axis.

4. Use the k-NN approach to classify cases using Florence as the target attribute. Using a 10-fold cross-validation nested in the Optimize Parameters (Grid) operator, find the best k. Remember to normalize all five attributes. Create a lift chart for the best k model for the validation data. How many customers should be targeted in a sample of a size equal to the validation set size? 

5. The k-NN prediction algorithm gives a numerical value, which is a weighted average of the values of the Florence attribute for the k nearest neighbors, with weights that are inversely proportional to distance. Using the best k that you calculated above with k-NN classification, now run a model with k-NN prediction, and compute a lift chart for the validation data.

6. Create lift charts for the validation set, summarizing the results from the three logistic regression models created above. For each model, report how many customers should be targeted in a test set of the same size.
7. If the threshold criterion for a campaign is a 30% likelihood of a purchase, for each of the three logistic regression models, find the customers in the validation data that would be targeted, and count the number of buyers in this set.

8. Based on the above analysis, which model would you select for targeting customers with Florence as the target attribute? Why?
9. Test the “best” model on the holdout set. Create a lift chart, and comment on how this chart compares with the analysis done earlier with the validation set in terms of how many customers to target.
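
The segment construction in parts 1 to 3 can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the operator-based workflow the questions refer to: the record layout `(r_code, f_code, m_code, bought)`, the helper names, and the toy `TRAIN` and `VALID` records are all hypothetical stand-ins for the case data, and sending combinations unseen in training to Segment 3 is an assumption.

```python
from collections import defaultdict

# Hypothetical toy records (r_code, f_code, m_code, bought); NOT the case data.
TRAIN = ([(1, 3, 5, 1)] * 3 + [(1, 3, 5, 0)]
         + [(2, 2, 3, 1)] + [(2, 2, 3, 0)] * 3
         + [(4, 1, 1, 0)] * 12)
VALID = ([(1, 3, 5, 1), (1, 3, 5, 0), (2, 2, 3, 1)]
         + [(2, 2, 3, 0)] * 2 + [(4, 1, 1, 0)] * 5)


def response_rate(rows):
    """Fraction of rows whose last field (the bought flag) is 1."""
    return sum(r[-1] for r in rows) / len(rows) if rows else 0.0


def combo_rates(rows):
    """Response rate for each (R, F, M) combination present in rows."""
    counts = defaultdict(lambda: [0, 0])  # combo -> [buyers, customers]
    for r, f, m, bought in rows:
        counts[(r, f, m)][0] += bought
        counts[(r, f, m)][1] += 1
    return {combo: b / n for combo, (b, n) in counts.items()}


def assign_segment(rate, overall):
    """Segment 1: above twice overall; 2: above overall; 3: the rest."""
    if rate > 2 * overall:
        return 1
    if rate > overall:
        return 2
    return 3


def lift_points(train, valid):
    """Cumulative (customers, buyers) in valid after segments 1, 2, 3."""
    overall = response_rate(train)
    seg_of = {c: assign_segment(rate, overall)
              for c, rate in combo_rates(train).items()}
    totals = {1: [0, 0], 2: [0, 0], 3: [0, 0]}  # segment -> [customers, buyers]
    for r, f, m, bought in valid:
        seg = seg_of.get((r, f, m), 3)  # assumption: unseen combos go last
        totals[seg][0] += 1
        totals[seg][1] += bought
    points, cum_n, cum_b = [], 0, 0
    for seg in (1, 2, 3):
        cum_n += totals[seg][0]
        cum_b += totals[seg][1]
        points.append((cum_n, cum_b))
    return points


if __name__ == "__main__":
    print(response_rate(TRAIN))
    print(lift_points(TRAIN, VALID))
```

On the toy records the overall training response rate is 0.2 and the three lift points are (2, 1), (5, 2), and (10, 2). The second point also answers the part 2 style question for the toy data: mailing only the above-average combinations reaches 5 validation customers, 2 of whom buy.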

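The k-NN prediction described in part 5 (a weighted average of the neighbors' target values, with weights inversely proportional to distance) can be sketched as follows. This is a hedged illustration rather than the tool's implementation: the function names, the min-max normalization choice, the small `eps` guard against division by zero, and the two-feature toy data are all assumptions, and the real feature columns are not shown here.

```python
import math


def min_max_fit(rows):
    """Learn per-column (min, max) from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def min_max_apply(row, ranges):
    """Rescale one row to [0, 1] per column using the training ranges."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)]


def knn_score(query, train_x, train_y, k, eps=1e-9):
    """Inverse-distance-weighted mean of the k nearest training targets.

    eps is an assumed guard so an exact match does not divide by zero.
    """
    nearest = sorted(
        (math.dist(query, x), y) for x, y in zip(train_x, train_y)
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Hypothetical toy data: two made-up features and a 0/1 target.
RAW_X = [(0, 0), (10, 0), (0, 10), (10, 10)]
Y = [0, 0, 1, 1]

RANGES = min_max_fit(RAW_X)
TRAIN_X = [min_max_apply(x, RANGES) for x in RAW_X]

if __name__ == "__main__":
    query = min_max_apply((0, 8), RANGES)
    print(knn_score(query, TRAIN_X, Y, k=2))
```

For the toy query `(0, 8)` with `k=2`, the nearer neighbor has target 1 and the farther one has target 0, so the score comes out at approximately 0.8. Sorting validation customers by such scores in descending order gives the ordering needed for a lift chart, and a cutoff on the score selects the customers to target.
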
Related Book:

Machine Learning For Business Analytics

ISBN: 9781119828792

1st Edition

Authors: Galit Shmueli, Peter C. Bruce, Amit V. Deokar, Nitin R. Patel