Refer to the scenario in Problem 52 regarding the identification of movies that win the Best Picture

Question:

Refer to the scenario in Problem 52 regarding the identification of movies that win the Best Picture Oscar. Apply k-nearest neighbors to classify observations as winning best picture or not by using Winner as the target (or response) variable. Use 100% of the data for training and validation (do not use any data as a test set).

a. Based on all the input variables, determine the value of k that maximizes the AUC in a validation procedure.

b. Note that each year there is only one winner of the Best Picture Oscar. Knowing this, examine the confusion matrix and explain what is wrong with classifying a movie based on a cutoff value?

c. What is the best way to use the model to predict the annual winner?

Problem 52

Each year, the American Academy of Motion Picture Arts and Sciences recognizes excellence in the film industry by honoring directors, actors, and writers with awards (called “Oscars”) in different categories. The most notable of these awards is the Oscar for Best Picture. Data has been collected on a sample of movies nominated for the Best Picture Oscar. The variables include total number of Oscar nominations across all award categories, number of Golden Globe awards won (the Golden Globe award show precedes the Academy Awards), the genre of the movie, and whether or not the movie won the Best Picture Oscar award.

image text in transcribed

Apply logistic regression with lasso regularization to classify winners of the Best Picture Oscar by using Winner as the target (or response) variable. Use 100% of the data for training and validation (do not use any data as a test set).

Fantastic news! We've Found the answer you've been seeking!

Step by Step Answer:

Related Book For  book-img-for-question

Business Analytics

ISBN: 9780357902219

5th Edition

Authors: Jeffrey D. Camm, James J. Cochran, Michael J. Fry, Jeffrey W. Ohlmann

Question Posted: