How should I set up R code for this question?

The files hotels_train.csv and hotels_test.csv contain data on tens of thousands of hotel stays from a major U.S.-based hotel chain. The goal of this problem is simple: to use linear regression to build a machine-learning model for predicting whether a hotel booking will have children on it. Why would that be important? For an equally simple reason: when booking a hotel stay on a website, parents often enter the reservation exclusively for themselves and forget to include their children on the form. Obviously, the hotel isn't going to turn parents away from their room if they neglected to mention that their children would be staying with them. But not knowing about those children does, at least in the aggregate, prevent the hotel from making accurate forecasts of resource utilization. So if, for example, you could use the other features associated with a booking to forecast that a bunch of kids were going to show up unannounced, you might know to order more chicken nuggets for the restaurant and less tequila for the bar. (Or maybe more tequila, depending on how frazzled the parents who stay at your hotel tend to be.) In any event, as a hotel operator, if you can forecast the arrival of those kids a bit better, you can be just a bit more efficient, operationally speaking. This is an excellent use case for an ML model: a piece of software that can scan the bookings for the week ahead and produce an estimate for how likely each one is to have a "hidden" child on it.

The target variable of interest is children: a dummy variable for whether the booking has children on it. All other variables in the data set can be used to predict the children variable.
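A minimal sketch of loading the data and checking the target might look like the following. It assumes both CSV files sit in the working directory and that children is coded 0/1, as "dummy variable" suggests; the helper name load_hotels is hypothetical.

```r
# Hypothetical helper: read one of the hotel CSVs and sanity-check the target.
load_hotels <- function(path) {
  # stringsAsFactors = TRUE so text columns enter lm() as categorical factors
  df <- read.csv(path, stringsAsFactors = TRUE)
  # Assumption: children is a 0/1 dummy, as the problem statement describes
  stopifnot("children" %in% names(df), all(df$children %in% c(0, 1)))
  df
}

# Usage, assuming the files are in the working directory:
# hotels_train <- load_hotels("hotels_train.csv")
# hotels_test  <- load_hotels("hotels_test.csv")
# mean(hotels_train$children)  # share of training bookings with children
```

Because the target is 0/1, a linear regression on it acts as a linear probability model: each fitted value can be read as a rough estimate of the chance that a booking includes children.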

Please compare the out-of-sample performance (measured using RMSE) of the following four models:

1. a small model that uses only the market_segment, adults, customer_type, and is_repeated_guest variables as features.

2. a big model that uses all the possible predictors except the arrival_date variable (main effects only).

3. a huge model that uses all the possible predictors except the arrival_date variable, along with all their possible pairwise interactions.

4. the big model (model 2 on this list), with one additional "engineered" feature: the month of the year, based on the arrival_date variable. (Remember our use of the lubridate package in R to do this kind of feature engineering with dates.)

Use the data in hotels_train.csv to fit the models.
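The four models listed above could be set up roughly as follows. This is a sketch, not a confirmed solution: it assumes arrival_date is stored as "YYYY-MM-DD" text, and the names f_small, f_big, f_huge, add_month, and arrival_month are illustrative choices rather than anything given in the assignment.

```r
library(lubridate)

# Model 1: small model, the four features named in the problem statement
f_small <- children ~ market_segment + adults + customer_type + is_repeated_guest

# Model 2: every predictor except arrival_date, main effects only
f_big <- children ~ . - arrival_date

# Model 3: the same predictors plus all pairwise interactions
f_huge <- children ~ (. - arrival_date)^2

# Model 4: big model plus an engineered month-of-year feature.
# Assumption: arrival_date parses with ymd(); use mdy() or dmy() if the
# file stores dates in a different order.
add_month <- function(df) {
  dates <- ymd(as.character(df$arrival_date))
  # Fix the levels at 1..12 so train and test share the same factor coding
  df$arrival_month <- factor(month(dates), levels = 1:12)
  df
}

# Fitting, assuming hotels_train is the training data frame:
# lm_small     <- lm(f_small, data = hotels_train)
# lm_big       <- lm(f_big,   data = hotels_train)
# lm_huge      <- lm(f_huge,  data = hotels_train)
# lm_big_month <- lm(f_big,   data = add_month(hotels_train))
```

Model 4 reuses the f_big formula: once add_month() has added arrival_month to the data frame, the `.` in the formula picks it up automatically. The same add_month() call would need to be applied to the test data before predicting.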

Use the data in hotels_test.csv to calculate out-of-sample RMSE.

Notes and requirements:

You don't need to report fitted model coefficients in your Results section. Really all your Results section needs to contain is a table with four rows (one for each model) and two columns (one for training-set RMSE, the other for test-set RMSE). Please report the RMSE numbers to four decimal places. Give the table an informative caption that describes what the table shows. Your Conclusions section should also be quite short: essentially just a recommendation about which model to use for predicting "hidden" children on hotel bookings.
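One way to assemble the results table described above is sketched below. The helper names rmse and rmse_table are hypothetical, and the toy data at the end is simulated purely to show that the code runs; it is not the hotels data.

```r
# RMSE of a fitted model's predictions on a data frame with a 0/1 children column
rmse <- function(model, data) {
  sqrt(mean((data$children - predict(model, newdata = data))^2))
}

# Build the results table from a named list of fitted models.
# train_sets and test_sets are lists of data frames, one per model, so that
# a model with an engineered column can be scored on the matching data.
rmse_table <- function(models, train_sets, test_sets) {
  data.frame(
    model      = names(models),
    train_rmse = round(mapply(rmse, models, train_sets), 4),
    test_rmse  = round(mapply(rmse, models, test_sets), 4),
    row.names  = NULL
  )
}

# Toy check on simulated data (made-up numbers, not the hotels data)
set.seed(1)
toy <- data.frame(adults = rpois(200, 2))
toy$children <- rbinom(200, 1, 0.1)
fit <- lm(children ~ adults, data = toy[1:150, ])
rmse_table(list(toy_model = fit), list(toy[1:150, ]), list(toy[151:200, ]))
```

For the report itself, passing the table to knitr::kable() with digits = 4 and a caption argument is one option for getting four decimal places and an informative caption, assuming the write-up is done in R Markdown.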

It may take a while for the huge model to fit on your machine. Be patient.
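A rough sense of why the huge model is slow: with p main effects, adding all pairwise interactions gives p + choose(p, 2) terms. The sketch below only counts terms; the function name n_terms_pairwise is hypothetical, and the actual number of coefficients will be larger because each categorical predictor expands into several dummy columns.

```r
# Number of terms in a model with p main effects plus all pairwise interactions
n_terms_pairwise <- function(p) p + choose(p, 2)

n_terms_pairwise(c(5, 10, 20))  # 15 55 210

# Timing the fit, assuming hotels_train is already loaded:
# system.time(lm_huge <- lm(children ~ (. - arrival_date)^2, data = hotels_train))
```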
