Project: Ensemble Techniques - Travel Package Purchase Prediction

Description

Background and Context

You are a Data Scientist for a tourism company named "Visit with us". The Policy Maker of the company wants to enable and establish a viable business model to expand the customer base.

A viable business model is a central concept that helps you understand the existing ways of doing business and how to change them for the benefit of the tourism sector.

One of the ways to expand the customer base is to introduce a new offering of packages.

Currently, the company offers 5 types of packages: Basic, Standard, Deluxe, Super Deluxe, and King. Looking at last year's data, we observed that 18% of the customers purchased the packages.

However, the marketing cost was quite high because customers were contacted at random without looking at the available information.

The company is now planning to launch a new product, the Wellness Tourism Package. Wellness Tourism is defined as travel that allows the traveler to maintain, enhance, or kick-start a healthy lifestyle, and to support or increase one's sense of well-being.

However, this time the company wants to harness the available data of existing and potential customers to make the marketing expenditure more efficient.

You, as a Data Scientist at the "Visit with us" travel company, have to analyze the customers' data and information to provide recommendations to the Policy Maker and Marketing Team, and also build a model to predict which potential customers are going to purchase the newly introduced package.

Objective

To predict which customers are more likely to purchase the long-term travel package.

Data Dictionary

Customer details:

1. CustomerID: Unique customer ID

2. ProdTaken: Product taken flag

3. Age: Age of the customer

4. PreferredLoginDevice: Preferred login device of the customer in the last month

5. CityTier: City tier

6. Occupation: Occupation of the customer

7. Gender: Gender of the customer

8. NumberOfPersonVisited: Total number of persons who came with the customer

9. PreferredPropertyStar: Hotel property rating preferred by the customer

10. MaritalStatus: Marital status of the customer

11. NumberOfTrips: Average number of trips per year by the customer

12. Passport: Customer passport flag

13. OwnCar: Flag indicating whether the customer owns a car

14. NumberOfChildrenVisited: Total number of children who visited with the customer

15. Designation: Designation of the customer in the current organization

16. MonthlyIncome: Gross monthly income of the customer

Customer interaction data:

1. PitchSatisfactionScore: Sales pitch satisfaction score

2. ProductPitched: Product pitched by the salesperson

3. NumberOfFollowups: Total number of follow-ups done by the salesperson after the sales pitch

4. DurationOfPitch: Duration of the pitch by the salesperson to the customer

Submission Guidelines:

1. Two files to be submitted:

   1. A well-commented Jupyter notebook [format - .ipynb]

   2. The file converted to HTML format

Questions:

1

Perform an Exploratory Data Analysis on the data

- Univariate analysis
- Bivariate analysis
- Use appropriate visualizations to identify the patterns and insights
- Come up with a customer profile (characteristics of a customer) of the different packages
- Any other exploratory deep dive
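As a starting point for the EDA items above, the following is a minimal sketch of a univariate summary, a bivariate purchase-rate comparison, and a per-package customer profile. The eight rows are made up for illustration and only reuse column names from the data dictionary; with the real CSV, the same calls would presumably run on the loaded DataFrame instead.

```python
import pandas as pd

# Made-up rows for illustration only; column names follow the data dictionary.
df = pd.DataFrame({
    "Age": [28, 35, 41, 52, 33, 47, 30, 39],
    "Passport": [1, 0, 1, 0, 0, 1, 1, 0],
    "ProductPitched": ["Basic", "Basic", "Deluxe", "King",
                       "Basic", "Deluxe", "Basic", "Standard"],
    "ProdTaken": [1, 0, 1, 0, 0, 0, 1, 0],
})

# Univariate: distribution summary of a single numeric column.
age_summary = df["Age"].describe()

# Bivariate: share of customers who purchased, split by passport flag.
purchase_rate_by_passport = df.groupby("Passport")["ProdTaken"].mean()

# Customer profile: characteristics of buyers, per pitched package.
buyers = df[df["ProdTaken"] == 1]
profile = buyers.groupby("ProductPitched").agg(
    mean_age=("Age", "mean"),
    passport_share=("Passport", "mean"),
    buyers=("ProdTaken", "size"),
)

print(age_summary)
print(purchase_rate_by_passport)
print(profile)
```

Plots such as histograms or count plots can be layered on top of the same groupings; the tables are used here only to keep the sketch self-contained.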

2

Illustrate the insights based on EDA

Key meaningful observations on the relationship between variables

3

Data Pre-processing

Prepare the data for analysis:
- Missing value treatment
- Outlier detection (treat if needed; explain why or why not)
- Feature engineering
- Prepare data for modeling
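One possible shape for these pre-processing steps is sketched below. The sample rows are made up, and the specific choices (median and mode imputation, capping at the 1.5 x IQR fences, one-hot encoding) are assumptions for illustration, not requirements of the assignment; whether outliers should be treated at all is part of what the question asks you to justify.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; column names follow the data dictionary.
df = pd.DataFrame({
    "Age": [34, np.nan, 45, 29, 52, 38],
    "MonthlyIncome": [21000, 23500, np.nan, 19800, 98000, 22400],
    "Gender": ["Male", "Female", None, "Female", "Male", "Male"],
    "ProdTaken": [0, 1, 0, 0, 1, 0],
})


def preprocess(data: pd.DataFrame) -> pd.DataFrame:
    """Impute missing values, cap income outliers, and encode categoricals."""
    out = data.copy()
    num_cols = ["Age", "MonthlyIncome"]
    cat_cols = ["Gender"]

    # Missing values: median for numeric columns, most frequent value otherwise.
    for col in num_cols:
        out[col] = out[col].fillna(out[col].median())
    for col in cat_cols:
        out[col] = out[col].fillna(out[col].mode()[0])

    # Outliers: cap MonthlyIncome at the 1.5 * IQR fences (an assumed rule).
    q1 = out["MonthlyIncome"].quantile(0.25)
    q3 = out["MonthlyIncome"].quantile(0.75)
    iqr = q3 - q1
    out["MonthlyIncome"] = out["MonthlyIncome"].clip(
        lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr
    )

    # Encoding: one-hot encode categoricals so tree ensembles get numeric input.
    return pd.get_dummies(out, columns=cat_cols, drop_first=True)


clean = preprocess(df)
print(clean)
```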

4

Model building - Bagging

- Build a bagging classifier, a random forest, and a decision tree.
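The three models named here could be fitted roughly as follows with scikit-learn. The data is synthetic (`make_classification` with about 18% positives to mimic the purchase rate mentioned in the background), and the hyperparameters are arbitrary placeholders, so the scores say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared data: roughly 18% positive class.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.82, 0.18], random_state=1,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=1),
    # BaggingClassifier uses a decision tree as its default base learner.
    "bagging": BaggingClassifier(n_estimators=50, random_state=1),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

print(test_accuracy)
```

Stratifying the split keeps the class ratio similar in the train and test sets, which matters when only a minority of customers purchase.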

5

Model performance evaluation and improvement

- Comment on which metric is right for model performance evaluation and why
- Comment on model performance
- Can model performance be improved? Check and comment
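To make the metric question concrete, here is a small worked example on made-up labels with 18% buyers, matching the purchase rate from the background. It compares a naive "nobody buys" predictor with a hypothetical model; the counts are invented for illustration. It shows why accuracy alone can mislead on imbalanced data, though which metric is "right" still depends on the relative cost of a missed buyer versus a wasted marketing contact.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, f1_score, precision_score, recall_score,
)

# 100 made-up customers: 18 buyers (1) followed by 82 non-buyers (0).
y_true = np.array([1] * 18 + [0] * 82)

# Naive baseline: predict that nobody buys.
y_naive = np.zeros(100, dtype=int)

# Hypothetical model: finds 12 of the 18 buyers and raises 10 false alarms.
y_model = np.array([1] * 12 + [0] * 6 + [1] * 10 + [0] * 72)


def report(y_actual, y_pred):
    """Return the main classification metrics as a dict."""
    return {
        "accuracy": accuracy_score(y_actual, y_pred),
        "recall": recall_score(y_actual, y_pred, zero_division=0),
        "precision": precision_score(y_actual, y_pred, zero_division=0),
        "f1": f1_score(y_actual, y_pred, zero_division=0),
    }


naive_report = report(y_true, y_naive)
model_report = report(y_true, y_model)
print("naive:", naive_report)
print("model:", model_report)
```

The naive baseline reaches 82% accuracy while identifying no buyers at all, whereas the hypothetical model is only two points more accurate but recovers two thirds of the buyers.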

6

Model building - Boosting

- Build AdaBoost, gradient boosting, XGBoost, and stacking classifiers
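A rough sketch of the boosting and stacking models with scikit-learn follows. The data is again synthetic and the settings are placeholders. XGBoost is left out so that the example depends only on scikit-learn; assuming the separate `xgboost` package is installed, its `XGBClassifier` could be added to the same dictionary. Recall is reported here only as one candidate metric, not as the required one.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier, GradientBoostingClassifier, StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared data: roughly 18% positive class.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.82, 0.18], random_state=1,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

base_learners = [
    ("adaboost", AdaBoostClassifier(n_estimators=50, random_state=1)),
    ("gradient_boosting", GradientBoostingClassifier(random_state=1)),
]

models = dict(base_learners)
# Stacking: a logistic regression combines the base learners' predictions,
# which are produced out-of-fold via 5-fold cross-validation.
models["stacking"] = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

test_recall = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_recall[name] = recall_score(y_test, model.predict(X_test))

print(test_recall)
```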

7

Model performance evaluation and improvement

- Comment on which metric is right for model performance evaluation and why
- Comment on model performance
- Can model performance be improved? Check and comment

8

Actionable Insights & Recommendations

- Compare models
- Business recommendations and insights

LINK TO CSV: https://drive.google.com/drive/folders/1tf-pMhdxbZDq1dC0QyA17fKXVvKhuDTc?usp=sharing
