Question:
Using the R language.
Scenario 1: Consider the hotels dataset, which contains the following columns for every hotel in the world:
Column               Description
Hotel_id             Hotel's unique id
Hotel_name           Hotel's name
Hotel_city           Name of the city the hotel is in
Hotel_country_code   2-letter code of the country the hotel is in (e.g., FR for France)
Latitude             Latitude of the hotel's location
Longitude            Longitude of the hotel's location
The dataset has 400,000 rows (hotels). For simplicity, assume that no row or column contains NULL values. However, the Latitude, Longitude, and Hotel_country_code columns might contain incorrect values. A prior analysis indicates that approximately 5% of the dataset (20,000 hotels) has incorrect Hotel_country_code values, and only 1% of the dataset (4,000 hotels) has incorrect Latitude or Longitude values. This dataset is going to be used for a very important project. Therefore, the incorrect Hotel_country_code values should be found and corrected first.
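To make the setup concrete, here is a minimal base R sketch of the schema with some format-level sanity checks. It is only an illustration: the four rows are made-up toy values, not real hotels, and the names `code_format_ok` and `coord_range_ok` are hypothetical helper columns that are not part of the dataset. These checks catch only malformed values (a code that is not two uppercase letters, or a coordinate outside its valid range). They cannot detect a well-formed but wrong country code, which is the harder part of the problem.

```r
# Toy stand-in for the hotels dataset; all values are invented for illustration.
hotels <- data.frame(
  Hotel_id           = 1:4,
  Hotel_name         = c("Hotel A", "Hotel B", "Hotel C", "Hotel D"),
  Hotel_city         = c("Paris", "Paris", "Berlin", "Berlin"),
  Hotel_country_code = c("FR", "fr", "DE", "D3"),
  Latitude           = c(48.86, 48.85, 52.52, 152.52),
  Longitude          = c(2.35, 2.34, 13.40, 13.41),
  stringsAsFactors   = FALSE
)

# A well-formed code is exactly two uppercase ASCII letters.
# perl = TRUE keeps [A-Z] independent of the session locale.
code_ok <- grepl("^[A-Z]{2}$", hotels$Hotel_country_code, perl = TRUE)

# Valid coordinates: latitude in [-90, 90], longitude in [-180, 180].
coord_ok <- hotels$Latitude  >= -90  & hotels$Latitude  <= 90 &
            hotels$Longitude >= -180 & hotels$Longitude <= 180

# Hypothetical flag columns (assumed names, not in the original dataset).
hotels$code_format_ok <- code_ok
hotels$coord_range_ok <- coord_ok

# Error counts implied by the stated rates on the full dataset.
n_rows              <- 400000
expected_bad_codes  <- n_rows * 5 / 100   # 5% of rows
expected_bad_coords <- n_rows * 1 / 100   # 1% of rows

print(hotels[!code_ok | !coord_ok, ])
```

On the toy rows, the second and fourth hotels are flagged for their country code and the fourth for its latitude. On the real data, rows that pass both checks can still carry a wrong country code, so checks like these would be a first filter at most.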
Question 1: What approach would you implement to find and correct the incorrect Hotel_country_code values?
Question 2: If you think the columns in the given dataset are not enough to solve the problem and you need additional data, what data would you need, and why?