Question:
Part 1: Collect, Explore, and Prepare the Data
For this model you will need to prepare three sets of data. The easiest way is to have each on one sheet in the same workbook:
- Crime Data for New York City - The most recent crime data for New York City, covering a minimum of 12 months (include the latest available data). You are responsible for deciding which crime stats to include in the model, but it would be a good idea to include, at a minimum, the seven major felony offenses.
- Training Data for Price Prediction - The latest New York City Airbnb data combined with any appropriate crime data (from the previous point). This dataset will have all data from the provided Airbnb data file combined with your chosen attributes from the crime data analysis. This step will likely take you the longest to perform. Don't breeze through this part of the task.
- Subject (Test) Data - While not technically a dataset on its own, you will need to prepare a subject dataset that includes the details for three (3) prospective properties that you've identified as good potential buys for your parents. Your model will be used to predict one-night rental prices for these three properties.

Part 2: Create a Predictive Model
Once your data has been collected and prepared, you will need to decide what data to include (based on your previous exploration and analysis) for training the mining models. Any and all of the tools we looked at in the course are available to you (basically anything you've done or can do with Excel or Orange). There is no explicit requirement for you to do certain tasks in Excel or Orange; use whichever tool works best for you and your group. You may find as you proceed in building your model that data needs to be added to or removed from your initial worksheet. You may also choose to use other techniques such as normalization and binning to create a more accurate model.
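To make the two techniques named above concrete, here is a minimal sketch of min-max normalization and equal-width binning. It is written in Python only for illustration; the assignment itself expects Excel or Orange. The function names and the felony counts are made up and are not part of the provided data.

```python
def min_max_normalize(values):
    """Rescale numbers to the 0-1 range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: nothing to scale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up felony counts per neighborhood, for illustration only.
felony_counts = [120, 450, 300, 780, 60]

normalized = min_max_normalize(felony_counts)
bins = equal_width_bins(felony_counts, 3)  # 0 = low, 1 = medium, 2 = high

print(normalized)
print(bins)
```

Whether either transformation actually improves the model is something to check against your error measures rather than assume.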
As you are going through this process, you must take note of what method you are using, what changes you make to the model data, and why you are making those decisions. You will need to present both your model and the reasoning for why you built it as you did and why it is superior to the alternatives that proved to be less accurate (e.g., pay attention to error/performance measures). The process of developing your model is the most important part of this process, so ensure you are making logical improvements and documenting the reasoning and impact.

After hearing about how much you've been learning about data analysis in your CMIS2250 classes, your highly successful investor parents decided to approach you to perform some data analysis for a project they have in mind. Your parents are looking to expand their real estate portfolio by investing in the US, and they'd like to begin by entering the hot New York City market. Specifically, your folks have heard a lot about how investors have had success letting out property via Airbnb, and they'd like to get some exposure to this market. They have a US$1.38 million budget for purchasing a property, and they're of course looking to make as much of a return as possible on their investment.
Your folks are relying on you to perform all the work necessary for them to make a decision. Some general considerations to keep in mind are:
- You will have to identify areas of the city where their budget will allow them entry, and identify a suitable property for them to purchase. Don't worry about specific properties available; simply identify what an 'ideal' property for them to purchase would look like (i.e., identify the neighborhood and general features such as number of bedrooms, number of bathrooms, whether it is a condo, asking price, etc.).
- Using the New York City Airbnb data they've provided you, build a model for predicting one-night rates for properties in the city.
- Your folks will also use the property for themselves and guests for up to five weeks of every year, and therefore they would like to purchase in a safe neighborhood.
- You will have to collect crime statistics for the city and include this in your modeling for price prediction.
- Also prepare a separate analysis (data exploration) of the crime in the city.
- Finally, your parents are not very tech savvy, and would like you to present your findings in a formal report for them to review.
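The task of combining the Airbnb listings with crime statistics can be sketched as a simple lookup join. The example below is only an illustration in Python (in Excel the same idea is typically a lookup formula). It assumes both datasets can be matched on an area name; the column names `neighbourhood` and `major_felonies`, the helper functions, and all values are hypothetical. Real crime data may be reported by a different unit, such as precinct, and would then need a mapping step first.

```python
# Hypothetical sample rows; the real Airbnb file will have many more columns.
listings = [
    {"id": 1, "neighbourhood": "Harlem", "bedrooms": 2, "price": 150},
    {"id": 2, "neighbourhood": "chelsea ", "bedrooms": 1, "price": 220},
    {"id": 3, "neighbourhood": "Astoria", "bedrooms": 3, "price": 180},
]

# Hypothetical felony totals per area (made-up numbers).
crime_totals = {"Harlem": 310, "Chelsea": 420}


def normalize_key(name):
    """Trim and lowercase an area name so small spelling differences still match."""
    return name.strip().lower()


def add_crime_column(rows, totals, column="major_felonies"):
    """Return copies of rows with a crime column added, plus the unmatched area names."""
    lookup = {normalize_key(area): count for area, count in totals.items()}
    merged, unmatched = [], set()
    for row in rows:
        key = normalize_key(row["neighbourhood"])
        new_row = dict(row)  # copy so the original listing is left untouched
        new_row[column] = lookup.get(key)  # None when no crime data matched
        if key not in lookup:
            unmatched.add(row["neighbourhood"])
        merged.append(new_row)
    return merged, unmatched


training_rows, missing_areas = add_crime_column(listings, crime_totals)
print(training_rows)
print(missing_areas)
```

Reporting the unmatched areas is a deliberate choice here: listings with no crime value should be fixed or excluded knowingly, not silently left blank in the training data.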
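Since the brief asks you to justify your chosen model against less accurate alternatives using error/performance measures, here is a small sketch of two common ones, mean absolute error (MAE) and root mean squared error (RMSE). The prices and the two "models" are invented for illustration, and Python is used only as a convenient notation; Excel or Orange can report the same measures.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the prediction miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: like MAE, but large misses count more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up nightly prices and predictions from two hypothetical models.
actual = [100, 150, 200, 250]
model_a = [110, 140, 210, 240]  # off by 10 on every listing
model_b = [100, 150, 200, 290]  # perfect on three listings, off by 40 on one

for name, predicted in [("A", model_a), ("B", model_b)]:
    print(name, mae(actual, predicted), rmse(actual, predicted))
```

In this invented case both models have the same MAE, but model B has the higher RMSE because of its single large miss, which is why it helps to report more than one measure when arguing that one model is better.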