Question: RStudio: Tidyverse package

The data used for this assignment again comes from the txhousing data in the ggplot2 package, which contains information on the housing market in Texas. Use ?txhousing to learn about the data. This time you will use the entire dataset, so there is no need to modify it before you begin.

Create an R Markdown file that provides complete, easy-to-read code, output, and explanations (there are no explanations in this homework) for the following exercises. After exercise 1, each exercise is independent of the others and should start with the full txh dataset created in exercise 1.

1. Create a new dataset, txh, that contains all of the variables in txhousing, plus three more that you will create:
   a. The txhousing dataset includes the median sale price of all sales in a city in a given month, but it does not include the mean sale price. Create a new variable, mean_price, that is the mean sale price of all sales in a city in a given month, calculated from the total volume and the number of sales.
   b. The median sale price in a given month will generally differ from mean_price. Create a new variable, price_dif, that is the difference between the two (mean_price - median).
   c. Create a new variable, sales_prop, that calculates the proportion of listings that resulted in sales in a given month.
2. Are there any observations where sales_prop is greater than one? List them.
3. Find the total number of sales, the total volume of sales, and the number of cities in this dataset.
4. Find the mean number of sales per month, the median of the median price per month, and the median of the mean_price each month.
5. For each city, find the median price_dif and list them in descending order of their magnitude.
6. Sales vary over the course of a single year. For each city and each year, find the mean number of monthly sales and the median of the price variables: median, mean_price, price_dif.
7. Use what you did in the previous exercise to create a line plot of the average monthly sales per year for each city, using a different color line for each city.
8. Use what you did in exercise 6 to create side-by-side boxplots of the median price_dif for each city.
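
One possible starting point for several of these exercises is sketched below, using dplyr and ggplot2 (both part of the tidyverse). It is a sketch under stated assumptions, not a definitive solution: sales_prop is read as sales divided by listings, missing values are dropped with na.rm = TRUE, and the names over_one, totals, txh_yearly, sales_plot, dif_plot and the summary column names are illustrative choices that do not come from the assignment.

```r
library(dplyr)    # mutate(), filter(), group_by(), summarise(), %>%
library(ggplot2)  # txhousing data and plotting functions

# Exercise 1: keep every txhousing column and add three derived ones.
txh <- txhousing %>%
  mutate(
    mean_price = volume / sales,       # total dollar volume / number of sales
    price_dif  = mean_price - median,  # mean minus median sale price
    sales_prop = sales / listings      # assumed reading: sales per listing
  )

# Exercise 2: rows with more sales than listings.
# filter() drops rows where the comparison is NA.
over_one <- txh %>% filter(sales_prop > 1)

# Exercise 3: overall totals; na.rm = TRUE because some months are unreported.
totals <- txh %>%
  summarise(
    total_sales  = sum(sales, na.rm = TRUE),
    total_volume = sum(volume, na.rm = TRUE),
    n_cities     = n_distinct(city)
  )

# Exercise 6: one summary row per city and year.
# stats::median() is spelled out because txh also has a column named median.
txh_yearly <- txh %>%
  group_by(city, year) %>%
  summarise(
    mean_sales        = mean(sales, na.rm = TRUE),
    median_median     = stats::median(median, na.rm = TRUE),
    median_mean_price = stats::median(mean_price, na.rm = TRUE),
    median_price_dif  = stats::median(price_dif, na.rm = TRUE),
    .groups = "drop"
  )

# Exercise 7: average monthly sales per year, one coloured line per city.
sales_plot <- ggplot(txh_yearly, aes(x = year, y = mean_sales, colour = city)) +
  geom_line()

# Exercise 8: one boxplot of the yearly median price_dif per city.
# coord_flip() puts the city names on the vertical axis so they stay readable.
dif_plot <- ggplot(txh_yearly, aes(x = city, y = median_price_dif)) +
  geom_boxplot() +
  coord_flip()
```

Printing over_one, totals, or txh_yearly shows the corresponding tables, and printing sales_plot or dif_plot draws the figures. Exercises 4 and 5 follow the same group_by() and summarise() pattern, grouped by month and by city respectively.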
