In this project you will be working with R, SQL, and Python in the same document. We will use the data sets airlines and flights from the package nycflights13.

Knit the empty file first to make sure everything is working.

Knit the file again each time you answer a question.

In R, install the packages nycflights13, sqldf, and tidyverse, then load all packages and data sets. Take your time to understand the data sets.

```{r message=FALSE, warning=FALSE}

# Install code here - comment the code when you finish installation

# Load all packages here

# Load the data here

```

Question 1: List the names of the airlines that fly to ATL airport together with their average arrival delays, sorted from the smallest to the largest average arrival delay. Use the data frames flights and airlines.

We shall solve this question using R, SQL, and Python.

R solution

You can use the R function `inner_join()` to join two tables.

```{r, warning=FALSE, message=FALSE }

# Code here

```

SQL solution

Write your SQL query as a string inside the function sqldf(). For example, sqldf("select * from relig_income") lists the data frame relig_income.

```{r, warning=FALSE, message=FALSE }

# code here

```
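sqldf runs its queries on SQLite by default, so the shape of the query can be tried out wherever SQLite is available. The sketch below is not the R chunk the assignment asks for: it uses Python's built-in sqlite3 module and a handful of invented rows (only the column names carrier, name, dest, and arr_delay follow nycflights13) to illustrate a join followed by GROUP BY and ORDER BY.

```python
import sqlite3

# In-memory database with toy rows; all values are invented for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE airlines (carrier TEXT, name TEXT)")
con.execute("CREATE TABLE flights (carrier TEXT, dest TEXT, arr_delay REAL)")
con.executemany(
    "INSERT INTO airlines VALUES (?, ?)",
    [
        ("DL", "Delta Air Lines Inc."),
        ("FL", "AirTran Airways Corporation"),
        ("UA", "United Air Lines Inc."),
    ],
)
con.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [
        ("DL", "ATL", 10.0),
        ("DL", "ATL", 20.0),
        ("FL", "ATL", 5.0),
        ("FL", "ATL", None),  # missing delay; AVG() ignores NULLs
        ("UA", "ORD", 50.0),  # not an ATL flight, removed by the WHERE clause
    ],
)

# Join the two tables on carrier, keep ATL flights, average per airline name.
query = """
    SELECT a.name, AVG(f.arr_delay) AS avg_arr_delay
    FROM flights AS f
    INNER JOIN airlines AS a ON f.carrier = a.carrier
    WHERE f.dest = 'ATL'
    GROUP BY a.name
    ORDER BY avg_arr_delay ASC
"""
rows = con.execute(query).fetchall()
for name, avg_delay in rows:
    print(name, avg_delay)
con.close()
```

Once flights and airlines are loaded in R, a query of this shape could be passed to sqldf() as its string argument.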

Python solution

Use [df.join()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.join.html) from pandas.

```{python}

# load python libraries

# load data

# code here

```
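DataFrame.join() matches on the index of the other frame, which is easy to overlook. As a minimal sketch on invented toy data (the two small frames below only borrow the nycflights13 column names), the lookup table is indexed by carrier first and the `on` argument names the matching column in the calling frame:

```python
import pandas as pd

# Toy stand-ins for the nycflights13 tables; all values are invented.
flights = pd.DataFrame({
    "carrier": ["DL", "DL", "FL", "FL", "UA"],
    "dest": ["ATL", "ATL", "ATL", "ATL", "ORD"],
    "arr_delay": [10.0, 20.0, 5.0, None, 50.0],
})
airlines = pd.DataFrame({
    "carrier": ["DL", "FL", "UA"],
    "name": [
        "Delta Air Lines Inc.",
        "AirTran Airways Corporation",
        "United Air Lines Inc.",
    ],
})

# Keep only flights to ATL, then attach the airline name.
atl = flights[flights["dest"] == "ATL"]
joined = atl.join(airlines.set_index("carrier"), on="carrier", how="inner")

# Average arrival delay per airline, smallest first.
avg_delay = (
    joined.groupby("name")["arr_delay"]
    .mean()  # missing values are skipped by default
    .sort_values()
    .reset_index(name="avg_arr_delay")
)
print(avg_delay)
```

pandas.merge() with `on="carrier"` would give the same joined table without setting an index; join() is used here only because the assignment names it.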

Question 2: Draw boxplots of the departure delays by airline name for flights whose destination is ATL airport. Solve this question using R and Python.

R solution

Use ggplot() from the ggplot2 package.

```{r, warning=FALSE, message=FALSE }

# code here

```

Python solution

You can use [pandas.boxplot()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.boxplot.html) or [seaborn.boxplot()](https://seaborn.pydata.org/generated/seaborn.boxplot.html). You can either import the data from the R package into Python or use the Python data package [nycflights13](https://pypi.org/project/nycflights13/).

```{python}

# import python libraries

# data

# Boxplot

```
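For orientation, here is a rough sketch of the pandas route on invented toy data. It assumes the airline name has already been joined onto the ATL flights, and it selects the off-screen Agg backend only so that the snippet runs without a display; neither choice comes from the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; an assumption for running without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy ATL flights with the airline name already attached; values are invented.
atl = pd.DataFrame({
    "name": ["Delta"] * 4 + ["AirTran"] * 4,
    "dep_delay": [2.0, 5.0, 9.0, 40.0, -3.0, 0.0, 4.0, 12.0],
})

fig, ax = plt.subplots(figsize=(6, 4))
# One box per airline: `column` is the measured variable, `by` the grouping variable.
atl.boxplot(column="dep_delay", by="name", ax=ax)
ax.set_xlabel("Airline")
ax.set_ylabel("Departure delay (minutes)")
ax.set_title("Departure delays of flights to ATL (toy data)")
fig.suptitle("")  # drop the automatic "Boxplot grouped by name" super-title
fig.tight_layout()
```

In a knitted Python chunk, a final plt.show() would render the figure. seaborn.boxplot(data=atl, x="name", y="dep_delay") is the equivalent seaborn call.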

Question 3: For each airline, 1) find the month in which the average departure delay is the highest in the year, and 2) make a visualization to show the results. Solve this question using your preferred language, R or Python.

```{}

# Do not forget to specify which language you are using.

# load libraries

# load data

# code here to find the months

# code here to make the visualization

```
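One possible approach to part 1, sketched in pandas on invented toy data: average the delay per carrier and month, then pick the row with the largest average within each carrier using idxmax(). The toy frame and the variable names monthly and worst are illustrative assumptions, not part of the assignment.

```python
import pandas as pd

# Toy flights table; only the column names follow nycflights13, values are invented.
flights = pd.DataFrame({
    "carrier": ["DL", "DL", "DL", "DL", "UA", "UA", "UA", "UA"],
    "month": [1, 1, 7, 7, 1, 1, 7, 7],
    "dep_delay": [5.0, 15.0, 30.0, 40.0, 25.0, 35.0, 10.0, None],
})

# Step 1: average departure delay for every (carrier, month) pair.
monthly = flights.groupby(["carrier", "month"], as_index=False)["dep_delay"].mean()

# Step 2: within each carrier, idxmax() gives the row label of the largest average.
worst_rows = monthly.groupby("carrier")["dep_delay"].idxmax()
worst = monthly.loc[worst_rows].reset_index(drop=True)
print(worst)
```

For part 2, a bar chart with one bar per airline, labelled with the month found, would be one way to display a table like worst.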
