Question 3

For this question, find the top three airlines that have a high number of flights and the lowest percentage of delays compared to other airlines. The result should be a dataframe with three columns: AIRLINE_NAME, NUM_FLIGHTS, and PERC_DELAY.

Both dataframes share the same IATA_CODE and AIRLINE.

NOTE: There are no columns named AIRLINE_NAME and PERC_DELAY, so you have to create these new columns.

Hint:

The percentage of delay for each airline is obtained using the groupby and apply methods.

Merge flights_df with airlines_df to get the names of the top three airlines.

def top_three_airlines(flights_df, airlines_df):
    # YOUR CODE HERE
    raise NotImplementedError()
    return df

top_three_airlines_df = top_three_airlines(flights_df_raw.copy(), airlines_df.copy())

assert sorted(list(top_three_airlines_df.columns)) == sorted(['NUM_FLIGHTS', 'PERC_DELAY', 'AIRLINE_NAME']), "Dataframe doesn't have required columns"
assert top_three_airlines_df.loc[0, 'AIRLINE_NAME'] == 'United Air Lines Inc.', "Top airline name doesn't match"
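One possible way to fill in the Question 3 stub, following the hints (groupby and apply for the delay percentage, then a merge for the names), is sketched below. Since the CSV files are not included, several things here are assumptions rather than facts about the data: that flights_df has an AIRLINE column holding the IATA code and an ARRIVAL_DELAY column in minutes, that airlines_df has IATA_CODE and AIRLINE (the full name), that a flight counts as delayed when ARRIVAL_DELAY > 0, and that "top three" means sorting by NUM_FLIGHTS descending with PERC_DELAY ascending as the tie-breaker. PERC_DELAY is returned as a fraction between 0 and 1.

```python
import pandas as pd


def top_three_airlines(flights_df, airlines_df):
    flights = flights_df.copy()

    # Assumption: a flight is "delayed" when ARRIVAL_DELAY > 0 minutes.
    # Missing delays (NaN) compare as False, so they count as not delayed.
    flights["DELAYED"] = flights["ARRIVAL_DELAY"] > 0

    # Per-airline flight count and delayed fraction (groupby + apply).
    grouped = flights.groupby("AIRLINE")["DELAYED"]
    stats = pd.DataFrame({
        "NUM_FLIGHTS": grouped.size(),
        "PERC_DELAY": grouped.apply(lambda s: s.mean()),
    }).reset_index()

    # Assumption: airlines_df maps IATA_CODE to the full name in AIRLINE.
    # Rename first so the name column does not clash with the code column.
    names = airlines_df.rename(columns={"AIRLINE": "AIRLINE_NAME"})
    merged = stats.merge(names, left_on="AIRLINE", right_on="IATA_CODE", how="left")

    # Assumed ranking rule: most flights first, lowest delay share breaks ties.
    ranked = merged.sort_values(
        ["NUM_FLIGHTS", "PERC_DELAY"], ascending=[False, True]
    )
    return (
        ranked[["AIRLINE_NAME", "NUM_FLIGHTS", "PERC_DELAY"]]
        .head(3)
        .reset_index(drop=True)
    )
```

If the grader expects a different ranking (for example, filtering to high-volume airlines first and then sorting by PERC_DELAY), only the sort_values step needs to change. The final reset_index(drop=True) is there so that .loc[0, 'AIRLINE_NAME'] in the assert refers to the first-ranked row.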

Question 4

For this question, obtain the monthly percentage of delays for each ORIGIN_AIRPORT.

Example Result:

   MONTH     BOS     JFK     LAX     SFO
0  January   0.1902  0.2257  0.1738  0.xxxx
1  February  0.3248  0.xxxx  0.xxxx  0.xxxx
2  March     0.1984  0.xxxx  0.xxxx  0.xxxx
3  April     0.xxxx  0.xxxx  0.xxxx  0.xxxx

def monthly_airport_delays(flights_df):
    # YOUR CODE HERE
    raise NotImplementedError()
    return df

monthly_airport_delays_df = monthly_airport_delays(flights_df_raw.copy())
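For Question 4, a minimal sketch that produces a table shaped like the example result (one row per month, one column per origin airport) might look like the following. It assumes flights_df has an integer MONTH column (1 to 12), an ORIGIN_AIRPORT column, and an ARRIVAL_DELAY column in minutes, and that "delayed" again means ARRIVAL_DELAY > 0. None of this can be confirmed without the CSV files, and the assignment may intend a different delay column such as a departure delay.

```python
import calendar

import pandas as pd


def monthly_airport_delays(flights_df):
    flights = flights_df.copy()

    # Assumption: a flight is "delayed" when ARRIVAL_DELAY > 0 minutes.
    flights["DELAYED"] = flights["ARRIVAL_DELAY"] > 0

    # Delayed fraction per (month, airport), then airports become columns.
    table = (
        flights.groupby(["MONTH", "ORIGIN_AIRPORT"])["DELAYED"]
        .mean()
        .unstack("ORIGIN_AIRPORT")
        .round(4)
    )

    # Assumption: MONTH holds integers 1..12; convert them to month names.
    # calendar.month_name follows the current locale (English by default).
    table.index = [calendar.month_name[int(m)] for m in table.index]
    table.index.name = "MONTH"
    table.columns.name = None

    # Turn MONTH into a regular column with a 0-based row index.
    return table.reset_index()
```

Month and airport combinations with no flights show up as NaN in this version. Because groupby sorts by the integer MONTH before the names are substituted, the rows stay in calendar order rather than alphabetical order.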

I would like to add the csv files but can't.

I need help with this assignment.
