Question: Create the DataFrame to use in the following data: df = pd.DataFrame({'From_To': ['LoNDon_paris', 'MAdrid_miLAN', 'londON_StockhOlm', 'Budapest_PaRis', 'Brussels_londOn'], 'FlightNumber': [10045, np.nan, 10065, np.nan, 10085], 'RecentDelays': [[23,

Create the DataFrame to use in the following data:

df = pd.DataFrame({'From_To': ['LoNDon_paris', 'MAdrid_miLAN',

'londON_StockhOlm',

'Budapest_PaRis', 'Brussels_londOn'],

'FlightNumber': [10045, np.nan, 10065, np.nan, 10085],

'RecentDelays': [[23, 47], [], [24, 43, 87], [13], [67, 32]],

'Airline': ['KLM(!)', ' (12)', '(British Airways. )',

'12. Air France', '"Swiss Air"']})

Solve the following issues.

1. Some values in the the FlightNumber column are missing. These numbers are meant to increase by 10 with each row so 10055 and 10075 need to be put in place. Fill in these missing numbers and make the column an integer column (instead of a float column).

2. The From_To column would be better as two separate columns! Split each string on the underscore delimiter _ to give a new temporary DataFrame with the correct values. Assign the correct column names to this temporary DataFrame.

3. Notice how the capitalisation of the city names is all mixed up in this temporary DataFrame. Standardise the strings so that only the first letter is uppercase (e.g. "londON" should become "London".)

4. Delete the From_To column from df and attach the temporary DataFrame from the previous questions.

5. In the RecentDelays column, the values have been entered into the DataFrame as a list. We would like each first value in its own column, each

second value in its own column, and so on. If there isn't an Nth value, the value should be NaN.

Expand the Series of lists into a DataFrame named delays, rename the columns delay_1, delay_2, etc. and replace the unwanted RecentDelays column in df with delays.

import pandas as pd

import numpy as np

df = pd.DataFrame({'From_To': ['LoNDon_paris', 'MAdrid_miLAN',

'londON_StockhOlm',

'Budapest_PaRis', 'Brussels_londOn'],

'FlightNumber': [10045, np.nan, 10065, np.nan, 10085],

'RecentDelays': [[23, 47], [], [24, 43, 87], [13], [67, 32]],

'Airline': ['KLM(!)', ' (12)', '(British Airways. )',

'12. Air France', '"Swiss Air"']})

Jupyter notebook

It happens all the time: someone gives you data containing malformed strings, Python, lists and missing data. How do you tidy it up so you can get on with the analysis?

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!

Read the case study "Southwest Airlines," found in Part 2 of your textbook. Review the "Guide to Case Analysis" found on pp. CA1 - CA11 of your textbook. (This guide follows the last case in the...

The following additional information is available for the Dr. Ivan and Irene Incisor family from Chapters 1-5. Ivan's grandfather died and left a portfolio of municipal bonds. In 2012, they pay Ivan...

The following data are from Gennings, Chinchilli, and Carter, (1989). An in vitro toxicity study of isolated hepatocyte suspensions was conducted to study the impact of combining carbon tetrachloride...

Why do the requirements drift once a project is under way?

Three capacitors, C1 = 5.00 F, C2 = 4.00 F, and C3 = 9.00 F, are connected together. Find the effective capacitance of the group (a) If they are all in parallel, and (b) If they are all in series.

A particularly important component of the initial planning process is analysis.

Describe the major focus of Frankls logotherapy.

The management of Corbin Company has decided to develop cost formulas for its major overhead activities. Corbin uses a highly automated manufacturing process, and power costs are a significant...

Part 3 ( 1 point ) Select only individual atoms, and not the bonds between them. Double bonds may be conjugated, meaning that there is one single bond between them. Identify all the atoms in the...

Identify all of the functional groups in each of the following compounds: (a) (b) (c) (d) (e) (f) (g) Vitamin D3 HO OMe Aspartame O NH2 NH2 Amphetamine Me Cholesterol HO OCH2CH3 Demerol CH A...

PLEASE ANSWER ALL THE PARTS IN DETAILS A well-insulated container consists of two halves of equal volumes V separated by a partition. On one half, there are N moles of a monatomic ideal gas at...

Below are the transactions for Button Sewing Shop for March, the first month of operations. March 1 Issue common stock in exchange for cash of $2,600. March 3 March 5 March 7 Purchase sewing...

What idea justifies employers being conscious of employees working too many long hours? a.) Employees working long hours can be counterproductive to a firm. b.) Constant disengagement by employees...

CC13-1 Evaluating Profitability, Liquidity, and Solvency [LO4, L05) Looking back over the past few years, it is clear that Nicole Mackisey has accomplished a lot running her business, Nicole's...

" 2.7 HW X + es/24606/assignments/509305 Megan is an industrial engineer for Coyner Cola Company. She takes a random sample of cola cans from the production line each day to determine if the product...

I need some help with question 8 of case 1. The book tilte is: Cases in Financial Management. Authors: John S Dunkelberg and Joseph M Sulock B. (a) Suppose Studebaker's goal is to accumulate $400,000...

is this a good deal for investment 10% :COFFEE SHOP 5% :EACHCHAPTER Question: is 7his a COOODEAL

QUESTION 9 HC-O-C-R R-C-O-CH HC-O-P-O-CH-CH-NH3* O || O a. Phosphatidic acid, Serine O b. Lysophosphatidic acid, Serine, Free FA O c. Lysophosphatidylserine, Free FA O d. 2 Free FAs, Serine, Glycerol...

The coefficients in Equation (7.9) were calculated when a successful launch was given a value of 1. Conduct a logistic regression analysis where 1 indicates an O-ring failure and 0 represents a...

Since Bags are the units in this study, the Bag effect is the same as a residual effect. Create a table similar to Table 5.1 to calculate the residual effect by subtracting all other effects from...

Explain why you might expect the MSE to be smaller in Question 2 than in Question 1.

Advertising transforms a product into a mirror of its consumer. Explain this in terms of branding.

Brands enable consumers to communicate something about themselves. Explain how this might work.

If a corporation has both consumer and industrial products, to what extent can their marketing communication activities be consolidated? Illustrate with examples.