Question: SIG731 Task 4P: Working with pandas Data Frames (Heterogeneous Data)

1 Introduction

This task is related to Module 4 (see the Learning Resources on the unit site; see also Chapters 10, 11, 12, and 16 of Minimalist Data Wrangling with Python).

This task is due on Week 5 (2nd Feb, Sunday). However, ideally, you should complete this task by the end of Week 4. Hence, start tackling it as early as possible. If we find your first solution incomplete or otherwise incorrect, you will still be able to amend it based on the generous feedback we will give you (allow 3-5 working days). In case of any problems/questions, do not hesitate to attend our on-campus/online classes or use the Discussion Board on the unit site.

Submitting after the aforementioned due date might incur a late penalty. The cutoff date is Week 6 (Sunday). There will be no extensions (this is a Week 4 task, after all) and no solutions will be accepted thereafter. At that time, if your submission is not 100% complete, it will be marked as FAIL, without the possibility of correcting and resubmitting. This task is part of the hurdle requirements in this unit. Not submitting the correct version on time results in failing the unit.

All submissions will be checked for plagiarism. You are expected to work independently on your task solutions. Never share/show parts of solutions with/to anyone.

2 Questions

Download the nycflights13_weather.csv.gz data file from our unit site (Learning Resources, Data). It gives the hourly meteorological data for three airports in New York (LGA, JFK, and EWR) for the whole year of 2013. The columns are:

- origin: weather station (LGA, JFK, or EWR);
- year, month, day, hour: time of recording;
- temp, dewp: temperature and dew point in degrees Fahrenheit;
- humid: relative humidity;
- wind_dir, wind_speed, wind_gust: wind direction (in degrees), speed, and gust speed (in mph);
- precip: precipitation, in inches;
- pressure: sea level pressure in millibars;
- visib: visibility in miles;
- time_hour: date and hour (based on the year, month, day, hour fields) formatted as YYYY-mm-dd HH:MM:SS (actually, YYYY-mm-dd HH:00:00). However, due to a bug in the dataset, the data in this column are incorrectly shifted by 1 hour. Do not rely on it unless you manually correct it.

Then, create a single Jupyter/IPython notebook (see the Artefacts section below for all the requirements), where you perform what follows.

Q1. Convert all columns so that they use metric (International System of Units, SI) or derived units: temp and dewp to Celsius, precip to millimetres, visib to metres, as well as wind_speed and wind_gust to metres per second. Replace the data in-place (overwrite existing columns with new ones).

Q2. Compute daily mean wind speeds for the LGA airport (365 total speed values, for each day separately; you can, for example, group the data by year, month, and day at the same time).

Q3. Present the daily mean wind speeds at LGA (365 aforementioned data points) in a single plot, e.g., using the matplotlib.pyplot.plot function. The x-axis labels should be human-readable and intuitive (e.g., month names or dates).

Reference result: [plot of daily mean wind speed at LGA; x-axis labelled "day", with ticks running from 2013-01 to 2014-01]

Q4. Identify the ten windiest days at LGA (dates and the corresponding mean daily wind speeds).

Reference result: [not preserved in the source]
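The unit conversion and per-day grouping described in Q1, Q2, and Q4 can be illustrated with a minimal sketch. It is not a reference solution: it runs on a small synthetic table rather than the real data file, the helper names to_metric and daily_mean_wind are hypothetical, and the underscore column names (wind_speed, wind_gust) are assumed to follow the nycflights13 convention. The conversion factors are the standard ones (1 inch = 25.4 mm, 1 mile = 1609.344 m, 1 mph = 0.44704 m/s).

```python
import pandas as pd

# Synthetic stand-in for the weather table (NOT the real data): three days of
# hourly records for two stations. Column names are assumed to follow the
# nycflights13 convention (wind_speed, wind_gust, ...).
rows = [
    (origin, 2013, 1, day, hour, 10.0 * day if origin == "LGA" else 5.0)
    for origin in ("LGA", "JFK")
    for day in (1, 2, 3)
    for hour in range(24)
]
weather = pd.DataFrame(
    rows, columns=["origin", "year", "month", "day", "hour", "wind_speed"]
)
weather["wind_gust"] = weather["wind_speed"] * 1.5  # mph
weather["temp"] = 50.0    # degrees Fahrenheit
weather["dewp"] = 32.0    # degrees Fahrenheit
weather["precip"] = 0.1   # inches
weather["visib"] = 10.0   # miles


def to_metric(df):
    """Return a copy whose imperial columns are overwritten with metric values."""
    out = df.copy()
    for col in ("temp", "dewp"):
        out[col] = (out[col] - 32) * 5 / 9       # Fahrenheit -> Celsius
    out["precip"] = out["precip"] * 25.4          # inches -> millimetres
    out["visib"] = out["visib"] * 1609.344        # miles -> metres
    for col in ("wind_speed", "wind_gust"):
        out[col] = out[col] * 0.44704             # mph -> metres per second
    return out


def daily_mean_wind(df, station="LGA"):
    """Mean wind speed per calendar day for one station, indexed by date."""
    sub = df.loc[df["origin"] == station]
    # Group by year, month, and day at once; mean() skips missing values.
    daily = (
        sub.groupby(["year", "month", "day"])["wind_speed"].mean().reset_index()
    )
    # Assemble proper dates from the year/month/day columns.
    daily["date"] = pd.to_datetime(daily[["year", "month", "day"]])
    return daily.set_index("date")["wind_speed"]


metric = to_metric(weather)
daily = daily_mean_wind(metric, "LGA")
top = daily.nlargest(2)  # on the full-year data, nlargest(10) would give Q4
print(daily)
print(top)
```

Because the result is indexed by real dates rather than by (year, month, day) tuples, passing daily.index and daily.values to matplotlib.pyplot.plot should yield date-based x-axis ticks, which is one way to meet the readability requirement in Q3. The sketch rebuilds dates from the year, month, and day columns and never touches time_hour, so the one-hour shift in that column does not affect it.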
