Question: Install the following packages in the virtual environment (venv/): pip install beautifulsoup4, pip install requests, pip install pandas, pip install numpy

Install the following packages in the virtual environment (venv/):
  ■ pip install beautifulsoup4
  ■ pip install requests
  ■ pip install pandas
  ■ pip install numpy
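Once the installs have run, a quick check from inside the environment can confirm that everything is importable. The following is only a minimal sketch using the standard library; the helper names `in_virtualenv` and `missing_packages` are illustrative and not part of the assignment. Note that the `beautifulsoup4` distribution is imported as `bs4`.

```python
import importlib.util
import sys

# pip distribution name -> import name (they differ for beautifulsoup4)
REQUIRED = {
    "beautifulsoup4": "bs4",
    "requests": "requests",
    "pandas": "pandas",
    "numpy": "numpy",
}


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def missing_packages(required=REQUIRED):
    """Return the pip names whose import name cannot be located."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    print("inside venv:", in_virtualenv())
    print("missing:", missing_packages() or "none")
```

If `missing` lists a package, the environment was probably not activated when `pip install` was run.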
● Stage 2: Crawl and Scrape
  ○ Schulich wants an integrated dataset of all Electrical and Engineering department professors in one place. As a data engineer, you are asked to gather information about engineering professors by crawling the faculty website of the University of Calgary, scrape their information, load it into a pandas DataFrame, and eventually save it as a CSV file.
  ○ In the first step, get the HTML text of the website using the requests library, then use the BeautifulSoup4 library with the lxml parser to parse the HTML and extract the needed information.
  ○ Then, get the HTML text of the webpage and scrape the information of all its newest faculty members and professors to put them in a DataFrame as presented below:

    firstname | lastname | title | homepage

  ○ Tip: Use Chrome's `Inspect Element` to see how HTML tags map to objects on a webpage.
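The fetch-parse-load flow described in this stage could look roughly like the sketch below. The real markup of the faculty page is not given in the question, so `SAMPLE_HTML`, the CSS selectors (`div.profile`, `a.name`, `span.title`), and the example.org URLs are all hypothetical stand-ins; the actual tags have to be found with Inspect Element. The `lxml` parser also needs the `lxml` package installed, which is not in the install list above.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

COLUMNS = ["firstname", "lastname", "title", "homepage"]

# Hypothetical markup: the real page structure must be inspected in the browser.
SAMPLE_HTML = """
<div class="profile">
  <a class="name" href="https://example.org/jane-doe">Jane Doe</a>
  <span class="title">Associate Professor</span>
</div>
<div class="profile">
  <a class="name" href="https://example.org/john-smith">John Smith</a>
  <span class="title">Professor</span>
</div>
"""


def fetch_html(url):
    """Download a page and return its HTML text (not called in this demo)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_faculty(html):
    """Turn profile cards into a DataFrame with the four required columns."""
    soup = BeautifulSoup(html, "lxml")
    rows = []
    for card in soup.select("div.profile"):  # hypothetical selector
        link = card.select_one("a.name")     # hypothetical selector
        title = card.select_one("span.title")
        if link is None or title is None:
            continue  # skip cards that do not match the assumed layout
        # Assumes "First Last"; everything after the first space is the last name.
        firstname, _, lastname = link.get_text(strip=True).partition(" ")
        rows.append({
            "firstname": firstname,
            "lastname": lastname,
            "title": title.get_text(strip=True),
            "homepage": link.get("href"),
        })
    return pd.DataFrame(rows, columns=COLUMNS)


professors = parse_faculty(SAMPLE_HTML)
print(professors)
```

Against the live site, `parse_faculty(fetch_html(url))` would replace the sample string, with `url` being the faculty page address.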
● Stage 3: Explore the Data
  ○ In this part, iterate over the professors' DataFrame, request each professor's homepage HTML, find the phone number and office (building and room) of each professor, and add them to your previous DataFrame as new columns. Finally, save the DataFrame as a CSV file in the data directory (uofc_prof.csv).
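One possible shape for this enrichment step is sketched below. How phone numbers and offices appear on the real homepages is not stated in the question, so the phone regular expression, the `.office` selector, the `data/uofc_prof.csv` path layout, and the fake page content are assumptions for illustration only. The page fetcher is passed in as a function so the sketch runs without network access.

```python
import re
from pathlib import Path

import pandas as pd
from bs4 import BeautifulSoup

# Assumed North American format such as 403-555-0100 or (403) 555-0100.
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]\d{4}")


def extract_contact(html):
    """Return (phone, office) found in a homepage, or None for missing parts."""
    soup = BeautifulSoup(html, "lxml")
    match = PHONE_RE.search(soup.get_text(" ", strip=True))
    phone = match.group(0) if match else None
    office_tag = soup.select_one(".office")  # hypothetical selector
    office = office_tag.get_text(strip=True) if office_tag else None
    return phone, office


def add_contact_columns(df, fetch):
    """Fetch every homepage with `fetch(url)` and append phone/office columns."""
    contacts = [extract_contact(fetch(url)) for url in df["homepage"]]
    enriched = df.copy()
    enriched["phone"] = [phone for phone, _ in contacts]
    enriched["office"] = [office for _, office in contacts]
    return enriched


def save_csv(df, path="data/uofc_prof.csv"):
    """Write the DataFrame to CSV, creating the data directory if needed."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)


# Made-up page content standing in for a real homepage.
FAKE_PAGES = {
    "https://example.org/jane-doe":
        '<p>Phone: 403-555-0100</p><p class="office">ENG 101</p>',
}
professors = pd.DataFrame([{
    "firstname": "Jane", "lastname": "Doe",
    "title": "Associate Professor",
    "homepage": "https://example.org/jane-doe",
}])
enriched = add_contact_columns(professors, FAKE_PAGES.get)
print(enriched)
```

For the real run, `fetch` would be a function that downloads the page with requests and returns its text, followed by `save_csv(enriched)`.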
● Stage 4: Generating Report
  ○ In this part, you need to generate the following reports:
    ■ Number of Assistant Professors
    ■ Number of Professors
    ■ Number of Senior Instructors
    ■ Number of Instructors
    ■ Number of Associate Professors
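The five counts can be produced in one pass, as in the sketch below. It assumes the `title` column holds exactly the strings listed above; the sample rows are made up. Exact matching is used on purpose, because a substring test for "Professor" would also count Assistant and Associate Professors.

```python
import pandas as pd

TITLES = [
    "Assistant Professor",
    "Professor",
    "Senior Instructor",
    "Instructor",
    "Associate Professor",
]


def title_report(df):
    """Count rows per title; titles absent from the data are reported as 0."""
    counts = df["title"].str.strip().value_counts()
    return counts.reindex(TITLES, fill_value=0)


# Made-up rows standing in for the scraped data.
sample = pd.DataFrame({
    "title": ["Professor", "Associate Professor", "Professor", "Instructor"],
})
report = title_report(sample)
for title, count in report.items():
    print(f"Number of {title}: {count}")
```

With the real data, `sample` would be replaced by the DataFrame loaded from `uofc_prof.csv` via `pd.read_csv`.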

Step by Step Solution

Let's break down the tasks into stages. Stage 1: Install Required Packages. Open your command prompt and navigate to your project directory. Then activate your virtual environment (venv) if it's not already a...
