Question:

import requests, re, json
from pprint import pprint
from bs4 import BeautifulSoup

def acotar(website):
    """
    Question 7 - Webscraping

You will be scraping the Wikipedia page for A Court of Thorns and Roses by Sarah J. Maas and retrieving the data from the table under the Books header.

https://en.wikipedia.org/wiki/A_Court_of_Thorns_and_Roses

You will create a dictionary of dictionaries. The main key will be the book's number in the series. Each sub-dictionary should have the keys "ISBN", "Publication Date", "Synopsis", and "Title", corresponding to the data found in the table. The synopsis can be found in the row directly beneath the rest of the book data.

Cleaning:
- For the ISBN, include only the number: remove any letters and characters preceding it. It should begin with 978 and contain 13 digits.
- For the synopsis, make sure to remove any newline or other characters.
- For the title and publication date, make sure to remove any leading or trailing spaces.

HINT: The rows containing the title, publication date, and ISBN all share one class, and the rows containing the descriptions all share another. zip could be very helpful to make sure that each book's data stays together.

Args:
    website (str) - URL of a website
Returns:
    dict

>>> acotar("https://en.wikipedia.org/wiki/A_Court_of_Thorns_and_Roses")
{1: {'ISBN': '9781619634442',
     'Publication Date': 'May 5, 2015',
     'Synopsis': 'Nineteen-year-old Feyre kills a wolf in the woods, and a beast-like creature demands punishment for it. She is taken to the land of the faerie by her captor, Tamlin, who is an immortal faerie himself. She comes to live with him at his estate. She comes to learn that he is a High Lord of Prythian, and Feyre realizes that what she has previously learnt about the dangerous world of the faeries is all a lie.',
     'Title': 'A Court of Thorns and Roses'},
 ...

pprint(acotar("https://en.wikipedia.org/wiki/A_Court_of_Thorns_and_Roses"))
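
The cleaning rules and the zip hint in the question can be tried out without touching the network. The sketch below is not a full solution: it assumes the cell text has already been pulled out of the two kinds of table rows, and the helper names (clean_isbn, clean_synopsis, build_books) and the raw sample strings (including the hyphenated "ISBN 978-..." form and the stray whitespace) are made up for illustration. Only the cleaned values come from the expected output shown in the question.

```python
import re


def clean_isbn(raw):
    """Return the 13-digit ISBN beginning with 978, or "" if none is found."""
    digits = re.sub(r"\D", "", raw)            # drop letters, hyphens, spaces
    match = re.search(r"978\d{10}", digits)    # 978 followed by 10 more digits
    return match.group() if match else ""


def clean_synopsis(raw):
    """Collapse newlines and repeated whitespace into single spaces."""
    return " ".join(raw.split())


def build_books(data_rows, synopsis_rows):
    """Pair each data row with the synopsis row beneath it using zip."""
    books = {}
    for (number, title, date, isbn), synopsis in zip(data_rows, synopsis_rows):
        books[int(number)] = {
            "ISBN": clean_isbn(isbn),
            "Publication Date": date.strip(),
            "Synopsis": clean_synopsis(synopsis),
            "Title": title.strip(),
        }
    return books


# Hypothetical pre-extracted cell text (assumed raw formatting).
data_rows = [
    ("1", " A Court of Thorns and Roses ", "May 5, 2015 ", "ISBN 978-1-61963-444-2"),
]
synopsis_rows = [
    "Nineteen-year-old Feyre kills a wolf in the woods,\n"
    "and a beast-like creature demands punishment for it.\n",
]

books = build_books(data_rows, synopsis_rows)
```

In a real solution the two input lists would presumably come from two BeautifulSoup queries, one per row class, taken in document order so that zip lines each book up with its own synopsis.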
