Question:
# starter code
import requests
from bs4 import BeautifulSoup
import pandas as pd
Please don't import anything else; it is not allowed for the assignment.

My previous code (it scrapes product information for books in multiple genres but only for the first page):
url = "https://books.toscrape.com/"
genres = ["Travel", "Mystery", "Historical Fiction", "Sequential Art", "Classics", "Philosophy"]
data3 = []
response = requests.get(url, timeout=3)
soup = BeautifulSoup(response.content, 'html.parser')
genre_url_elems = soup.select(".side_categories>ul>li>ul>li>a")
### was stuck here for a while, got help from Stack Overflow, link: https://stackoverflow.com/questions/75191258/having-trouble-extracting-the-url-from-a-website
genre_urls = [a['href'] for a in genre_url_elems]
for a in genre_urls[0:6]:
    new_url = url + a
    response2 = requests.get(new_url)
    soup2 = BeautifulSoup(response2.content, 'html.parser')
    body2 = soup2.find_all('article', {'class': 'product_pod'})
    for b in body2:
        # href of the first link inside the product card
        imageurl = b.find('a').get('href')
        title = b.find('h3').find('a').get('title')
        # the price sits in <p class="price_color"> inside the product_price div
        price = b.find('div', {'class': 'product_price'}).find('p', {'class': 'price_color'}).get_text()
        rating = b.find('p')['class'][-1]
        data3.append([title, imageurl, price, rating, a[25:-13]])
df3 = pd.DataFrame(data3, columns=['book title', 'image url', 'price', 'rating', 'genre'])
df3.index += 1  ### just didn't want the index to start from 0 again
df3
Extend your code further to enable it to process multi-page results.
- Tips: For a non-existent web page, the response from the website contains a status code, which can be accessed and tested as follows: response.status_code == 404
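One possible way to apply the tip is sketched below, using only the imports the starter code allows. It assumes that the first page of a genre is index.html and that later pages are named page-2.html, page-3.html, and so on; that naming pattern is an assumption about the site, not something stated in the assignment. The names scrape_genre, BASE_URL and the get parameter are hypothetical and exist only for this illustration (get lets the HTTP call be swapped out for testing).

```python
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://books.toscrape.com/"  # same site as in the question


def scrape_genre(genre_href, get=requests.get):
    """Collect [title, link, price, rating] rows from every page of one genre.

    genre_href is a relative link such as
    "catalogue/category/books/mystery_3/index.html".
    """
    rows = []
    page = 1
    while True:
        # Assumption: page 1 is index.html, later pages are page-2.html, ...
        page_name = "index.html" if page == 1 else f"page-{page}.html"
        page_url = BASE_URL + genre_href.replace("index.html", page_name)
        response = get(page_url, timeout=3)
        if response.status_code == 404:
            break  # requested a page past the last one for this genre
        soup = BeautifulSoup(response.content, "html.parser")
        pods = soup.find_all("article", {"class": "product_pod"})
        if not pods:
            break  # safety net: stop if a page lists no books
        for pod in pods:
            link = pod.find("a").get("href")
            title = pod.find("h3").find("a").get("title")
            price = pod.find("p", {"class": "price_color"}).get_text()
            rating = pod.find("p")["class"][-1]
            rows.append([title, link, price, rating])
        page += 1
    return rows
```

In the original loop, each genre link could then be passed to scrape_genre, with the genre name (for example a[25:-13]) appended to every returned row before building the DataFrame with pd.DataFrame as before.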
