Question: Python Programming Use the Python requests library and Beautiful Soup library to create a Python script that scrapes and displays the HTML links and images from the

Python Programming

Use the Python requests library and Beautiful Soup library to create a Python script that “scrapes” and displays the HTML links and images from the home page of the Smithsonian institute. (Si.org)

  • Your program must write each link found to an output file named “weblinks.txt” and each image found on the home page to a file named “webimages.txt”.
  • Any image or link that contains the word “art” must be written to a file named art.txt. Make sure your program is NOT case sensitive when evaluating the word “art”.

What I have so far:

# HTTP GET a page and save it in a Python string variable
# check HTTP code
from bs4 import BeautifulSoup
import urllib.request, urllib.parse, urllib.error
# HTTP response from the site
resp = urllib.request.urlopen('https://www.si.edu/')

soup = BeautifulSoup(resp, "html.parser")
# get a list of anchor tags
tags = soup('a')
print(type(tags))
# print every link target
for item in tags:
    print(item.get('href', None))
# print only the links whose tag contains "art" (case-insensitive)
for item in tags:
    if "art" in str(item).lower():
        print(item.get('href', None))

# save downloaded file to disk
try:
    resp = urllib.request.urlopen('https://www.si.edu/')
    bytesToWrite = resp.read()

    # must write as binary to maintain unicode formatting
    # note: this saves the raw page HTML, not the extracted links
    myFile = open("weblinks.txt", 'wb')
    myFile.write(bytesToWrite)
    myFile.close()

except Exception as exc:
    print('An error occurred. ' + str(exc))
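
The attempt above fetches the page with urllib and saves the raw HTML to weblinks.txt, whereas the assignment asks for the requests library and for the extracted links and images to be written out. One possible way to structure that is sketched below; it is not the only approach. The helper names (extract_links_and_images, contains_art, write_lines, main) are made up for illustration. The sketch also assumes that the "art" test applies to the URL only (the attempt above tests the whole tag), that a plain substring match is acceptable (so "start" also matches), and that relative URLs should be resolved against https://www.si.edu/.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

HOME_PAGE = "https://www.si.edu/"  # URL taken from the attempt above


def extract_links_and_images(html, base_url=HOME_PAGE):
    """Return (links, images) found in html as lists of absolute URLs."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True / src=True skip tags that lack the attribute entirely.
    links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
    images = [urljoin(base_url, img["src"]) for img in soup.find_all("img", src=True)]
    return links, images


def contains_art(text):
    """Case-insensitive substring check for 'art' (assumption: also matches 'start')."""
    return "art" in text.lower()


def write_lines(path, items):
    """Write one item per line as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as out:
        for item in items:
            out.write(item + "\n")


def main():
    """Fetch the home page, display what was found, and write the three files."""
    resp = requests.get(HOME_PAGE, timeout=30)
    resp.raise_for_status()  # stop on 4xx/5xx instead of parsing an error page
    links, images = extract_links_and_images(resp.text)

    for url in links + images:
        print(url)

    write_lines("weblinks.txt", links)
    write_lines("webimages.txt", images)
    write_lines("art.txt", [u for u in links + images if contains_art(u)])
```

Calling main() runs the scrape against the live site. The parsing and filtering helpers take plain strings, so they can be tried on a small HTML snippet without any network access.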
