Question: Add an optional parameter limit with a default of 10 to crawl() function which is the maximum number of web pages to download . Save

Add an optional parameter limit with a default of 10 to crawl() function which is the maximum number of web pages to download . Save files to pages dir using the MD5 hash of the pages URL and Only crawl URLs that are in landmark.edu domain (*.landmark.edu)

Use a regular expression when examining discovered links . import hashlib filename = 'pages/' + hashlib.md5(url.encode()).hexdigest() + '.html' import re p = re.compile('ab*') if p.match('abc'): print("yes")

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!