Question: Add an optional parameter limit, with a default of 10, to the crawl() function; this is the maximum number of web pages to download. Save files to a pages directory, naming each file with the MD5 hash of the page's URL, and only crawl URLs in the landmark.edu domain (*.landmark.edu).
Use a regular expression when examining discovered links. Snippets given with the question:

    import hashlib
    filename = 'pages/' + hashlib.md5(url.encode()).hexdigest() + '.html'

    import re
    p = re.compile('ab*')
    if p.match('abc'):
        print("yes")
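A minimal sketch of how these hints could fit together, assuming a breadth-first crawl driven by a queue. The start_url parameter, the DOMAIN_RE and LINK_RE patterns, and the use of urllib.request are illustrative assumptions, not a verified solution:

    import hashlib
    import os
    import re
    import urllib.request
    from collections import deque
    from urllib.parse import urljoin, urlparse

    # Assumption: a host qualifies if it is landmark.edu or any subdomain of it.
    DOMAIN_RE = re.compile(r'(^|\.)landmark\.edu$')
    # Assumption: a simple href regex; a production crawler would use an HTML parser.
    LINK_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)

    def crawl(start_url, limit=10):
        """Breadth-first crawl that downloads at most `limit` pages."""
        os.makedirs('pages', exist_ok=True)
        queue = deque([start_url])
        visited = set()
        downloaded = 0

        while queue and downloaded < limit:
            url = queue.popleft()
            if url in visited:
                continue
            visited.add(url)

            # Only crawl URLs in the landmark.edu domain (*.landmark.edu).
            host = urlparse(url).hostname or ''
            if not DOMAIN_RE.search(host):
                continue

            try:
                with urllib.request.urlopen(url) as resp:
                    html = resp.read().decode('utf-8', errors='replace')
            except Exception:
                continue  # skip pages that fail to download

            # Save under pages/ using the MD5 hash of the page's URL.
            filename = 'pages/' + hashlib.md5(url.encode()).hexdigest() + '.html'
            with open(filename, 'w', encoding='utf-8') as f:
                f.write(html)
            downloaded += 1

            # Examine discovered links with the regex and queue them for crawling.
            for link in LINK_RE.findall(html):
                queue.append(urljoin(url, link))

Calling crawl('https://www.landmark.edu/') would download at most 10 pages, while crawl('https://www.landmark.edu/', limit=25) raises the cap to 25. Hashing each URL with MD5 keeps filenames a fixed length and free of characters such as / and ? that are unsafe in paths, and the regex-based href extraction mirrors the re hint above.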
