Question:

Using Python!

1. Copy the file web.py from class (or class notes) into your working folder.

2. Include the following imports at the top of your module (hopefully this is sufficient):

from web import LinkCollector   # make sure you did step 1
from html.parser import HTMLParser
from urllib.request import urlopen
from urllib.parse import urljoin
from urllib.error import URLError

3. Implement a class ImageCollector. This will be similar to the LinkCollector: given a string containing the HTML for a web page, it collects and can supply the (absolute) URLs of the images on that page. The URLs should be collected in a set that can be retrieved with the method getImages (the order of images will vary). Sample usage:

>>> ic = ImageCollector('http://www2.warnerbros.com/spacejam/movie/jam.htm')
>>> ic.feed(urlopen('http://www2.warnerbros.com/spacejam/movie/jam.htm').read().decode())
>>> ic.getImages()
{'http://www2.warnerbros.com/spacejam/movie/img/p-sitemap.gif', 'http://www2.warnerbros.com/spacejam/movie/img/p-jamcentral.gif'}

>>> ic = ImageCollector('http://www.kli.org/')
>>> ic.feed(urlopen('http://www.kli.org/').read().decode())
>>> ic.getImages()
{'http://www.kli.org/wp-content/uploads/2014/03/KLIbutton.gif', 'http://www.kli.org/wp-content/uploads/2014/03/KLIlogo.gif'}
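One possible sketch of the ImageCollector, modeled on the way LinkCollector subclasses HTMLParser (the exact LinkCollector interface in web.py may differ, so treat this as an assumption, not the official solution):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImageCollector(HTMLParser):
    """Collects the absolute URLs of images found in a page's HTML.

    Sketch assuming a LinkCollector-like design: the constructor takes
    the page's URL so relative src values can be made absolute.
    """
    def __init__(self, url):
        super().__init__()
        self.url = url          # base URL for absolutizing relative paths
        self.images = set()     # collected absolute image URLs

    def handle_starttag(self, tag, attrs):
        # image locations live in the src attribute of <img ...> tags
        if tag == 'img':
            for name, value in attrs:
                if name == 'src' and value is not None:
                    # urljoin turns a relative src into an absolute URL
                    self.images.add(urljoin(self.url, value))

    def getImages(self):
        return self.images
```

Feeding the collector a page's decoded HTML (as in the sample session above) fills the set returned by getImages.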

4. Implement a class ImageCrawler that inherits from the Crawler developed in web.py and will both crawl links and collect images. This is very easy by inheriting from and extending the Crawler class. You will need to collect images in a set. Hint: what does it mean to extend? Implementation details:

a. You must inherit from Crawler. Make sure that the module web.py is in your working folder and make sure that you import Crawler from the web module.

b. __init__ - extends Crawler's __init__ by adding a set attribute that will be used to store images

c. crawl - extends Crawler's crawl by creating an ImageCollector, opening the url, and then collecting any images from the url into the set of images being stored. I recommend that you collect the images before you call Crawler's crawl method.

d. getImages - returns the set of images collected

>>> c = ImageCrawler()
>>> c.crawl('http://www2.warnerbros.com/spacejam/movie/jam.htm',1,True)
>>> c.getImages()
{'http://www2.warnerbros.com/spacejam/movie/img/p-lunartunes.gif', 'http://www2.warnerbros.com/spacejam/movie/cmp/pressbox/img/r-blue.gif'}

>>> c = ImageCrawler()
>>> c.crawl('http://www.pmichaud.com/toast/',1,True)
>>> c.getImages()
{'http://www.pmichaud.com/toast/toast-6a.gif', 'http://www.pmichaud.com/toast/toast-2c.gif', 'http://www.pmichaud.com/toast/toast-4c.gif', 'http://www.pmichaud.com/toast/toast-6c.gif', 'http://www.pmichaud.com/toast/ptart-1c.gif', 'http://www.pmichaud.com/toast/toast-7b.gif', 'http://www.pmichaud.com/toast/krnbo24.gif', 'http://www.pmichaud.com/toast/toast-1b.gif', 'http://www.pmichaud.com/toast/toast-3c.gif', 'http://www.pmichaud.com/toast/toast-5c.gif', 'http://www.pmichaud.com/toast/toast-8a.gif'}
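Implementation details a-d above could be sketched as follows. Since web.py isn't shown here, the Crawler below is a minimal stand-in so the sketch reads on its own; in the assignment you would `from web import Crawler` instead, and the ImageCollector from step 3 is assumed to be defined in the same module:

```python
from urllib.request import urlopen
from urllib.error import URLError

# Minimal stand-in for web.Crawler so this sketch is self-contained;
# the real Crawler follows links to the given depth. In the assignment,
# replace this class with:  from web import Crawler
class Crawler:
    def __init__(self):
        self.visited = set()

    def crawl(self, url, depth, relativeOnly):
        self.visited.add(url)   # the real version also follows links

class ImageCrawler(Crawler):
    def __init__(self):
        super().__init__()      # b. extend Crawler's __init__ ...
        self.images = set()     # ... with a set that stores image URLs

    def crawl(self, url, depth, relativeOnly):
        # c. collect this page's images first, then let Crawler.crawl
        # follow the links as usual (ImageCollector is from step 3)
        try:
            ic = ImageCollector(url)
            ic.feed(urlopen(url).read().decode())
            self.images |= ic.getImages()
        except URLError:
            pass                # unreachable pages contribute no images
        super().crawl(url, depth, relativeOnly)

    def getImages(self):
        # d. the set of absolute image URLs collected so far
        return self.images
```

The "extend" hint means each overriding method does its extra work and then delegates to the parent's version via super().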

5. Implement a function scrapeImages: given a url, a filename, a depth, and a Boolean (relativeOnly), this function starts at url, crawls to depth, collects images, and then writes an html document containing the images to filename. This is not hard; use the ImageCrawler from the prior step. For example:

>>> scrapeImages('http://www2.warnerbros.com/spacejam/movie/jam.htm','jam.html',1,True)
>>> open('jam.html').read().count('img')
62

>>> scrapeImages('http://www.pmichaud.com/toast/', 'toast.html',1,True)
>>> open('toast.html').read().count('img')
11

Link to web.py: https://www.dropbox.com/s/obiyi7lnwc3rw0d/web.py?dl=0
