Question: Need help with Python problem:
Write a class ListParser that is a subclass of the HTMLParser class. It will find and collect the contents of all the list items, both ordered and unordered, in an HTML file fed into it. The parser works by identifying and remembering, via a class variable, when a list item tag has been encountered. When the data handler for the class is called and the class variable indicates that a list item is currently open, the data in the list item is added to a list in the class. When the list item tag is closed, the class adjusts the internal variable to register this. To implement this parser you will need to override the following methods of the HTMLParser class:
__init__: The constructor should call the constructor for the HTMLParser class and create and initialize the necessary class variables.
handle_starttag: If the tag that resulted in the method being called is a list item, the appropriate class variable should be set.
handle_endtag: If the tag that resulted in the method being called is a list item, the appropriate class variable should be unset.
handle_data: If the parser is currently inside a list item, the data should be added to the list of list items in the class. Strip any extra spaces or newlines off the contents of the list item before appending it to the class variable.
getItems: Returns the list of list items collected by the class.
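One way the methods above could fit together is sketched below. This is a minimal sketch, not the official solution: the attribute names `in_item` and `items` are hypothetical (the problem only asks for "the necessary class variables"), and skipping text that is empty after stripping is an added assumption so that whitespace between nested tags does not produce empty entries.

```python
from html.parser import HTMLParser


class ListParser(HTMLParser):
    """Collect the text of every <li> element fed to the parser."""

    def __init__(self):
        # Initialize the HTMLParser machinery first.
        super().__init__()
        self.in_item = False  # hypothetical name: True while an <li> is open
        self.items = []       # hypothetical name: collected list-item text

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "li":
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item:
            text = data.strip()
            # Assumption: ignore data that is only whitespace.
            if text:
                self.items.append(text)

    def getItems(self):
        return self.items


# Small inline sample standing in for a downloaded page.
sample = """
<ol>
  <li>Item 1</li>
  <li> Item 2 </li>
</ol>
<ul>
  <li>Item A</li>
  <li>Item B
    <ul><li>Item B1</li></ul>
  </li>
</ul>
<p>Not a list item</p>
"""

parser = ListParser()
parser.feed(sample)
print(parser.getItems())
# ['Item 1', 'Item 2', 'Item A', 'Item B', 'Item B1']
```

Because the state is a single flag, closing a nested list item also marks the outer item as closed, so any text that follows a nested list inside the same outer item is not collected. A counter or stack would be needed if that text mattered.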
The following shows what the test function would display on some sample web pages:

[Screenshot: Python 3.4.1 Shell session in which testLParser is called on three sample pages (list1.html, list2.html, and test.html). Legible items in the output for list1.html include 'Item 1', 'Item 2', 'Item A', 'Item B', 'Item B1', 'Item B2', 'Item B3', and 'Item C'. Legible items for list2.html include 'Cat', 'Dog', 'Hermit crab', 'Java', 'C++', 'Lisp', 'Scheme', 'Python', 'English', 'German', 'Finnish', 'Spanish', 'Work days', 'Monday', 'Montag', 'maanantai', 'Weekend', 'Saturday', 'Samstag', and 'lauantai'.]
