Question: How do I create a HeadingParser class that parses an HTML document and prints all of its headings?
I want to create a class named HeadingParser that can be used to parse an HTML document and to retrieve and print all the headings in the document. I need to implement my class as a subclass of HTMLParser, defined in the Standard Library module html.parser. When fed a string containing HTML code, my class should print the headings, one per line and in the order in which they appear in the document. Each heading should be indented according to its level: an h1 heading should have indentation 0, an h2 heading should have indentation 1, and so on. I have attached the file I am using.
>>> infile = open('w3c.html')
>>> content = infile.read()
>>> infile.close()
>>> hp = HeadingParser()
>>> hp.feed(content)
Output:
W3C Mission
 Principles
The text in the w3c.html file:
<html>
<head>
<title>W3C Mission Summary</title>
</head>
<body>
<h1>W3C Mission</h1>
<p>
The W3C mission is to lead the World Wide Web to its full potential<br>
by developing protocols and guidelines that ensure the long-term growth of the Web.
</p>
<h2>Principles</h2>
<ul>
<li>Web for All</li>
<li>Web on Everything</li>
</ul>
See the complete <a href="http://www.w3.org/Consortium/mission.html">W3C Mission document</a>.
</body>
</html>
Step by Step Solution
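One possible approach, offered as a minimal sketch rather than a definitive answer: subclass HTMLParser and override handle_starttag, handle_data, and handle_endtag so that text is collected only while the parser is inside an h1 to h6 element. The sketch assumes that "indentation 1" means one space per heading level, that text inside nested tags (such as a link in a heading) belongs to the heading, and that runs of whitespace should be collapsed. The names HEADING_TAGS, _level, and _chunks are illustrative choices, not part of the original question.

```python
from html.parser import HTMLParser

# Tags treated as headings; HTMLParser reports tag names in lowercase.
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingParser(HTMLParser):
    """Print each heading on its own line, indented by (level - 1) spaces."""

    def __init__(self):
        super().__init__()
        self._level = None   # heading level while inside a heading, else None
        self._chunks = []    # text fragments collected for the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])   # "h2" -> 2
            self._chunks = []

    def handle_data(self, data):
        # Collect text only while inside a heading element.
        if self._level is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Assumption: collapse whitespace and use one space per level.
            text = " ".join("".join(self._chunks).split())
            print(" " * (self._level - 1) + text)
            self._level = None
```

With this class defined, the interactive session shown in the question (creating a HeadingParser and calling feed on the contents of w3c.html) should print the h1 heading with no indentation and the h2 heading indented by one space.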
