Question: C++ This C++ assignment is to extract texts in a webpage and save them in a text file with predefined formats. Your program should use the following table to convert an HTML file into a pure text file.
*Here is the required sample .html file: http://www.mediafire.com/file/48jgo660pkb3qbj/Sample.html

One simple way to extract the text in a webpage is to remove all HTML tags enclosed by < > pairs. However, the extracted text will be one long character string. This project is to extract texts in a webpage and save them in a text file with predefined formats. Your program should use the following table to convert an HTML file into a pure text file.

HTML Tags | HTML example | Converted text
Page title | My webpage title | === My webpage title
Headings (heading 3 to 6 use one '#') | Heading 1 / Heading 2 / Heading 3 | # Heading 1 / ## Heading 2 / ### Heading 3
Paragraphs | This is paragraph 1. | A blank line before and after the paragraph.
Line breaks | | A new line
Unordered lists | Bullet One / Bullet Two / Bullet three | Bullet One / Bullet Two / Bullet Three
Ordered lists | Item One / Item Two / Item three | 1. Item One / 2. Item Two / 3. Item Three
Links | Link Description | Link Description (http://www.abc.com)

In this project, you can define your own styles for the tags not in the above table, and/or ignore more complicated tags (e.g. tables).

Report
You should submit your C++ code and a report with captured screens of HTML files, HTML sources and formatted text files. A sample HTML file (sample.html) was uploaded to Blackboard for you to test your output.
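The "simple way" mentioned at the start of the assignment, removing everything enclosed between '<' and '>', might look roughly like the sketch below. The function name `stripTags` is made up for illustration, and the code assumes that no '<' or '>' characters occur inside attribute values, comments, or scripts.

```cpp
#include <string>

// Naive tag stripper (hypothetical helper, not part of the assignment):
// drops every character from '<' up to and including the next '>'.
// Assumption: '<' and '>' only ever appear as tag delimiters.
std::string stripTags(const std::string& html) {
    std::string text;
    bool insideTag = false;
    for (char c : html) {
        if (c == '<') {
            insideTag = true;          // start skipping
        } else if (c == '>') {
            insideTag = false;         // tag ended; a stray '>' in text is also dropped
        } else if (!insideTag) {
            text += c;                 // ordinary text is kept as-is
        }
    }
    return text;
}
```

Calling `stripTags("<h1>Heading 1</h1><p>This is paragraph 1.</p>")` yields `Heading 1This is paragraph 1.`, which shows the problem the assignment describes: with the tags gone, the structure is gone too, and the text runs together.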
Step by Step Solution
Below is a simple C program that reads an HTML file, extracts text based on the provided table, and saves the formatted text into a new text file includ...
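The complete answer is not reproduced here, so the following is only a minimal C++ sketch of one way to approach the conversion, not the original solution. All function names (`convertHtml`, `convertFile`, and the helpers) are hypothetical. It assumes the standard tags `<title>`, `<h1>` to `<h6>`, `<p>`, `<br>`, `<ul>`, `<ol>`, `<li>` and `<a href="...">` with a lower-case, double-quoted `href`. It follows the example output for headings (`###` for heading 3, and by assumption for headings 4 to 6 as well), uses `* ` as the bullet marker since the table does not show one, and ignores every other tag, including nested lists.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Remove trailing spaces so lines do not end in blanks.
void trimRight(std::string& out) {
    while (!out.empty() && out.back() == ' ') out.pop_back();
}
// Make sure the next output starts on a fresh line.
void newLine(std::string& out) {
    trimRight(out);
    if (!out.empty() && out.back() != '\n') out += '\n';
}
// Make sure there is an empty line before the next output.
void blankLine(std::string& out) {
    newLine(out);
    if (!out.empty() && (out.size() < 2 || out[out.size() - 2] != '\n')) out += '\n';
}
// Append one text character, collapsing runs of whitespace into one space.
void appendText(std::string& out, char c) {
    if (std::isspace(static_cast<unsigned char>(c))) {
        if (!out.empty() && out.back() != ' ' && out.back() != '\n') out += ' ';
    } else {
        out += c;
    }
}
// Lower-cased element name of a tag body such as "/H1" or "a href=...".
std::string tagName(const std::string& tag) {
    std::string name;
    std::size_t i = (!tag.empty() && tag[0] == '/') ? 1 : 0;
    for (; i < tag.size() && std::isalnum(static_cast<unsigned char>(tag[i])); ++i)
        name += static_cast<char>(std::tolower(static_cast<unsigned char>(tag[i])));
    return name;
}
// Value of href="..." inside a tag body (assumes lower case and double quotes).
std::string hrefValue(const std::string& tag) {
    std::size_t pos = tag.find("href=\"");
    if (pos == std::string::npos) return "";
    pos += 6;
    std::size_t end = tag.find('"', pos);
    return end == std::string::npos ? "" : tag.substr(pos, end - pos);
}

// Convert HTML source into the plain-text format sketched in the table.
std::string convertHtml(const std::string& html) {
    std::string out, href;
    bool ordered = false;   // true while inside <ol>
    int item = 0;           // running number for ordered list items
    std::size_t i = 0;
    while (i < html.size()) {
        if (html[i] != '<') { appendText(out, html[i++]); continue; }
        std::size_t close = html.find('>', i);
        if (close == std::string::npos) break;           // unterminated tag: stop
        std::string tag = html.substr(i + 1, close - i - 1);
        i = close + 1;
        bool closing = !tag.empty() && tag[0] == '/';
        std::string name = tagName(tag);
        if (name == "title") {
            newLine(out);
            if (!closing) out += "=== ";
        } else if (name.size() == 2 && name[0] == 'h' && name[1] >= '1' && name[1] <= '6') {
            newLine(out);
            // Assumption: headings 3 to 6 all use "###", as in the example output.
            std::size_t level = static_cast<std::size_t>(std::min(name[1] - '0', 3));
            if (!closing) out += std::string(level, '#') + " ";
        } else if (name == "p") {
            blankLine(out);
        } else if (name == "br") {
            trimRight(out);
            out += '\n';
        } else if (name == "ul" || name == "ol") {
            newLine(out);
            ordered = !closing && name == "ol";
            item = 0;
        } else if (name == "li") {
            newLine(out);
            if (!closing) out += ordered ? std::to_string(++item) + ". " : "* ";
        } else if (name == "a") {
            if (!closing) { href = hrefValue(tag); }
            else if (!href.empty()) { trimRight(out); out += " (" + href + ")"; href.clear(); }
        }   // every other tag is silently dropped
    }
    trimRight(out);
    return out;
}

// Read inPath, convert it, and write the result to outPath.
bool convertFile(const std::string& inPath, const std::string& outPath) {
    std::ifstream in(inPath);
    if (!in) return false;
    std::ostringstream buffer;
    buffer << in.rdbuf();
    std::ofstream out(outPath);
    if (!out) return false;
    out << convertHtml(buffer.str());
    return static_cast<bool>(out);
}
```

A `main` function could simply call `convertFile("Sample.html", "Sample.txt")` and report an error if it returns `false`; the output file name here is an arbitrary choice. Text inside `<script>` or `<style>` elements is not filtered out by this sketch, so a real page may need those elements skipped explicitly.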
