Write a C++ program for extracting the text from a webpage
Question:
Write a C++ program for extracting the text from a webpage. Implement 4 functions, described below. Code should follow C++ conventions, and it should not use any libraries other than those we have discussed in class so far.
- string readFile(string filename)
  - This function opens the file with the given name, reads in the entirety of the file, and returns a string containing the file contents.
  - It should return the empty string if the file doesn't exist.
- string extractParagraphs(string content)
  - This function should return all of the paragraph contents in the given HTML content, and each paragraph should be followed by two newlines (\n\n). Paragraphs in HTML start with a paragraph start tag (<p>) and end with a paragraph end tag (</p>), and the contents are between these two tags. The output should not include any of the start or end tags.
- string removeTags(string content)
  - This function should search the given HTML content and remove all of the HTML tags. All HTML tags start with a less-than sign (<) and end with a greater-than sign (>), and anything that begins with a less-than sign and ends with a greater-than sign is an HTML tag.
- int main()
  - main() should use the other 3 functions to read in the contents of input.html, extract all of the paragraphs in this file, remove the tags from the paragraphs, and print the result to cout.
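
The three helper functions described above could be sketched roughly as follows. This is one possible approach, not a reference solution. It assumes that <iostream>, <fstream>, and <string> are among the libraries covered in class (the question does not say which ones are), that the compiler supports C++11, that paragraph start tags are written exactly as <p> in lowercase with no attributes, and that paragraphs are not nested. An unterminated paragraph or an unclosed < is handled in a way the question does not specify; the comments mark those choices.

```cpp
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

// Reads the entire file into a string.
// Returns the empty string if the file cannot be opened (e.g. it does not exist).
string readFile(string filename) {
    ifstream in(filename);
    if (!in) {
        return "";
    }
    string contents;
    char c;
    // get() keeps every character, including newlines and spaces.
    while (in.get(c)) {
        contents += c;
    }
    return contents;
}

// Returns the text between each <p> ... </p> pair, each followed by "\n\n".
// Assumption: the start tag is exactly "<p>" (lowercase, no attributes).
string extractParagraphs(string content) {
    const string startTag = "<p>";
    const string endTag = "</p>";
    string result;
    size_t start = content.find(startTag);
    while (start != string::npos) {
        size_t textBegin = start + startTag.length();
        size_t end = content.find(endTag, textBegin);
        if (end == string::npos) {
            break;  // assumption: ignore a paragraph that is never closed
        }
        result += content.substr(textBegin, end - textBegin) + "\n\n";
        start = content.find(startTag, end + endTag.length());
    }
    return result;
}

// Removes everything from a '<' up to and including the next '>'.
// Assumption: text after an unclosed '<' is treated as part of a tag and dropped.
string removeTags(string content) {
    string result;
    bool insideTag = false;
    for (size_t i = 0; i < content.length(); i++) {
        char c = content[i];
        if (c == '<') {
            insideTag = true;
        } else if (c == '>' && insideTag) {
            insideTag = false;
        } else if (!insideTag) {
            result += c;
        }
    }
    return result;
}
```

With these three in place, main() reduces to chaining them, along the lines of cout << removeTags(extractParagraphs(readFile("input.html"))); followed by return 0;. Extracting paragraphs before removing tags matters here: if the tags were stripped first, the <p> markers that extractParagraphs looks for would already be gone.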