Question: The author uses Case Study 12.13: Web Crawler to demonstrate how to read text from Web pages, parse out URLs and store them in "stacks" created with ArrayLists, and crawl (move) to the URLs found on each page.

This assignment uses the program found on page 485 of your textbook.


The java.net.URL url = new java.net.URL(urlString) statement on line 36 of the getSubURLs() method is what connects to the Web page: the program opens that URL's stream and reads the page's text directly (the page is not displayed in your Web browser). We are first going to create a new NetBeans project and copy the program into the class file, and then test the program with a specific URL. As part 2, we will modify the program to identify the URLs found only in the test URL, without actually moving to other pages. Make sure you read the Help document for the Snippet assignment. PLEASE NOTE: The line numbers referenced below are from program Listing 12.18, found on page 485 of your textbook.

As part 1, in NetBeans create a project called "{YourLastName}SnippetWeek06", replacing {YourLastName} (including the braces) with your last name. Let NetBeans create a main() method. Then copy and paste the WebCrawler.txt file into the class with the main() method and replace the class name WebCrawler with the name that matches your class. Execute the program using one of the test URLs below; a sketch of getSubURLs() follows the list.

Test URLs:

https://www.google.com
https://www.w3schools.com
http://www.msn.com
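
For reference, here is a minimal, self-contained sketch of what getSubURLs() does, written in the spirit of Listing 12.18 (the demo class name and the main() test harness are placeholders, and minor details may differ from your edition of the listing). It opens the URL's stream, reads the page line by line, and collects every substring that starts with http: and ends at the next quotation mark:

import java.util.ArrayList;
import java.util.Scanner;

public class SubURLDemo {
  // Reads the page at urlString and returns every http: URL found in it.
  public static ArrayList<String> getSubURLs(String urlString) {
    ArrayList<String> list = new ArrayList<>();
    try {
      // Line 36 of the listing: create the URL object, then open its stream
      // so the program (not the browser) can read the page's text.
      java.net.URL url = new java.net.URL(urlString);
      Scanner input = new Scanner(url.openStream());
      while (input.hasNext()) {
        String line = input.nextLine();
        int current = line.indexOf("http:");
        while (current >= 0) {
          int endIndex = line.indexOf("\"", current); // a URL ends at the next quote
          if (endIndex > 0) {
            list.add(line.substring(current, endIndex));
            current = line.indexOf("http:", endIndex); // look for the next URL
          }
          else
            current = -1; // no closing quote on this line; stop scanning it
        }
      }
    }
    catch (Exception ex) {
      System.out.println("Error: " + ex.getMessage());
    }
    return list;
  }

  public static void main(String[] args) {
    // Quick test: print the sub-URLs found on one of the test pages.
    for (String s : getSubURLs("https://www.w3schools.com"))
      System.out.println(s);
  }
}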

As part 2, create a project called "{YourLastName}SnippetWeek06Part2", replacing {YourLastName} (including the braces) with your last name. Let NetBeans create a main() method. Then copy and paste the WebCrawler.txt file into the class with the main() method and replace the class name WebCrawler with the name that matches your class. Modify the program to identify the URLs contained only in the page the user enters at the prompt. For each URL, report the line on which it appears (this can be done with a simple counter incremented after every line is read) and, if a URL is repeated, the number of times it has appeared. Lines 24-27 are what cause a URL to be loaded as the source of our input; comment them out and getSubURLs() will not be executed, then make the modifications needed to perform the printing described above. Make sure you read the Help document for this assignment. For the output, use the same test pages referenced above. A sketch of one possible modification follows.
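
As a starting point only (the class name, prompt text, and output wording here are placeholders, not requirements), here is a minimal sketch of the Part 2 idea: read the page the user enters, keep a line counter that is incremented after every line is read, and use a map from URL to occurrence count so repeats can be reported:

import java.util.HashMap;
import java.util.Scanner;

public class SnippetWeek06Part2Sketch {
  public static void main(String[] args) {
    Scanner input = new Scanner(System.in);
    System.out.print("Enter a URL: ");
    String urlString = input.nextLine();

    HashMap<String, Integer> counts = new HashMap<>(); // URL -> times seen so far
    try {
      java.net.URL url = new java.net.URL(urlString);
      Scanner page = new Scanner(url.openStream());
      int lineNumber = 0; // incremented after every line is read
      while (page.hasNext()) {
        String line = page.nextLine();
        lineNumber++;
        int current = line.indexOf("http:");
        while (current >= 0) {
          int endIndex = line.indexOf("\"", current);
          if (endIndex > 0) {
            String found = line.substring(current, endIndex);
            int n = counts.getOrDefault(found, 0) + 1; // count this occurrence
            counts.put(found, n);
            System.out.println("Line " + lineNumber + ", occurrence " + n + ": " + found);
            current = line.indexOf("http:", endIndex);
          }
          else
            current = -1;
        }
      }
    }
    catch (Exception ex) {
      System.out.println("Error: " + ex.getMessage());
    }
  }
}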

12.13 Case Study: Web Crawler

This case study develops a program that travels the Web by following hyperlinks.

The World Wide Web, abbreviated as WWW, W3, or Web, is a system of interlinked hypertext documents on the Internet. With a Web browser, you can view a document and follow the hyperlinks to view other documents. In this case study, we will develop a program that automatically traverses the documents on the Web by following the hyperlinks. This type of program is commonly known as a Web crawler. For simplicity, our program follows only the hyperlinks that start with http://.

Figure 12.11 shows an example of traversing the Web. We start from a Web page that contains three URLs named URL1, URL2, and URL3. Following URL1 leads to a page that contains three URLs named URL11, URL12, and URL13. Following URL2 leads to a page that contains two URLs named URL21 and URL22. Following URL3 leads to a page that contains four URLs named URL31, URL32, URL33, and URL34. Continue to traverse the Web following the new hyperlinks. As you can see, this process may continue forever, but we will exit the program once we have traversed 100 pages.

[Figure 12.11: The client retrieves files from a Web server. The program follows the URLs to traverse the Web.]

To ensure that each URL is traversed only once, the program maintains two lists of URLs. One list stores the URLs pending for traversing and the other stores the URLs that have already been traversed. The algorithm for this program can be described as follows:

Add the starting URL to a list named listOfPendingURLs;
while listOfPendingURLs is not empty and size of listOfTraversedURLs <= 100
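
To make the two-list algorithm concrete, here is a minimal sketch of the traversal loop the excerpt describes (a sketch of the pseudocode above, not the textbook's exact listing; the getSubURLs() placeholder stands in for the method sketched earlier):

import java.util.ArrayList;

public class CrawlerLoopSketch {
  // Traverses the Web from startingURL, visiting each URL at most once
  // and stopping after roughly 100 pages, as the algorithm describes.
  public static void crawler(String startingURL) {
    ArrayList<String> listOfPendingURLs = new ArrayList<>();
    ArrayList<String> listOfTraversedURLs = new ArrayList<>();

    listOfPendingURLs.add(startingURL);
    while (!listOfPendingURLs.isEmpty() && listOfTraversedURLs.size() <= 100) {
      String urlString = listOfPendingURLs.remove(0); // take the next pending URL
      if (!listOfTraversedURLs.contains(urlString)) {
        listOfTraversedURLs.add(urlString);           // mark it as traversed
        System.out.println("Crawl " + urlString);
        // Queue every URL found on this page that has not been visited yet.
        for (String s : getSubURLs(urlString)) {
          if (!listOfTraversedURLs.contains(s))
            listOfPendingURLs.add(s);
        }
      }
    }
  }

  // Placeholder so this sketch compiles on its own; substitute the
  // getSubURLs() method sketched earlier to actually parse pages.
  public static ArrayList<String> getSubURLs(String urlString) {
    return new ArrayList<>();
  }

  public static void main(String[] args) {
    crawler("https://www.google.com");
  }
}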
