Question:
Rewrite Listing 12.18, WebCrawler.java, to improve the performance by using appropriate new data structures for listOfPendingURLs and listOfTraversedURLs.
Suppose you have Java source files under the directories chapter1, chapter2, . . . , chapter34. Write a program to insert the statement package chapteri; as the first line of each Java source file under the directory chapteri. Suppose chapter1, chapter2, . . . , chapter34 are under the root directory srcRootDirectory. The root directory and each chapteri directory may contain other folders and files. Use the following command to run the program:
java Exercise12_18 srcRootDirectory
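One possible sketch for Exercise12_18 (the recursive helper insertPackageStatement is an assumed name, not from the book): walk chapter1 through chapter34 under srcRootDirectory, descend into any nested folders, and prepend the package line to every .java file found.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class Exercise12_18 {
  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.out.println("Usage: java Exercise12_18 srcRootDirectory");
      System.exit(1);
    }
    File root = new File(args[0]);
    for (int i = 1; i <= 34; i++) {
      File chapterDir = new File(root, "chapter" + i);
      if (chapterDir.isDirectory())
        insertPackageStatement(chapterDir, "chapter" + i);
    }
  }

  /** Recursively prepends "package <packageName>;" to every .java file under dir. */
  public static void insertPackageStatement(File dir, String packageName)
      throws IOException {
    File[] files = dir.listFiles();
    if (files == null) return; // not a directory, or not readable
    for (File file : files) {
      if (file.isDirectory()) {
        insertPackageStatement(file, packageName); // descend into subfolders
      } else if (file.getName().endsWith(".java")) {
        Path path = file.toPath();
        // Read all lines, insert the package statement first, write back
        List<String> lines = new ArrayList<>(Files.readAllLines(path));
        lines.add(0, "package " + packageName + ";");
        Files.write(path, lines);
      }
    }
  }
}
```

Files that already contain a package statement would end up with two; a production version might check the first line before inserting, but the exercise does not require it.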
Listing 12.18 WebCrawler.java
import java.util.Scanner;
import java.util.ArrayList;

public class WebCrawler {
  public static void main(String[] args) {
    java.util.Scanner input = new java.util.Scanner(System.in);
    System.out.print("Enter a URL: ");
    String url = input.nextLine(); // enter a URL
    crawler(url); // Traverse the Web from a starting URL
  }

  public static void crawler(String startingURL) {
    ArrayList<String> listOfPendingURLs = new ArrayList<>(); // list of pending URLs
    ArrayList<String> listOfTraversedURLs = new ArrayList<>(); // list of traversed URLs

    listOfPendingURLs.add(startingURL); // add starting URL
    while (!listOfPendingURLs.isEmpty() &&
        listOfTraversedURLs.size() <= 100) {
      String urlString = listOfPendingURLs.remove(0); // get the first URL
      if (!listOfTraversedURLs.contains(urlString)) {
        listOfTraversedURLs.add(urlString); // URL traversed
        System.out.println("Crawl " + urlString);

        for (String s: getSubURLs(urlString)) {
          if (!listOfTraversedURLs.contains(s))
            listOfPendingURLs.add(s); // add a new URL
        }
      }
    }
  }

  public static ArrayList<String> getSubURLs(String urlString) {
    ArrayList<String> list = new ArrayList<>();

    try {
      java.net.URL url = new java.net.URL(urlString);
      Scanner input = new Scanner(url.openStream());
      int current = 0;
      while (input.hasNext()) {
        String line = input.nextLine(); // read a line
        current = line.indexOf("http:", current); // search for a URL
        while (current > 0) {
          int endIndex = line.indexOf("\"", current); // end of a URL
          if (endIndex > 0) { // Ensure that a correct URL is found
            list.add(line.substring(current, endIndex)); // extract a URL
            current = line.indexOf("http:", endIndex); // search for next URL
          }
          else
            current = -1;
        }
      }
    }
    catch (Exception ex) {
      System.out.println("Error: " + ex.getMessage());
    }

    return list; // return URLs
  }
}
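One way to approach the first question (a sketch, using an assumed class name WebCrawlerImproved and keeping the listing's 100-URL limit): replace the ArrayList holding listOfPendingURLs with a LinkedList used as a Queue, and the ArrayList holding listOfTraversedURLs with a HashSet. This turns the O(n) remove(0) and contains() calls into O(1) operations. Unlike the original void method, crawler here returns the traversed set so the result can be inspected.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Queue;
import java.util.Scanner;
import java.util.Set;

public class WebCrawlerImproved {
  public static void main(String[] args) {
    Scanner input = new Scanner(System.in);
    System.out.print("Enter a URL: ");
    crawler(input.nextLine());
  }

  /** Crawls from startingURL; returns the set of traversed URLs. */
  public static Set<String> crawler(String startingURL) {
    // LinkedList as a Queue: poll() removes the head in O(1),
    // unlike ArrayList.remove(0), which shifts every element (O(n))
    Queue<String> listOfPendingURLs = new LinkedList<>();
    // HashSet: contains()/add() are O(1) on average,
    // unlike ArrayList.contains(), which scans the whole list (O(n))
    Set<String> listOfTraversedURLs = new HashSet<>();

    listOfPendingURLs.offer(startingURL);
    while (!listOfPendingURLs.isEmpty() && listOfTraversedURLs.size() <= 100) {
      String urlString = listOfPendingURLs.poll();
      // Set.add returns false if the URL was already traversed,
      // replacing the separate contains() check
      if (listOfTraversedURLs.add(urlString)) {
        System.out.println("Crawl " + urlString);
        for (String s : getSubURLs(urlString)) {
          if (!listOfTraversedURLs.contains(s))
            listOfPendingURLs.offer(s);
        }
      }
    }
    return listOfTraversedURLs;
  }

  public static ArrayList<String> getSubURLs(String urlString) {
    ArrayList<String> list = new ArrayList<>();
    try {
      java.net.URL url = new java.net.URL(urlString);
      Scanner input = new Scanner(url.openStream());
      int current = 0;
      while (input.hasNext()) {
        String line = input.nextLine();
        current = line.indexOf("http:", current); // search for a URL
        while (current > 0) {
          int endIndex = line.indexOf("\"", current); // a URL ends with a quote
          if (endIndex > 0) {
            list.add(line.substring(current, endIndex)); // extract the URL
            current = line.indexOf("http:", endIndex); // search for the next URL
          }
          else
            current = -1;
        }
      }
    }
    catch (Exception ex) {
      System.out.println("Error: " + ex.getMessage());
    }
    return list;
  }
}
```

java.util.ArrayDeque would also work as the queue and is generally faster than LinkedList, but LinkedList matches the data structures covered in this chapter.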