Question: You will construct a simple tokenizer class called URLLexer.java, which takes an array of regular expression strings (one per token category, in the exact order given above) and a string to tokenize. The class will implement the following methods:

The constructor sets up the tokenizer given the regular expressions.

reset(string) resets the tokenizer to the beginning of string, and sets up any other variables you may need to keep track of, such as current position in the input, the matching index for the token, etc.

nextToken() returns the next token, or null if there are no more tokens or if it encounters text in the string that cannot be tokenized by any of the regular expressions provided.

getMatchingIndex() returns the index, into the array of regular expression strings, of the expression that matched the token most recently returned by nextToken().

getPosition() returns current position in the token stream where the next token will be extracted.

main(...) runs the main program loop, as described below.
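The method descriptions above can be sketched roughly as follows. This is a minimal illustration, not the expected solution: the class name SketchLexer is hypothetical, the constructor takes the regular expressions as a parameter for convenience, and the sketch assumes that the first expression (in array order) that matches at the current position wins, which the assignment does not state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch class; the assignment's real class is URLLexer.
class SketchLexer {
    private final Pattern[] patterns; // one compiled pattern per token category
    private String input = "";
    private int position = 0;         // where the next token will be extracted
    private int matchingIndex = -1;   // category of the last token, -1 if none

    SketchLexer(String[] regexes) {
        patterns = new Pattern[regexes.length];
        for (int i = 0; i < regexes.length; i++) {
            patterns[i] = Pattern.compile(regexes[i]);
        }
    }

    void reset(String newInput) {
        input = newInput;
        position = 0;
        matchingIndex = -1;
    }

    int getMatchingIndex() { return matchingIndex; }

    int getPosition() { return position; }

    String nextToken() {
        matchingIndex = -1;
        if (position >= input.length()) return null; // no more input
        for (int i = 0; i < patterns.length; i++) {
            // Restrict matching to the unread part of the input.
            Matcher m = patterns[i].matcher(input).region(position, input.length());
            // lookingAt() only succeeds if the match starts at the region start.
            // Assumption: first matching pattern wins; empty matches are skipped.
            if (m.lookingAt() && m.end() > position) {
                matchingIndex = i;
                position = m.end();
                return m.group();
            }
        }
        return null; // remaining text cannot be tokenized
    }
}
```

Because nextToken() returns null both at the end of the input and on untokenizable text, a caller can tell the two cases apart by comparing getPosition() with the length of the input.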

Your main(...) function will work as follows. You will repeatedly request a URL by printing "URL: ". Once the user has provided a URL, you will trim it of whitespace, then tokenize it. As you tokenize it, you will print out the tokens one by one, including their token types. If you find a duplicate token type, you will FAIL. You will also FAIL if the tokenizer cannot recognize any further tokens but you still have characters left to tokenize. If you manage to finish tokenizing a URL, you will pass the tokens to the fetch(...) function provided below. Whenever a failure occurs, you will indicate it, then loop again to request another URL.
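The per-URL part of that loop (trim, tokenize, detect duplicates, detect leftover characters) might look something like the sketch below. Everything named here is an assumption for illustration: the class DriverSketch, the method tokenize, and especially the four toy patterns, which merely stand in for the assignment's seven undefined expressions and are not a solution to them.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and patterns are illustrative assumptions.
class DriverSketch {
    // Toy patterns, NOT the assignment's real regular expressions.
    static final String[] TOY_REGEX = { "[a-z]+://", "[a-z.]+", ":[0-9]+", "/[a-z./]*" };
    static final String[] TOY_NAME = { "protocol", "address", "port", "file" };

    /** Returns one slot per token type, or null if tokenizing FAILs. */
    static String[] tokenize(String url) {
        String input = url.trim();
        String[] slots = new String[TOY_REGEX.length];
        int pos = 0;
        while (pos < input.length()) {
            int matched = -1;
            String token = null;
            for (int i = 0; i < TOY_REGEX.length; i++) {
                Matcher m = Pattern.compile(TOY_REGEX[i]).matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt() && m.end() > pos) {
                    matched = i;
                    token = m.group();
                    pos = m.end();
                    break;
                }
            }
            if (matched < 0) {
                // Characters remain, but no expression recognizes them.
                System.out.println("FAIL: cannot tokenize \"" + input.substring(pos) + "\"");
                return null;
            }
            if (slots[matched] != null) {
                // The same token type was already seen earlier in this URL.
                System.out.println("FAIL: duplicate " + TOY_NAME[matched]);
                return null;
            }
            slots[matched] = token;
            System.out.println(TOY_NAME[matched] + ": " + token);
        }
        return slots;
    }
}
```

A full main(...) would wrap this in a loop that prints "URL: ", reads a line, and on success hands the slots to fetch(...). When doing so, note that fetch(...) as provided takes query before fragment, whereas the token constants number FRAGMENT (5) before QUERY (6).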

import java.util.regex.*;
import java.io.*;
import java.util.*;
import java.net.*;

public class URLLexer {
    // These are the 7 tokens in our simplified URL definition
    public static final int PROTOCOL = 0;
    public static final int NUMERICAL_ADDRESS = 1;
    public static final int NON_NUMERICAL_ADDRESS = 2;
    public static final int PORT = 3;
    public static final int FILE = 4;
    public static final int FRAGMENT = 5;
    public static final int QUERY = 6;

    // Here you place regular expressions, one per token. Each is a string.
    public static final String[] REGULAR_EXPRESSION = new String[] {
        "Not Defined Yet",                               // protocol
        "Not Defined Yet. This one will be very long.",  // numerical address
        "Not Defined Yet",                               // non-numerical address
        "Not Defined Yet",                               // port
        "Not Defined Yet",                               // file
        "Not Defined Yet",                               // fragment
        "Not Defined Yet",                               // query
    };

    // This is an array of names for each of the tokens, which might be convenient for you to
    // use to print out stuff.
    public static final String[] NAME = new String[] {
        "protocol", "numerical address", "non-numerical address", "port", "file", "fragment", "query"
    };

    /** Creates a Blank URLLexer set up to do pattern-matching on the given regular expressions. */
    public URLLexer() {
        // IMPLEMENT ME (ABOUT 5 LINES)
    }

    /** Resets the URLLexer to a new string as input. */
    public void reset(String input) {
        // IMPLEMENT ME (ABOUT 3 LINES)
    }

    public int getMatchingIndex() {
        // IMPLEMENT ME (ABOUT 1 LINE)
    }

    public int getPosition() {
        // IMPLEMENT ME (ABOUT 1 LINE)
    }

    public String nextToken() {
        // IMPLEMENT ME (ABOUT 10 LINES)
    }

    public static void main(String[] args) throws IOException {
        // IMPLEMENT ME.
        //
        // You will repeatedly request a URL by printing "URL: ". Once the user has provided
        // a URL, you will trim it of whitespace, then tokenize it. As you tokenize it you
        // will print out the tokens one
        // by one, including their token types. If you find a duplicate token type, you will
        // FAIL. You will also FAIL if the tokenizer cannot recognize any further tokens but
        // you still have characters left to tokenize. If you manage to finish tokenizing
        // a URL, you will pass the tokens to the fetch(...) function provided below. Whenever
        // a failure occurs, you will indicate it, then loop again to request another URL.
    }

    // perhaps this function might come in use.
    // It takes various tokenized values, checks them for validity, then fetches the data
    // from a URL formed by them and prints it to the screen.
    public static void fetch(String protocol, String numericalAddress, String nonNumericalAddress,
                             String port, String file, String query, String fragment) {
        String address = numericalAddress;
        int iport = 80;

        // verify the URL
        if (protocol == null || !protocol.equals("http://")) {
            System.out.println("ERROR. I don't know how to use protocol " + protocol);
        } else if (query != null) {
            System.out.println("ERROR. I'm not smart enough to issue queries, like " + query);
        } else if (numericalAddress == null && nonNumericalAddress == null) {
            System.out.println("ERROR. No address was provided.");
        } else if (numericalAddress != null && nonNumericalAddress != null) {
            System.out.println("ERROR. Both types of addresses were provided.");
        } else {
            if (address == null) {
                address = nonNumericalAddress;
            }
            if (fragment != null) {
                System.out.println("NOTE. Fragment provided: I will not use it.");
            }
            if (port != null) {
                iport = Integer.parseInt(port.substring(1)); // strip off the ":"
            } else {
                System.out.println("NOTE. No port provided, defaulting to port 80.");
            }
            if (file == null) {
                System.out.println("NOTE. No file was provided. Assuming it's just /");
                file = "/";
            }

            System.out.println("Downloading ADDRESS: " + address + " PORT: " + iport + " FILE: " + file);
            System.out.println(" =======================================");

            java.io.InputStream stream = null;
            try {
                java.net.URL url = new java.net.URL("http", address, iport, file);
                java.net.URLConnection connection = url.openConnection();
                connection.connect();
                stream = connection.getInputStream();
                final int BUFLEN = 1024;
                byte[] buffer = new byte[BUFLEN];
                while (true) {
                    int len = stream.read(buffer, 0, BUFLEN);
                    if (len <= 0) break;
                    System.out.write(buffer, 0, len);
                }
            } catch (java.io.IOException e) {
                System.out.println("Error fetching data.");
            }
            try {
                if (stream != null) stream.close();
            } catch (java.io.IOException e) { }
            System.out.println(" =======================================");
        }
    }
}
