Question: For this phase, your program will accept two file names from the command line: the first is the name of the input file and the second is the name of the output file. When submitting, submit just the .java source file(s), any associated documents describing how to deploy and use your program, and an example of input and output that demonstrates your program.
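The two-argument command line described above could be handled along these lines. This is only a minimal sketch: the class name ArgsSketch, the helper parseArgs, and the usage message are illustrative assumptions, not something the assignment prescribes.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name; the assignment does not specify one.
public class ArgsSketch {

    /** Returns {inputPath, outputPath}; throws if the argument count is not exactly two. */
    static Path[] parseArgs(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                    "Usage: java ArgsSketch <inputFile> <outputFile>");
        }
        return new Path[] { Paths.get(args[0]), Paths.get(args[1]) };
    }

    public static void main(String[] args) {
        try {
            Path[] paths = parseArgs(args);
            // Fail early if the list of URLs cannot be read.
            if (!Files.isReadable(paths[0])) {
                System.err.println("Cannot read input file: " + paths[0]);
                System.exit(1);
            }
            System.out.println("Input file:  " + paths[0]);
            System.out.println("Output file: " + paths[1]);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

Under these assumptions the program would be started as `java ArgsSketch urls.txt report.txt`, where both file names are placeholders.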
Now, for the details:
For this phase, your input file will be a list of URLs that you provide. The lines of the input file should be URLs of web resources that end in a mix of .htm (or .html), .jpg (or .jpeg), .txt and .pdf file endings. If you have problems finding appropriate examples, ask me after class; but first, try search engines!
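Since the later processing depends on the file ending, one possible first step is to extract the last portion of each URL and sort it into a category. The sketch below assumes that the file name is the last segment of the URL path and that endings are matched case-insensitively. The names UrlKindSketch, Kind, fileName and classify are hypothetical, the example.com URLs are placeholders, and .gif is included because the processing rules in the assignment also mention it.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper class; names are illustrative only.
public class UrlKindSketch {

    enum Kind { TEXT, IMAGE, PDF, UNKNOWN }

    /** Last segment of the URL path, e.g. "index.html"; empty if the path ends in "/". */
    static String fileName(String url) {
        // URI.create throws IllegalArgumentException for malformed input.
        String path = URI.create(url).getPath();
        if (path == null) {
            return "";
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Maps the file ending to the kind of processing the assignment describes. */
    static Kind classify(String url) {
        String name = fileName(url).toLowerCase(Locale.ROOT);
        if (name.endsWith(".html") || name.endsWith(".htm") || name.endsWith(".txt")) {
            return Kind.TEXT;
        }
        if (name.endsWith(".jpg") || name.endsWith(".jpeg") || name.endsWith(".gif")) {
            return Kind.IMAGE;
        }
        if (name.endsWith(".pdf")) {
            return Kind.PDF;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        // Placeholder URLs for illustration; a real run would read them from the input file.
        String[] samples = {
            "https://example.com/docs/index.html",
            "https://example.com/images/photo.JPG",
            "https://example.com/papers/report.pdf"
        };
        for (String url : samples) {
            System.out.println(fileName(url) + " -> " + classify(url));
        }
    }
}
```

Using the path component rather than the raw string means a query string such as `?download=1` does not interfere with the ending check.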
For each URL you are processing, obtain URLInfo (as we discussed in class; if you find other info that you can obtain, please do tell me about it). Output the name of the URL and all of its info to an output file (whose name will be the second parameter on the command line you pass to java; the first is the name of the input file above). The remaining processing will depend on the file ending that the URL points to (i.e. the file that you will be reading from).
If the URL is a .html, .htm or .txt file, then read all the lines of the file and save them to a separate file on your system with the same name as the URL file name (i.e. the last portion of the actual URL). Then, output the number of lines read in and the name of the file to the output file (again, this is the file referred to by the second parameter on the command line).
If the URL is an image file (.jpeg, .jpg or .gif), then save the image file to your computer with the same name as the URL file name. Then, output the name of the file to the output file.
If the URL is a .pdf, then save the file to your computer with the same name as the URL file name and output the name of the file to the output file. In the case of .pdf you will probably need to copy byte by byte instead of line by line, because parts of .pdf files are stored in binary encodings and not ASCII.
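The processing rules above might be sketched as follows. Several points are assumptions rather than part of the assignment: "URLInfo" is taken here to mean the metadata exposed by java.net.URLConnection (what was covered in class may differ), text is decoded as UTF-8, and the byte copy uses a small buffer instead of literally one byte at a time, which produces the same file. The class and method names (FetchSketch, writeInfo, copyLines, copyBytes) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Date;

// Hypothetical class; a sketch of the per-URL work, not a complete solution.
public class FetchSketch {

    /** Writes basic connection metadata for one URL to the report. */
    static void writeInfo(URLConnection conn, PrintWriter report) {
        report.println("URL: " + conn.getURL());
        report.println("Content type: " + conn.getContentType());
        report.println("Content length: " + conn.getContentLengthLong());
        report.println("Last modified: " + new Date(conn.getLastModified()));
    }

    /** Copies text line by line (assuming UTF-8) and returns the number of lines read. */
    static int copyLines(InputStream in, Path target) throws IOException {
        int count = 0;
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
                count++;
            }
        }
        return count;
    }

    /** Copies raw bytes unchanged (suitable for images and PDFs); returns bytes copied. */
    static long copyBytes(InputStream in, Path target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream src = in; OutputStream out = Files.newOutputStream(target)) {
            int n;
            while ((n = src.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    /** Demo: java FetchSketch <url> <localFileName> prints the info and saves the bytes. */
    public static void main(String[] args) throws Exception {
        URLConnection conn = URI.create(args[0]).toURL().openConnection();
        PrintWriter report = new PrintWriter(System.out, true);
        writeInfo(conn, report);
        long bytes = copyBytes(conn.getInputStream(), Paths.get(args[1]));
        report.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

In a full program, copyLines would be used for .html, .htm and .txt URLs (its return value is the line count to report), and copyBytes for images and PDFs, with the report written to the output file named on the command line rather than to standard output.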
