Question: Hey guys, I need help with these two Java programs. All help would be greatly appreciated. If you could also add comments so I know exactly what's going on, that would help a lot. Thanks!
In Java, it is possible to read a file from the Internet directly. To do that, use:
URL url = new URL("http://www.gutenberg.org/files/11/11-0.txt");
Scanner input = new Scanner(url.openStream());
Fre
Problem 1: Write a method that returns the count for each letter (a to z) in a string. The method should run in O(n) time. Ignore letter case, i.e., A is the same as a and Z is the same as z. Also ignore non-alphabetic characters.
public static void count(String line, int [] counts)
Also write a main method that reads input from a URL/file, counts the number of characters, and at the end displays the frequency of each character (the count for that character / total number of characters). Place both methods in a class FrequencyAnalysis.java.
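A minimal sketch of Problem 1, under a few assumptions that the assignment does not spell out: "total number of characters" is taken to mean the total number of letters counted, the frequency is printed as a percentage (the linked table appears to use percentages), and the Gutenberg URL from the question is used as the input. The class is declared without `public` only so the snippet compiles regardless of file name.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Scanner;

class FrequencyAnalysis {

    // Adds the letter counts of `line` to `counts` (index 0 = 'a', ..., index 25 = 'z').
    // The string is scanned once, so the running time is O(n).
    public static void count(String line, int[] counts) {
        for (int i = 0; i < line.length(); i++) {
            char c = Character.toLowerCase(line.charAt(i));
            // Skip anything that is not a letter a-z after lowercasing.
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // URL taken from the question; a local Alice.txt could be read with new Scanner(new File(...)) instead.
        URL url = new URL("http://www.gutenberg.org/files/11/11-0.txt");
        int[] counts = new int[26];

        // try-with-resources closes the scanner (and the underlying stream) when done.
        try (Scanner input = new Scanner(url.openStream())) {
            while (input.hasNextLine()) {
                count(input.nextLine(), counts);
            }
        }

        // Assumption: the total is the number of letters, not of all characters in the file.
        int total = 0;
        for (int n : counts) {
            total += n;
        }

        for (int i = 0; i < 26; i++) {
            double percent = total == 0 ? 0.0 : 100.0 * counts[i] / total;
            System.out.printf("%c: %.2f%%%n", (char) ('a' + i), percent);
        }
    }
}
```

Because `count` adds to the array it is given instead of resetting it, calling it once per line accumulates the counts for the whole file.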
The fun part: The file Alice.txt that contains Alice in Wonderland is taken from Project Gutenberg. Run your program with this file.
The frequency of the letters in the English language can be found here:
http://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html
The output of your program should be close to the frequencies provided in this link (+/- 0.5 difference is OK).
Counting the frequencies of characters has applications in many areas, such as natural language processing, text analysis, security, etc. For example, it can be used for breaking a simple substitution cipher:
http://en.wikipedia.org/wiki/Substitution_cipher
Frequency analysis:
http://en.wikipedia.org/wiki/Frequency_analysis
Problem 2: In this problem, you will go one step further: instead of counting the frequency of the characters, you will count the frequency of the words in a file. To accomplish this, you will create an ADT Dictionary. First, create an inner class Entry that represents a dictionary entry with the following attributes:
word - the word
number of occurrences - the number of times it occurred
Constructors and other methods are up to you to figure out.
The Dictionary class will have an ArrayList of 26 elements (the number of characters in the alphabet). Each element in the ArrayList will be an ArrayList whose elements are of type Entry. Words that begin with the letter a will be stored in the first ArrayList, words that begin with b will be stored in the second ArrayList, etc. In addition, the class should have an attribute total that counts the number of occurrences of all words. You will need this to calculate the frequency.
Note: Ignore case and special characters. An easy way to do this is, every time you read a line, to do: line = line.replaceAll("[^a-zA-Z]", " "); This replaces every non-alphabetic character with a space.
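The note above could be turned into a small helper along these lines. This is only a sketch: the class name `WordSplitter` and the method `words` are made up for illustration and are not part of the assignment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WordSplitter {

    // Returns the lowercase words of a line, treating every non-letter as a separator.
    static List<String> words(String line) {
        // Replace everything except a-z and A-Z with a space, then lowercase the result.
        String cleaned = line.replaceAll("[^a-zA-Z]", " ").toLowerCase(Locale.ROOT);

        List<String> result = new ArrayList<>();
        // Split on runs of whitespace; an empty or blank line yields one empty token, which is skipped.
        for (String w : cleaned.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                result.add(w);
            }
        }
        return result;
    }
}
```

One side effect worth knowing about: apostrophes are replaced too, so "Alice's" is split into the two words "alice" and "s".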
Methods:
Constructor that creates an empty dictionary
put(String word) - a method that stores a word in the dictionary. If the word already exists it should update its frequency. Update total.
double get(String word) - returns the frequency of the specified word. 0 if the word is not found.
double remove(String word) - removes the word from the dictionary and returns the frequency. 0 if the word is not found. Also update total.
double getAverageLength() - returns the average length of the words in the dictionary.
double getAverageFreq() - returns the average frequency of the words in the dictionary.
String [] getTopWords(int top) - returns an array of the most popular words in the dictionary. The parameter top specifies the number of words to be returned.
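A possible starting point for the Dictionary class is sketched below. It rests on several assumptions the assignment leaves open: "frequency" is interpreted as occurrences / total, words that do not start with a-z are silently ignored, Entry is written as a static nested class, and the helper `find` is an addition for illustration. getAverageLength and getAverageFreq are left out because the assignment does not say whether the average is taken over distinct words or over all occurrences.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class Dictionary {

    // One dictionary entry: a word and the number of times it was stored.
    private static class Entry {
        final String word;
        int occurrences = 1;

        Entry(String word) {
            this.word = word;
        }
    }

    // buckets.get(0) holds words starting with 'a', ..., buckets.get(25) words starting with 'z'.
    private final ArrayList<ArrayList<Entry>> buckets = new ArrayList<>();
    private int total = 0; // number of occurrences of all words together

    // Creates an empty dictionary with 26 empty buckets.
    public Dictionary() {
        for (int i = 0; i < 26; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Hypothetical helper: linear search in the matching bucket.
    // Returns null if the word is absent or does not start with a-z.
    private Entry find(String word) {
        if (word.isEmpty() || word.charAt(0) < 'a' || word.charAt(0) > 'z') {
            return null;
        }
        for (Entry e : buckets.get(word.charAt(0) - 'a')) {
            if (e.word.equals(word)) {
                return e;
            }
        }
        return null;
    }

    // Stores a word, or increments its count if it is already present.
    public void put(String word) {
        word = word.toLowerCase(Locale.ROOT);
        if (word.isEmpty() || word.charAt(0) < 'a' || word.charAt(0) > 'z') {
            return; // assumption: ignore anything that does not fit a bucket
        }
        Entry e = find(word);
        if (e != null) {
            e.occurrences++;
        } else {
            buckets.get(word.charAt(0) - 'a').add(new Entry(word));
        }
        total++;
    }

    // Returns occurrences / total for the word, or 0 if it is not found.
    public double get(String word) {
        Entry e = find(word.toLowerCase(Locale.ROOT));
        return e == null ? 0.0 : (double) e.occurrences / total;
    }

    // Removes the word and returns the frequency it had before removal, or 0 if not found.
    public double remove(String word) {
        Entry e = find(word.toLowerCase(Locale.ROOT));
        if (e == null) {
            return 0.0;
        }
        double frequency = (double) e.occurrences / total;
        buckets.get(e.word.charAt(0) - 'a').remove(e);
        total -= e.occurrences;
        return frequency;
    }

    // Returns up to `top` words, most frequent first.
    public String[] getTopWords(int top) {
        List<Entry> all = new ArrayList<>();
        for (ArrayList<Entry> bucket : buckets) {
            all.addAll(bucket);
        }
        all.sort((x, y) -> Integer.compare(y.occurrences, x.occurrences));
        int n = Math.min(Math.max(top, 0), all.size());
        String[] result = new String[n];
        for (int i = 0; i < n; i++) {
            result[i] = all.get(i).word;
        }
        return result;
    }
}
```

Splitting the words into 26 buckets only shortens the linear search inside `find`. It does not change the results, so a single list would return the same answers, just more slowly on a large text.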
Also, write a main method that reads from the URL/file Alice.txt and stores the words in an object of the class Dictionary. Print the top 100 words. Compare the results to the top 100 most commonly used words in the English language:
https://en.wikipedia.org/wiki/Most_common_words_in_English
Alice.txt https://drive.google.com/file/d/0ByG7tKHaTjMWeU9XcUJ2V1pPa1k/view
