Question: When you are given a text file, parse, tokenize, and further split the tokens into letter groups of a specified size (Java)

When you are given a text file, parse it, tokenize it, and further split the tokens into letter groups of a specified size.

JAVA

Example:

Input: Second Programming Assignment

Tokens: second programming assignment

Letter Groups (2): "se" "ec" "co" "on" "nd" "pr" "ro" "og" "gr" "ra" "am" "mm" "mi" "in" "ng" "as" "ss" "si" "ig" "gn" "nm" "me" "en" "nt"

Letter Groups (3): "sec" "eco" "con" "ond" "pro" "rog" "ogr" "gra" "ram" "amm" "mmi" "min" "ing" "ass" "ssi" "sig" "ign" "gnm" "nme" "men" "ent"
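The example above can be reproduced with a sliding window over each token. The following is only a minimal sketch, not the assignment's SentenceUtils: the class name LetterGroupSketch and the method toLetterGroups are hypothetical, the methods are static and non-private for easy experimentation, and it assumes that tokens are lowercase runs of the letters a-z and that tokens shorter than the group length produce no groups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not the assignment's SentenceUtils.
class LetterGroupSketch {

    // Assumption: a token is a lowercase run of letters a-z;
    // any other character (spaces, digits, punctuation) separates tokens.
    static String[] getTokens(String line) {
        String cleaned = line.toLowerCase(Locale.ROOT)
                             .replaceAll("[^a-z]+", " ")
                             .trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split(" ");
    }

    // Slides a window of letterGroupLen characters across each token.
    // Assumption: tokens shorter than letterGroupLen contribute no groups.
    static List<String> toLetterGroups(String[] tokens, int letterGroupLen) {
        List<String> groups = new ArrayList<>();
        for (String token : tokens) {
            for (int i = 0; i + letterGroupLen <= token.length(); i++) {
                groups.add(token.substring(i, i + letterGroupLen));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] tokens = getTokens("Second Programming Assignment");
        System.out.println(toLetterGroups(tokens, 2));
        System.out.println(toLetterGroups(tokens, 3));
    }
}
```

For the input "Second Programming Assignment" this yields the same 24 two-letter groups and 21 three-letter groups, in the same order, as the example. Groups never span two tokens because the window restarts at each token.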

After generating the letter groups, generate the histogram (frequency of occurrence) of the letter groups.

The name of the text file will be the first command-line argument passed to your main function, and letterGroupLen will be the second.

Parse the input text file.
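One possible way to handle the two arguments and read the file is sketched below. The class name MainArgsSketch and the helper parseLetterGroupLen are hypothetical, and the sketch assumes the file is UTF-8 text and that a group length below 1 should be rejected; the assignment itself does not say how invalid input should be treated.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical entry point for illustration only.
class MainArgsSketch {

    // Parses the second argument. Assumption: values below 1 are invalid.
    // Integer.parseInt throws NumberFormatException for non-numeric input.
    static int parseLetterGroupLen(String arg) {
        int len = Integer.parseInt(arg.trim());
        if (len < 1) {
            throw new IllegalArgumentException(
                "letterGroupLen must be at least 1, got " + len);
        }
        return len;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java MainArgsSketch <textFile> <letterGroupLen>");
            return;
        }
        int letterGroupLen = parseLetterGroupLen(args[1]);
        // Files.readAllLines decodes the file as UTF-8 by default.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("Read " + lines.size()
            + " line(s); letterGroupLen = " + letterGroupLen);
    }
}
```

Each line read this way could then be handed to the tokenizing step, one line at a time.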

You will have 2 classes in your design: SentenceUtils and Histogram.

The SentenceUtils class will tokenize the input and partition the tokens into letter groups.

The Histogram class, using a HashMap data structure, will count the number of occurrences of each letter group and print the results when requested.
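The counting step could look roughly like the sketch below. It is an illustration under assumptions, not the required Histogram class: the class name HistogramSketch and the accessor countOf are hypothetical, and the parameter is typed as ArrayList<String> on the assumption that letter groups are stored as strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; not the assignment's Histogram.
class HistogramSketch {

    // Maps each letter group to the number of times it has been seen.
    private final Map<String, Integer> counts = new HashMap<>();

    // Adds 1 to the count of every group in the list.
    // Assumption: the list holds Strings.
    public void generateHistogram(ArrayList<String> letterGroups) {
        for (String group : letterGroups) {
            counts.merge(group, 1, Integer::sum);
        }
    }

    // Prints one "group : count" line per distinct group.
    // HashMap iteration order is unspecified, so the line order may vary.
    public void printHistogram() {
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }

    // Hypothetical accessor, added only so the counts can be checked.
    int countOf(String group) {
        return counts.getOrDefault(group, 0);
    }

    public static void main(String[] args) {
        HistogramSketch histogram = new HistogramSketch();
        histogram.generateHistogram(
            new ArrayList<>(Arrays.asList("se", "ec", "se")));
        histogram.printHistogram();
    }
}
```

Because the counts live in a field, calling generateHistogram once per input line accumulates totals across the whole file. If a sorted printout is wanted, copying the entries into a TreeMap before printing is one option.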

The solution is composed of 2 classes: a SentenceUtils class to convert a string into letter groups and a Histogram class to do the histogram processing. These 2 functionalities are independent of each other.

You need to implement the bodies of the following 4 functions, as well as the main function:

private String[] getTokens(String line)

private void splitTokenstoLetterGroups(String[] tokens)

public void generateHistogram(ArrayList letterGroups)

public void printHistogram()

Please test your program using the input files input1.txt and test.txt. We are going to test using some other files also.

SentenceUtils.java
