Question: Java example program, assuming there is one txt file that contains negative words and another that contains positive words, as well as a test data file that contains people's tweets.
I am new to Java, so an explanation would be nice, please.

Goal: The goal of this assignment is to help students familiarize themselves with the following Java programming concepts:
1. Input and output to and from the terminal
2. Storing data in a file and reading data from a file
3. Creating object-oriented classes and methods to handle data
4. Using data structures to store data in main memory (e.g. HashSet)
5. Working with character strings
6. Using Javadoc comments and generating HTML documentation of the program

Description: For this assignment you will create a program to classify a set of tweets as positive, negative, or neutral based on their sentiment. This process is known as Sentiment Analysis. More information about sentiment analysis can be found on Wikipedia and other sources. Although complex algorithms have been developed for sentiment analysis, in this assignment we will classify a tweet as positive, negative, or neutral just by counting the number of positive and negative words that appear in that tweet. These positive and negative words will be given to the program as input in two separate files, namely positive-words.txt and negative-words.txt. The set of tweets to be classified will also be given as input to the program, in a CSV (comma-separated values) file, testdata.manual.2009.06.14.csv.

The file contains 498 tweets extracted using the Twitter API. The tweets have been annotated (0 = negative, 2 = neutral, 4 = positive) and they can be used to detect sentiment. It contains the following 6 fields:
1. target: the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)
2. ids: the id of the tweet (2087)
3. date: the date of the tweet (Sat May 16 23:58:44 UTC 2009)
4. flag: the query (lyx). If there is no query, then this value is NO_QUERY
5. user: the user that tweeted (robotickilldozr)
6. text: the text of the tweet
We only care about fields 1 and 6 in this file.

Your program should operate in the following manner:
1. When the program starts, it asks the user to provide the file paths of the positive words file, the negative words file, and the Twitter data file.
2. The program loads the positive words and negative words and stores them in two separate lookup tables. The HashSet data structure can be used as a lookup table in Java, as it provides a fast way to check whether a word exists in it or not.
3. The program iterates over the tweets in the Twitter data file and counts the number of positive and negative words that each tweet contains. If the tweet contains more positive than negative words, it is classified as positive, and vice versa. If no positive or negative words were found in the tweet, it is classified as neutral. If the same number of positive and negative words were found in the tweet, it counts as negative.
4. After each tweet has been classified, the program prints to the command line the tweet itself, its real label, and its predicted label.
5. At the end, the program should also print how many tweets were correctly classified and how many were misclassified.

Hint: Java provides the method split(), which allows us to split a String into multiple tokens by specifying a separator character:
    String animals = "dog, cat, bear, elephant, giraffe";
    String[] animalsArray = animals.split(",");

For each line in the Twitter data file:
Step 1: Split by , into a String[] array, e.g. categories
Step 2: Remove the double quotes (") from categories[0]. Extra: if categories[6] or later elements exist, merge them all with categories[5]
Step 3: Remove punctuation marks from the categories[5] string using the regex \p{Punct} (https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html)
Step 4: Split categories[5] into a String[] words array using white space as the separator
Step 5: Check whether each element of words appears in the HashSet of positive words and the HashSet of negative words (see the contains method) and update the respective counters if true
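Since the question asks for an example with some explanation, here is a minimal sketch of Steps 1 to 5 and the classification rules, not a complete assignment solution. The class name TweetSentimentSketch and the helpers loadWords, parseLine, and classify are hypothetical names chosen for illustration. The sketch also makes a few assumptions that the assignment does not state: words are compared in lower case, lines starting with ';' in the word files are treated as comments, and the quotes are stripped from every field rather than only the first. The three CSV lines in main are made-up samples, not rows from the real data file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Sketch of the word-counting tweet classifier described in the assignment. */
class TweetSentimentSketch {

    /** Loads one word per line into a HashSet, which gives fast contains() lookups. */
    static Set<String> loadWords(Path file) throws IOException {
        Set<String> words = new HashSet<>();
        for (String line : Files.readAllLines(file)) {
            String word = line.trim().toLowerCase();
            // Assumption: blank lines and lines starting with ';' are not words.
            if (!word.isEmpty() && !word.startsWith(";")) {
                words.add(word);
            }
        }
        return words;
    }

    /** Steps 1-2: split a CSV line into 6 fields, strip quotes, re-join a text that had commas. */
    static String[] parseLine(String line) {
        String[] categories = line.split(",");
        if (categories.length < 6) {
            throw new IllegalArgumentException("Expected at least 6 fields: " + line);
        }
        String[] fields = new String[6];
        for (int i = 0; i < 5; i++) {
            fields[i] = categories[i].replace("\"", "");
        }
        // The tweet text may itself contain commas, so merge categories[5] and everything after it.
        String text = String.join(",", Arrays.copyOfRange(categories, 5, categories.length));
        fields[5] = text.replace("\"", "");
        return fields;
    }

    /** Steps 3-5: returns 4 (positive), 0 (negative) or 2 (neutral), the labels used in the CSV. */
    static int classify(String text, Set<String> positive, Set<String> negative) {
        // Step 3: remove punctuation. Lower-casing is an extra assumption of this sketch.
        String cleaned = text.replaceAll("\\p{Punct}", "").toLowerCase();
        int positiveCount = 0;
        int negativeCount = 0;
        // Step 4: split on white space. Step 5: look each word up in both sets.
        for (String word : cleaned.trim().split("\\s+")) {
            if (positive.contains(word)) positiveCount++;
            if (negative.contains(word)) negativeCount++;
        }
        if (positiveCount == 0 && negativeCount == 0) {
            return 2; // no sentiment words found: neutral
        }
        // A tie with at least one match counts as negative, as the assignment states.
        return positiveCount > negativeCount ? 4 : 0;
    }

    public static void main(String[] args) {
        // Tiny in-memory word lists; the real program would call loadWords on the user's paths.
        Set<String> positive = new HashSet<>(Arrays.asList("love", "great"));
        Set<String> negative = new HashSet<>(Arrays.asList("hate", "awful"));
        // Made-up sample lines in the 6-field layout described above.
        String[] lines = {
            "\"4\",\"1\",\"date\",\"NO_QUERY\",\"user1\",\"I love this, it is great!\"",
            "\"0\",\"2\",\"date\",\"NO_QUERY\",\"user2\",\"I hate the awful queue but love the phone\"",
            "\"2\",\"3\",\"date\",\"NO_QUERY\",\"user3\",\"Heading to the station now\""
        };
        int correct = 0;
        for (String line : lines) {
            String[] fields = parseLine(line);
            int real = Integer.parseInt(fields[0]);
            int predicted = classify(fields[5], positive, negative);
            if (real == predicted) correct++;
            System.out.println(fields[5] + " | real=" + real + " | predicted=" + predicted);
        }
        System.out.println("Correct: " + correct + ", misclassified: " + (lines.length - correct));
    }
}
```

To turn this into the full program, one option is to read the three file paths from the terminal with a java.util.Scanner on System.in, build the two sets with loadWords, and loop over Files.readAllLines of the tweet file instead of the hard-coded array. Keeping parseLine and classify as separate methods makes each step easy to test on its own and gives natural places for the Javadoc comments the assignment asks for.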
