Question: Sentiment Analysis using JAVA with given .csv and .txt files

Given DATA.zip files:

https://drive.google.com/file/d/1Mx2-rd-4EsE8S7Bhtwa5iqr9jEguzSAj/view?usp=sharing

Project Title: Sentiment Analysis (stage 1)

Goal

The goal of this assignment is to help students familiarize themselves with the following in Java:

1. Input/output to and from the terminal.
2. Storing data in a file and reading data from a file.
3. Creating object-oriented classes and methods to handle data.
4. Using data structures to store data in main memory (e.g. HashSet).
5. Working with character strings.
6. Using Javadoc comments and generating HTML documentation of the program.

Description

For this assignment you will create a program to classify a set of tweets as positive, negative, or neutral based on their sentiment. This process is known as sentiment analysis. More information about sentiment analysis can be found on Wikipedia and other sources.

Although complex algorithms have been developed for sentiment analysis, in this assignment we will classify a tweet as positive, negative, or neutral just by counting the number of positive and negative words that appear in that tweet. These positive and negative words will be given to the program as input in two separate files.

The set of tweets to be classified will also be given as input to the program in a CSV (comma-separated values) file: testdata.manual.2009.06.14.csv

The file contains 498 tweets extracted using the Twitter API. The tweets have been annotated (0 = negative, 2 = neutral, 4 = positive) and they can be used to detect sentiment. The file contains the following 6 fields:

1. target: the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)
2. ids: the id of the tweet (2087)
3. date: the date of the tweet (Sat May 16 23:58:44 UTC 2009)
4. flag: the query (lyx). If there is no query, then this value is NO_QUERY.
5. user: the user that tweeted (robotickilldozr)
6. text: the text of the tweet

We only care about fields 1 and 6 in this file.

Your program should operate in the following manner:

1. When the program starts, it asks the user to provide the file paths of the positive words file, the negative words file, and the Twitter data file.
2. The program loads the positive words and negative words and stores them in two separate lookup tables. The HashSet data structure can be used as a lookup table in Java, as it provides a fast way to check whether a word exists in it.
3. The program iterates over the tweets in the Twitter data file and counts the number of positive and negative words that each tweet contains. If the tweet contains more positive than negative words, it is classified as positive, and vice versa. If no positive or negative words are found in the tweet, it is classified as neutral. If the same number (at least one each) of positive and negative words is found in the tweet, it counts as negative.
4. After each tweet has been classified, the program prints to the command line the tweet itself, its real label, and its predicted label.
5. At the end, the program should also print how many tweets were correctly classified and how many were misclassified.
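The counting rule in steps 2 to 5 can be sketched in Java as follows. This is a minimal sketch, not the assignment's reference solution. The class and method names (TweetSentimentSketch, loadWords, labelAndText, classify) are hypothetical. It assumes that each word file holds one word per line, that the first five CSV fields contain no commas, and that splitting the lowercased text on non-letter characters is an acceptable way to find words. The sample line in main is made up for illustration; a full program would read the three file paths from the terminal and loop over the lines of the CSV file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper class; every name here is illustrative, not from the assignment. */
class TweetSentimentSketch {
    static final int NEGATIVE = 0, NEUTRAL = 2, POSITIVE = 4;

    /** Loads a word list into a HashSet. Assumption: one word per line. */
    static Set<String> loadWords(Path file) throws IOException {
        Set<String> words = new HashSet<>();
        for (String line : Files.readAllLines(file)) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    /**
     * Extracts field 1 (label) and field 6 (text) from one CSV line.
     * Assumption: the first five fields contain no commas, so a split
     * limited to 6 parts keeps any commas inside the tweet text.
     */
    static String[] labelAndText(String csvLine) {
        String[] fields = csvLine.split(",", 6);
        if (fields.length < 6) {
            throw new IllegalArgumentException("Expected 6 fields: " + csvLine);
        }
        return new String[] { unquote(fields[0]), unquote(fields[5]) };
    }

    /** Removes one pair of surrounding double quotes, if present. */
    static String unquote(String s) {
        s = s.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    /** Applies the counting rule: no hits is neutral, a tie is negative. */
    static int classify(String text, Set<String> positive, Set<String> negative) {
        int pos = 0;
        int neg = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (positive.contains(token)) pos++;
            if (negative.contains(token)) neg++;
        }
        if (pos == 0 && neg == 0) {
            return NEUTRAL;
        }
        return pos > neg ? POSITIVE : NEGATIVE;
    }

    public static void main(String[] args) {
        // Tiny in-memory word lists stand in for the two word files.
        Set<String> positive = new HashSet<>(Set.of("love", "great"));
        Set<String> negative = new HashSet<>(Set.of("hate", "awful"));

        // Made-up line in the six-field layout described above.
        String line = "\"4\",\"1\",\"some date\",\"NO_QUERY\",\"someuser\","
                + "\"I love it, great screen\"";

        int correct = 0;
        int wrong = 0;
        String[] parsed = labelAndText(line);
        int real = Integer.parseInt(parsed[0]);
        int predicted = classify(parsed[1], positive, negative);
        System.out.println(parsed[1] + " | real=" + real + " | predicted=" + predicted);
        if (real == predicted) correct++; else wrong++;

        System.out.println("Correct: " + correct + ", misclassified: " + wrong);
    }
}
```

Checking the "no matches" case before comparing the counts keeps the neutral rule from being swallowed by the tie rule, since zero positive and zero negative words is also a tie. Word lists that contain hyphenated or apostrophized entries would need a different tokenizer than the letter-only split assumed here.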
