Question:

In this assignment you will perform a rudimentary sentiment analysis on movie reviews using Spark. The data contains 2000 files with movie reviews and a set of positive and negative words. You need to assign a sentiment, either positive or negative, to each movie review (that is, to each file in the movie review dataset) based on the frequency of the positive and negative words.

Example:
Positive words: happy, joy, excited, elated
Negative words: sad, bad, unhappy, pity
Sample Review: I was excited to see this movie. Although the movie had a sad story but had a happy ending
Positive score: 2 (excited, happy)
Negative score: 1 (sad)
Final sentiment: positive

Submit a py/ipynb file with a final output showing the final sentiment of each file. The output should be written to HDFS.
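The scoring rule in the example can be sketched in PySpark roughly as follows. This is a minimal sketch, not a reference solution, and it rests on several assumptions the question does not state: the word lists are plain text files with one word per line, words are matched case-insensitively as whole tokens (so "unhappy" does not count as "happy"), every occurrence of a word counts toward the score, and a tie is labelled negative. The names `tokenize`, `classify`, and `run_job`, and all path parameters, are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text, positive_words, negative_words):
    """Return (positive_score, negative_score, sentiment) for one review."""
    tokens = tokenize(text)
    pos = sum(1 for t in tokens if t in positive_words)
    neg = sum(1 for t in tokens if t in negative_words)
    # Assumption: ties are labelled "negative"; the assignment does not say.
    sentiment = "positive" if pos > neg else "negative"
    return pos, neg, sentiment


def run_job(reviews_path, positive_path, negative_path, output_path):
    """Score every review file under reviews_path and write the results.

    All four arguments are hypothetical paths (for example hdfs:// URIs).
    """
    # Imported here so the pure-Python helpers above work without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ReviewSentiment").getOrCreate()
    sc = spark.sparkContext

    def load_words(path):
        # Assumption: one word per line; blank lines are skipped.
        lines = sc.textFile(path).map(lambda line: line.strip().lower())
        return set(lines.filter(lambda line: line).collect())

    # The word lists are small, so broadcast them to every executor.
    pos_bc = sc.broadcast(load_words(positive_path))
    neg_bc = sc.broadcast(load_words(negative_path))

    # wholeTextFiles yields one (file_path, file_content) pair per file,
    # which matches the "one sentiment per file" requirement.
    results = (
        sc.wholeTextFiles(reviews_path)
        .map(lambda kv: (kv[0], classify(kv[1], pos_bc.value, neg_bc.value)))
        .map(lambda kv: "{}\t{}".format(kv[0], kv[1][2]))
    )

    # saveAsTextFile fails if output_path already exists.
    results.saveAsTextFile(output_path)
    spark.stop()
```

Calling `run_job` with the HDFS locations of the review directory, the two word lists, and a not-yet-existing output directory would write one tab-separated line per file (file path, then sentiment). The pure function `classify` can be checked locally against the sample review before running the job on the cluster.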
