Question:
Natural Language Processing - Programming Assignment
Dataset:
- Choose one category from http://jmcauley.ucsd.edu/data/amazon/ - Amazon product review data.
- Choose at least 25,000 reviews (if the category has more than 25k reviews).
- Labeling rule for the dataset:
  - [overall > 3.0] - positive
  - [overall <= 3.0] - negative
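
The labeling rule above could be applied while reading the data, roughly as in the sketch below. It assumes the chosen category file is gzipped, holds one strict-JSON review per line, and exposes `overall` and `reviewText` fields; the helper names (`label_from_overall`, `iter_labeled_reviews`, `load_reviews`) and the `limit` parameter are illustrative and not part of the assignment.

```python
import gzip
import json


def label_from_overall(overall):
    """Apply the assignment's rule: overall > 3.0 is positive, otherwise negative."""
    return "positive" if float(overall) > 3.0 else "negative"


def iter_labeled_reviews(lines, limit=25_000):
    """Yield (text, label) pairs from JSON-lines input.

    Stops after `limit` usable reviews. Records with no review text or
    no `overall` rating are skipped (an assumption, not an assignment rule).
    """
    count = 0
    for line in lines:
        if count >= limit:
            break
        record = json.loads(line)
        text = (record.get("reviewText") or "").strip()
        if not text or "overall" not in record:
            continue
        yield text, label_from_overall(record["overall"])
        count += 1


def load_reviews(path, limit=25_000):
    """Read a gzipped JSON-lines review file into a list of (text, label) pairs."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return list(iter_labeled_reviews(handle, limit))
```

Since the assignment asks for at least 25,000 reviews, `limit` can be raised as needed. Because ratings of 4 and 5 are typically more common than lower ones in review data, it may be worth checking the class balance before training.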
Module - 2 (Sentiment Analysis using statistical NLP):
Tasks:
1. Use the following vector space models:
   a. CountVectorizer.
   b. TF-IDF.
   c. Any external vectorizer (cite the original paper).
2. Perform sentiment analysis with each of (a, b, c) using classical ML techniques:
   - Naive Bayes model.
   - Decision tree.
   - Logistic regression.
3. Report metrics [accuracy, F1 score, confusion matrix] for all the combinations in (1 and 2).
4. Analyse the results. [Report clearly which vector space model gives better results for each model used.]
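
One possible way to organise the tasks above is a grid over vectorizers and classifiers, sketched here with scikit-learn. This is a minimal outline under stated assumptions, not a complete solution: the 80/20 stratified split, the default hyperparameters, `max_iter=1000`, and all function names are choices made for illustration. The external vectorizer is deliberately left out, since the choice (and the paper to cite) is up to the student; it can be added as a third entry in `make_vectorizers`.

```python
from sklearn.base import clone
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier


def make_vectorizers():
    # Add the chosen external vectorizer here as a third entry.
    return {"count": CountVectorizer(), "tfidf": TfidfVectorizer()}


def make_classifiers():
    return {
        "naive_bayes": MultinomialNB(),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }


def evaluate_all(texts, labels, test_size=0.2, seed=0):
    """Fit every vectorizer/classifier pair and collect the requested metrics."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    results = {}
    for vec_name, vectorizer in make_vectorizers().items():
        for clf_name, classifier in make_classifiers().items():
            # clone() gives each pipeline its own unfitted estimators.
            model = make_pipeline(clone(vectorizer), clone(classifier))
            model.fit(x_train, y_train)
            predicted = model.predict(x_test)
            results[(vec_name, clf_name)] = {
                "accuracy": accuracy_score(y_test, predicted),
                "f1": f1_score(y_test, predicted, pos_label="positive"),
                # Rows are true labels, columns are predicted labels.
                "confusion_matrix": confusion_matrix(
                    y_test, predicted, labels=["negative", "positive"]
                ),
            }
    return results


def best_vectorizer_per_classifier(results, metric="f1"):
    """For each classifier, name the vectorizer with the highest `metric`."""
    best = {}
    for (vec_name, clf_name), scores in results.items():
        current = best.get(clf_name)
        if current is None or scores[metric] > results[(current, clf_name)][metric]:
            best[clf_name] = vec_name
    return best
```

The `texts` and `labels` arguments are expected to be parallel lists of review strings and `"positive"`/`"negative"` labels. Note that `MultinomialNB` rejects negative feature values, so a dense embedding-based vectorizer would likely need a different Naive Bayes variant (for example `GaussianNB`) or rescaled features.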
