Question: Problem 2: Word embedding as features for classification

Task

Implement a sentiment classifier based on Twitter data to analyse the sentiments of COVID-19 tweets.

Train and test multiple classification models using the necessary libraries, with sentence embeddings of the tweets as features.

Report the accuracy and F1 score (micro- and macro-averaged) for each classifier and discuss the differences.
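For reference, the requested metrics could be computed with scikit-learn roughly as in the sketch below. The labels are made-up toy values, not taken from the assignment dataset. In a single-label multi-class setting the micro-averaged F1 equals the accuracy, while the macro average weights every class equally, so it drops when a minority class is predicted poorly.

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy labels for illustration only (not from the assignment dataset).
y_true = ["neg", "neg", "neg", "neg", "pos", "pos", "neu", "neu"]
y_pred = ["neg", "neg", "neg", "neg", "neg", "pos", "neg", "neu"]

accuracy = accuracy_score(y_true, y_pred)
# Micro: pool true/false positives over all classes before computing F1.
micro_f1 = f1_score(y_true, y_pred, average="micro")
# Macro: compute F1 per class, then take the unweighted mean.
macro_f1 = f1_score(y_true, y_pred, average="macro")

print(f"accuracy={accuracy:.3f} micro-F1={micro_f1:.3f} macro-F1={macro_f1:.3f}")
```

Here the per-class F1 scores are 0.8 (neg), 2/3 (pos) and 2/3 (neu), so the macro average is about 0.711 while accuracy and micro-F1 are both 0.75.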

Dataset

The dataset has been provided in the first code chunk of the assignment. You are required to use the original tweet text for this classification task.

Tweet representation

After the necessary pre-processing of the tweets, convert the words into their embeddings, then take the mean of all the word vectors in a tweet to obtain a single vector representing each tweet. The tweet vector is then used for sentiment classification.

In the process of finding the embeddings for each word, you can ignore out-of-vocabulary words.
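The representation described above can be sketched as follows. This is only an illustration under stated assumptions: the `embeddings` dictionary is a hypothetical three-word stand-in for a real pre-trained model, the `preprocess` rules (lowercasing, dropping URLs and @mentions) are one possible choice, and returning a zero vector for a tweet with no in-vocabulary words is an assumption the task does not specify.

```python
import re
import numpy as np

# Hypothetical toy embeddings; a real solution would load pre-trained vectors.
embeddings = {
    "covid": np.array([1.0, 0.0, 2.0]),
    "vaccine": np.array([0.0, 1.0, 1.0]),
    "bad": np.array([-1.0, 0.0, 0.0]),
}
DIM = 3

def preprocess(text):
    """Lowercase, drop URLs and @mentions, and split into alphanumeric tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z0-9]+", text)

def tweet_vector(text, embeddings, dim):
    """Mean of the word vectors in a tweet, ignoring out-of-vocabulary words."""
    vectors = [embeddings[tok] for tok in preprocess(text) if tok in embeddings]
    if not vectors:
        # Assumption: tweets with no known words map to the zero vector.
        return np.zeros(dim)
    return np.mean(vectors, axis=0)

vec = tweet_vector("COVID is BAD @who https://t.co/x", embeddings, DIM)
print(vec)
```

In this toy case "is" is out of vocabulary and is skipped, so the tweet vector is the mean of the vectors for "covid" and "bad".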

Classifier choice

You are required to implement the following TWO classifiers:

One traditional classification model (not a neural-network-based model)

One classifier based on any neural network model.

You can use PyTorch/TensorFlow/scikit-learn to implement your classifier. Alternatively, you are free to develop a classifier from scratch.
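One possible way to set up the two required classifiers with scikit-learn is sketched below. Logistic regression and `MLPClassifier` are example choices, not ones mandated by the task, and the feature matrix is synthetic data from `make_classification` standing in for the real tweet vectors and sentiment labels.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the tweet vectors (X) and sentiment labels (y).
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    # Traditional (non-neural) model.
    "logistic_regression": LogisticRegression(max_iter=1000),
    # Small feed-forward neural network.
    "mlp": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "micro_f1": f1_score(y_test, y_pred, average="micro"),
        "macro_f1": f1_score(y_test, y_pred, average="macro"),
    }
    print(name, results[name])
```

Evaluating both models on the same held-out split keeps the accuracy and F1 comparison fair; with real data, `X` would be the matrix of tweet vectors and `y` the sentiment labels from the dataset.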

Your answer must include the following:

Code for data loading, data pre-processing, training, and testing of the models.

A discussion comparing the classifiers based on their accuracy and F1 score.

