Question:

Use Jupyter Notebook/Lab for all code, please.

1. [Points 20] Answer the following questions, given the training and test examples below.
Training Examples:
I love to watch movies
He loves to watch football
They love watching movies
He plays football every Sunday
Test Example:
I love watching football
Text normalization: apply case lowering and remove punctuation characters, if any.
a) [Points 10] Show all probability calculations for both the unigram and bigram models, with detailed computations. Apply the add-one smoothing technique.
b) [Points 5] Calculate the perplexity of both models on the test sentence.
c) [Points 5] Comment on the difference in perplexity between the unigram and bigram models and explain why one might be lower than the other.
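The smoothed probabilities and perplexities for parts a) and b) can be checked with a short notebook sketch like the one below. It assumes bigrams are counted within each sentence and without <s>/</s> boundary symbols, a choice the question leaves open; variable names are illustrative only.

```python
import math
from collections import Counter

train_sents = ["i love to watch movies",
               "he loves to watch football",
               "they love watching movies",
               "he plays football every sunday"]
test_sent = "i love watching football"

train_tokens = [w for s in train_sents for w in s.split()]
N = len(train_tokens)                       # total training tokens
V = len(set(train_tokens))                  # vocabulary size, used by add-one smoothing

uni = Counter(train_tokens)
bi = Counter(p for s in train_sents for p in zip(s.split(), s.split()[1:]))

def p_uni(w):                               # add-one smoothed unigram probability
    return (uni[w] + 1) / (N + V)

def p_bi(w1, w2):                           # add-one smoothed bigram probability
    return (bi[(w1, w2)] + 1) / (uni[w1] + V)

test = test_sent.split()
uni_probs = [p_uni(w) for w in test]
bi_probs = [p_bi(w1, w2) for w1, w2 in zip(test, test[1:])]

# Perplexity: inverse probability of the test sequence, normalized by its length
pp_uni = math.prod(uni_probs) ** (-1 / len(uni_probs))
pp_bi = math.prod(bi_probs) ** (-1 / len(bi_probs))
print("unigram perplexity:", round(pp_uni, 3), "| bigram perplexity:", round(pp_bi, 3))
```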
2. [Points 40] Given the training documents below, classify each test text as either "Positive" or "Negative" sentiment by answering the following questions.
Training Text:
D1: "I love this movie" (Positive)
D2: "This movie is great" (Positive)
D3: "I hate this movie" (Negative)
D4: "This movie is terrible" (Negative)
Testing Text:
D5: "I love this great movie"
D6: "I hate this terrible movie"
a. [Points 2] Tokenize the documents by splitting them into words. Apply case lowering and remove punctuation (if any). Create the vocabulary from the training documents.
b. [Points 3] Compute the prior class probability P(C), where C is the class label.
c. [Points 15] Compute the likelihood P(W|C) for each training word using the add-one smoothing approach. Show each calculation in detail.
d. [Points 10] Compute the class probability of each test document (use the log10 scale to avoid underflow issues). Compare the scores and decide the class label based on your computation.
e. [Points 10] Write a short program that implements steps (a) through (d) and shows the classification results for the given test set, printing the computation for each step in detail.
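As a reference point for part e., the following is a minimal sketch of the whole (a)-(d) pipeline with add-one smoothing and log10 scores; the variable names and the print format are illustrative, not prescribed by the question.

```python
import math
from collections import Counter

train = [("i love this movie", "Positive"),
         ("this movie is great", "Positive"),
         ("i hate this movie", "Negative"),
         ("this movie is terrible", "Negative")]
test = {"D5": "i love this great movie",
        "D6": "i hate this terrible movie"}

# (a) tokenize (the documents are already lowercase and punctuation-free) and build the vocabulary
vocab = sorted({w for text, _ in train for w in text.split()})
V = len(vocab)

# (b) prior class probabilities P(C)
labels = [c for _, c in train]
priors = {c: labels.count(c) / len(labels) for c in set(labels)}

# (c) add-one smoothed likelihoods P(w|C)
counts = {c: Counter(w for text, cls in train if cls == c for w in text.split())
          for c in priors}
totals = {c: sum(counts[c].values()) for c in priors}

def likelihood(w, c):
    return (counts[c][w] + 1) / (totals[c] + V)

# (d) log10 posterior scores and the predicted label for each test document
for name, text in test.items():
    scores = {c: math.log10(priors[c]) +
                 sum(math.log10(likelihood(w, c)) for w in text.split())
              for c in priors}
    print(name, {c: round(s, 4) for c, s in scores.items()}, "->", max(scores, key=scores.get))
```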
3. [Points 10] Given the following text documents, answer the questions below.
Document 1: "I enjoy watching movies on weekends."
Document 2: "The weather today is sunny and pleasant."
Document 3: "He plays football every Sunday with his friends."
a. [Points 5] Provide the tokenized version of the text for each document. Apply the text normalization steps: convert all words to lowercase and remove punctuation if necessary. Show each step in detail. What is the vocabulary size (unique words) for each document?
b. [Points 5] Generate all context-target word pairs for each document using window size W=2. Explain how the window size W impacts the number of context-target pairs generated.
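The normalization and pair generation can be sanity-checked with a short sketch like the one below; the (context, target) ordering and the handling of window edges are assumptions, since the question does not pin them down.

```python
import string

docs = {"Document 1": "I enjoy watching movies on weekends.",
        "Document 2": "The weather today is sunny and pleasant.",
        "Document 3": "He plays football every Sunday with his friends."}
W = 2  # window size

for name, text in docs.items():
    # (a) normalize: lowercase, strip punctuation, split on whitespace
    tokens = text.lower().translate(str.maketrans("", "", string.punctuation)).split()
    print(name, "tokens:", tokens, "| vocabulary size:", len(set(tokens)))

    # (b) pair each target word with every word at most W positions to its left or right
    pairs = [(tokens[j], tokens[i])            # (context, target)
             for i in range(len(tokens))
             for j in range(max(0, i - W), min(len(tokens), i + W + 1))
             if j != i]
    print(name, "->", len(pairs), "context-target pairs")
```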
4. [Points 40] Given the initial word embeddings from Question 3:
"watching": [0.3,0.1,0.2][0.3,0.1,0.2]
"movies": [0.2,0.4,0.6][0.2,0.4,0.6]
"sunny": [0.7,0.1,0.4][0.7,0.1,0.4]
"football": [0.5,0.3,0.2][0.5,0.3,0.2]
"friends": [0.6,0.3,0.1][0.6,0.3,0.1]
Instructions: Where applicable, follow these steps: i) compute the dot product between the context and target vectors; ii) apply gradient descent to update the word vectors (assume a learning rate of 0.01); iii) perform one iteration of the embedding update.
a) [Points 10] Show the word embedding updates after one iteration for the word "movies" when the context word is "watching." Show detailed computations in each step. Explain how the dot product helps capture word similarity during the training process.
b) [Points 15] Assume we are performing negative sampling for the word pair ("movies", "watching"). Randomly sample three negative words from the vocabulary: "sunny", "football", "friends".
Compute the dot product between "movies" and its negative samples. Show detailed computations. Explain the purpose of negative sampling and how it improves the efficiency of training Word2Vec models.
c) [Points 15] Calculate the cosine similarity between "movies" and "watching" using their updated embeddings from the previous question. Based on the cosine similarity result, explain whether these words are semantically close or not. What threshold would you consider when deciding if two words are similar?
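A minimal notebook sketch of these three parts is shown below. It assumes the standard skip-gram-with-negative-sampling objective, in which a positive pair is pushed toward a sigmoid score of 1, updates only the "movies" vector as part a) asks, and uses the original "movies" vector for the negative-sample dot products in part b), since the question leaves those choices open.

```python
import numpy as np

# Initial embeddings from the question
emb = {"watching": np.array([0.3, 0.1, 0.2]),
       "movies":   np.array([0.2, 0.4, 0.6]),
       "sunny":    np.array([0.7, 0.1, 0.4]),
       "football": np.array([0.5, 0.3, 0.2]),
       "friends":  np.array([0.6, 0.3, 0.1])}
lr = 0.01  # learning rate given in the question

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# (a) one gradient-descent step on the positive pair (target "movies", context "watching")
movies, watching = emb["movies"], emb["watching"]
score = np.dot(movies, watching)                 # raw similarity score
grad = (sigmoid(score) - 1.0) * watching         # d(loss)/d(movies) for a positive (label = 1) pair
movies_updated = movies - lr * grad
print("dot(movies, watching) =", score)
print("updated 'movies' vector =", movies_updated)

# (b) dot products between "movies" and its negative samples
# (the original "movies" vector is used here; the question leaves this choice open)
for neg in ["sunny", "football", "friends"]:
    print(f"dot(movies, {neg}) =", np.dot(movies, emb[neg]))

# (c) cosine similarity between the updated "movies" vector and "watching"
cos = np.dot(movies_updated, watching) / (np.linalg.norm(movies_updated) * np.linalg.norm(watching))
print("cosine(movies, watching) =", round(cos, 4))
```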
