Question:

a) Given a corpus C, the maximum likelihood estimate (MLE) for the bigram "Hello World" is 0.3, and the word "Hello" occurs 580 times in the same corpus. After applying add-one smoothing, the likelihood of "Hello World" is 0.04. What is the vocabulary size of corpus C? [3 marks]

b) What are the challenges in natural language processing? [3 marks]

c) There were 100 documents, and each document contained one word. 30 of these documents contained the word "hello". I asked Bob to separate all the documents containing the word "hello". He showed me 60, but "hello" was not in 40 of them. Construct the confusion matrix and calculate the accuracy. [4 marks]
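The arithmetic in parts (a) and (c) can be checked with a short Python sketch. It is one possible reading of the question, not an official solution, and it rests on these assumptions: the MLE is the conditional bigram probability c(Hello World) / c(Hello); add-one smoothing gives (c(Hello World) + 1) / (c(Hello) + V), where V is the vocabulary size; and a "positive" document is one that contains "hello". The function names are illustrative only. Part (b) is descriptive and is not covered here.

```python
from fractions import Fraction

def vocabulary_size(mle, count_w1, smoothed):
    """Solve (c(w1 w2) + 1) / (c(w1) + V) = smoothed for V (assumed add-one form)."""
    # Exact fractions avoid floating-point noise in 0.3 and 0.04.
    mle, smoothed = Fraction(str(mle)), Fraction(str(smoothed))
    bigram_count = mle * count_w1  # c(w1 w2) = MLE * c(w1)
    return (bigram_count + 1) / smoothed - count_w1

def confusion_matrix(total, actual_pos, predicted_pos, false_pos):
    """Derive the four cells from the counts stated in part (c)."""
    tp = predicted_pos - false_pos       # shown by Bob and really "hello"
    fn = actual_pos - tp                 # "hello" documents Bob missed
    tn = total - tp - false_pos - fn     # correctly left out
    return {"TP": tp, "FP": false_pos, "FN": fn, "TN": tn}

def accuracy(cm):
    """Fraction of documents classified correctly."""
    return (cm["TP"] + cm["TN"]) / sum(cm.values())

v = vocabulary_size(0.3, 580, 0.04)
cm = confusion_matrix(total=100, actual_pos=30, predicted_pos=60, false_pos=40)
print("Vocabulary size:", v)
print("Confusion matrix:", cm)
print("Accuracy:", accuracy(cm))
```

Under these assumptions the bigram count is 0.3 × 580 = 174, so 175 / (580 + V) = 0.04 gives V = 3795. For part (c) the cells come out as TP = 20, FP = 40, FN = 10, TN = 30, so the accuracy is (20 + 30) / 100 = 0.5.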
