Question: Implement a Python function build_unigram_probs(unigrams, unigram_counts, total_count) which takes a list of all of the unique words in the book, a dictionary mapping unique unigrams to their counts, and the total number of words in the book.
To do this, iterate through the indexes of the unigrams list. Look up the count of each index's corresponding unigram in unigram_counts, then divide that count by total_count to get the probability that the word at the same index in unigrams would be chosen at random from the book. Return the list of probabilities.
def test_build_unigram_probs():
    assert(build_unigram_probs(\
        [ "hello", "world", "again"],
        { "hello" : 2, "world" : 2, "again" : 1 }, 5 ) == \
        [ 2/5, 2/5, 1/5 ])
    assert(build_unigram_probs(\
        [ "hello", "and", "welcome", "to", "the", "program", ".", "we're", "happy", "have", "you"],
        { "hello" : 1, "and" : 1, "welcome" : 1, "to" : 2, "the" : 1, "program" : 1, "." : 2,
          "we're" : 1, "happy" : 1, "have" : 1, "you" : 1 }, 13) == \
        [ 1/13, 1/13, 1/13, 2/13, 1/13, 1/13, 2/13, 1/13, 1/13, 1/13, 1/13 ])
    assert(build_unigram_probs(\
        [ "this", "is", "the", "song", "that", "never", "ends", "yes", "it",
          "goes", "on", "and", "my", "friends", "!", "some", "people", "started",
          "singing", ",", "not", "knowing", "what", "was", "now", "they", "keep",
          "forever", "just", "because", "." ],
        { "this" : 1, "is" : 1, "the" : 1, "song" : 1, "that" : 1, "never" : 1,
          "ends" : 1, "yes" : 1, "it" : 4, "goes" : 1, "on" : 3, "and" : 2,
          "my" : 1, "friends" : 1, "!" : 1, "some" : 1, "people" : 1,
          "started" : 1, "singing" : 2, "," : 2, "not" : 1, "knowing" : 1,
          "what" : 1, "was" : 1, "now" : 1, "they" : 1, "keep" : 1,
          "forever" : 1, "just" : 1, "because" : 1, "." : 3 }, 41) == \
        [ 1/41, 1/41, 1/41, 1/41, 1/41, 1/41, 1/41, 1/41, 4/41, 1/41, 3/41, 2/41,
          1/41, 1/41, 1/41, 1/41, 1/41, 1/41, 2/41, 2/41, 1/41, 1/41, 1/41, 1/41,
          1/41, 1/41, 1/41, 1/41, 1/41, 1/41, 3/41 ])
    print("... done!")
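
One possible way to write the function described in the question is sketched below. It follows the stated steps directly: loop over the indexes of unigrams, look up each word's count in unigram_counts, and divide by total_count. It assumes every word in unigrams appears as a key in unigram_counts and that total_count is nonzero; the question does not say how to handle other inputs, so the sketch does not either.

```python
def build_unigram_probs(unigrams, unigram_counts, total_count):
    # Assumption: every word in unigrams is a key of unigram_counts,
    # and total_count is greater than zero.
    probs = []
    for i in range(len(unigrams)):
        word = unigrams[i]                  # unigram at this index
        count = unigram_counts[word]        # how often it occurs in the book
        probs.append(count / total_count)   # probability of picking it at random
    return probs
```

Because the result is built in index order, probs[i] always corresponds to unigrams[i], which is what the assertions in test_build_unigram_probs compare against. For example, with unigrams [ "hello", "world", "again" ], counts 2, 2 and 1, and total_count 5, the function returns [0.4, 0.4, 0.2].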
