Question: from math import inf, log from typing import List, Optional from utils import max_word_length, is_valid, word_prob # Part B: Probabilistic reconstruction def likely_reconstruct(document: str) ->

from math import inf, log from typing import List, Optional from utils import max_word_length, is_valid, word_prob # Part B: Probabilistic reconstruction def likely_reconstruct(document: str) -> Optional[str]: """ Finds the **most likely** reconstruction of a string with no whitespace. :param document: A nonempty string of letters, stripped of all whitespace and punctuation. :return: A string which is the most likely reconstruction of the input, or None if all reconstructions have zero probability. """ if len(document.split()) > 1: raise ValueError('Document must not contain any whitespace.') # todo Given a string [1..n] with no whitespace, your goal is to reconstruct the string by splitting it into valid words separated by spaces (if possible). For example, "it was the best of times" is a reconstruction of "itwasthebestoftimes". Words may also contain punctuation: "I'llhavewhatshe'shaving" can be reconstructed as "I'll have what she's having" . Some strings cannot be reconstructed, such as "qwertyuiop". Install dependencies We need an English dictionary, so install the wordfreq library by running pip install wordfreq (Windows) or pip install wordfreq (Mac/Linux). Part B There is actually a fairly easy way to improve the output of this naive algorithm. Rather than settling for any reconstruction of the string, we can look for the most likely reconstruction. To do this, we assume that each word w is picked independently with probability P/w). The probability of some reconstruction is the product of the probabilities of the individual words. For example, the probability of the sentence "This is good" is equal to P("This") * P("is") * P("good"). We compute the reconstruction with the maximum probability. Implement this strategy in the likely_reconstruct function in hw3.py. Once again, your algorithm should run in O(nk) time. Use the provided word_prob function to compute a word's probability, as shown below. It will ignore whitespace and leading / trailing punctuation. >>> word_prob("the") 8.588843655955589 >>> word_prob ("end.'") a.ee24897788193684461 >>> word_prob("zxcvbnm") a. @ >>> word_prob ("not a ward") Traceback (most recent call last): ValueError: Invalid argument: not a word Words must not contain whitespace Words with zero probability are considered invalid (not in the dictionary). Multiplying probabilities is not a great idea, because it leads to underflow: >>> .Bee ** 60 1.0000000000000048e-300 >>> (.ee281 ** 60 * (.00003 ** 10) >>> from sys import float_info >>> float_info.min 2.2258738585072014e-288 To avoid this, you can use the following trick: >>> from math import log >>>> -log(.ee001 ** 60 * .00003 ** 10) Traceback (most recent call last): File "
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
