Question:

from math import inf, log
from typing import List, Optional

from utils import max_word_length, is_valid, word_prob


# Part B: Probabilistic reconstruction
def likely_reconstruct(document: str) -> Optional[str]:
    """
    Finds the **most likely** reconstruction of a string with no whitespace.

    :param document: A nonempty string of letters, stripped of all whitespace and punctuation.
    :return: A string which is the most likely reconstruction of the input, or None if all reconstructions have zero probability.
    """
    if len(document.split()) > 1:
        raise ValueError('Document must not contain any whitespace.')
    # todo

Given a string of length n with no whitespace, your goal is to reconstruct the string by splitting it into valid words separated by spaces (if possible). For example, "it was the best of times" is a reconstruction of "itwasthebestoftimes". Words may also contain punctuation: "I'llhavewhatshe'shaving" can be reconstructed as "I'll have what she's having". Some strings cannot be reconstructed, such as "qwertyuiop".

Install dependencies

We need an English dictionary, so install the wordfreq library by running pip install wordfreq (Windows) or pip install wordfreq (Mac/Linux).

Part B

There is a fairly easy way to improve the output of the naive algorithm. Rather than settling for any reconstruction of the string, we can look for the most likely reconstruction. To do this, we assume that each word w is picked independently with probability P(w). The probability of a reconstruction is then the product of the probabilities of its individual words. For example, the probability of the sentence "This is good" is equal to P("This") * P("is") * P("good"). We compute the reconstruction with the maximum probability.

Implement this strategy in the likely_reconstruct function in hw3.py. Once again, your algorithm should run in O(nk) time.

Use the provided word_prob function to compute a word's probability, as shown below. It will ignore whitespace and leading / trailing punctuation.

>>> word_prob("the")
0.0588843655955589
>>> word_prob("end.'")
0.00024897788193684461
>>> word_prob("zxcvbnm")
0.0
>>> word_prob("not a word")
Traceback (most recent call last):
  ...
ValueError: Invalid argument: not a word
Words must not contain whitespace

Words with zero probability are considered invalid (not in the dictionary).
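To make the independence assumption concrete, here is a minimal sketch of scoring one candidate reconstruction as a product of per-word probabilities. The table TOY_PROBS and the helper reconstruction_prob are hypothetical names with made-up values, used only for illustration; they are not the wordfreq values and not part of the assignment's utils module.

```python
from math import prod

# Hypothetical toy probabilities, for illustration only (not wordfreq values).
TOY_PROBS = {"this": 0.005, "is": 0.01, "good": 0.001}


def reconstruction_prob(sentence: str) -> float:
    """Return the product of per-word probabilities for a spaced sentence.

    Words missing from the toy table count as probability 0, so any
    reconstruction containing one gets probability 0 overall.
    """
    return prod(TOY_PROBS.get(word.lower(), 0.0) for word in sentence.split())


if __name__ == "__main__":
    print(reconstruction_prob("This is good"))  # P("This") * P("is") * P("good")
    print(reconstruction_prob("This is bad"))   # contains an unknown word
```

A single zero-probability word zeroes out the whole product, which matches the rule that such words are treated as invalid.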
Multiplying probabilities is not a great idea, because it leads to underflow:

>>> .00001 ** 60
1.0000000000000048e-300
>>> (.00001 ** 60) * (.00003 ** 10)
0.0
>>> from sys import float_info
>>> float_info.min
2.2250738585072014e-308

To avoid this, you can use the following trick:

>>> from math import log
>>> -log(.00001 ** 60 * .00003 ** 10)
Traceback (most recent call last):
  File "..."
ValueError: math domain error
>>> -log(.00001 ** 60)
690.7755278982137
>>> -log(.00003 ** 10)
104.14313176302119
>>> -log(.00001 ** 60) + -log(.00003 ** 10)
794.9186596612349

Note that this function increases in value as the probability of an event decreases. The most likely reconstruction is the one which minimizes this cost function.
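One possible way to fill in the todo is a dynamic program over prefixes that minimizes the summed -log cost. The sketch below is not the course's reference solution. It makes several assumptions: word_prob and max_word_length are replaced by toy stand-ins (the real ones live in utils, and their exact signatures are not shown in the question), TOY_PROBS holds made-up values, and k in O(nk) is taken to mean the maximum word length.

```python
from math import inf, log
from typing import Dict, List, Optional

# Hypothetical stand-ins for the helpers imported from utils (assumed behavior).
TOY_PROBS: Dict[str, float] = {
    "it": 0.01, "was": 0.008, "the": 0.05, "best": 0.001,
    "of": 0.03, "times": 0.0005, "time": 0.002, "s": 0.0001,
    "a": 0.02, "i": 0.02, "t": 0.0001,
}


def word_prob(word: str) -> float:
    """Toy replacement: probability of a word, 0.0 if unknown."""
    return TOY_PROBS.get(word.lower(), 0.0)


def max_word_length() -> int:
    """Toy replacement: length of the longest known word (assumed to be k)."""
    return max(len(w) for w in TOY_PROBS)


def likely_reconstruct(document: str) -> Optional[str]:
    if len(document.split()) > 1:
        raise ValueError('Document must not contain any whitespace.')
    n = len(document)
    k = max_word_length()
    # cost[i]: minimum total -log probability over splits of document[:i].
    cost: List[float] = [inf] * (n + 1)
    cost[0] = 0.0
    # back[i]: start index of the last word in the best split of document[:i].
    back: List[int] = [-1] * (n + 1)
    for i in range(1, n + 1):
        # Only the last k characters can form the final word: O(nk) lookups.
        for j in range(max(0, i - k), i):
            if cost[j] == inf:
                continue  # document[:j] has no valid split
            p = word_prob(document[j:i])
            if p <= 0.0:
                continue  # zero-probability words are invalid
            candidate = cost[j] - log(p)
            if candidate < cost[i]:
                cost[i] = candidate
                back[i] = j
    if cost[n] == inf:
        return None  # every reconstruction has zero probability
    # Walk the back-pointers from the end to recover the words.
    words: List[str] = []
    i = n
    while i > 0:
        j = back[i]
        words.append(document[j:i])
        i = j
    return " ".join(reversed(words))


if __name__ == "__main__":
    print(likely_reconstruct("itwasthebestoftimes"))
    print(likely_reconstruct("qwertyuiop"))
```

Summing -log(p) per word never forms the tiny product itself, so the underflow shown earlier does not arise. The double loop makes at most n * k calls to word_prob; the substring slices add some copying cost on top of that.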
