Question: Determining the Decile Thresholds We need a python function determine_decile_thresholds(frequencies) that splits the languages tokens into ten parts of equal frequency. The deciles or 10%-level

Determining the Decile Thresholds

We need a python function determine_decile_thresholds(frequencies) that splits the languages tokens into ten parts of equal frequency. The deciles or 10%-level thresholds are the number of word forms that are needed to cover 10% of a text, 20%, 30%, and so on to 90%. The correct result on the English data should imply that 62 word forms are enough to know 50% of all tokens in an English text, whereas to reach 90% of all tokens (a threshold frequently quoted as the threshold to pleasant reading in the literature), passive knowledge of 2,617 English word forms is necessary:

>>> thresholds = determine_decile_thresholds ( frequencies )

>>> thresholds [4:9] [62 , 115 , 252 , 662 , 2617]

def determine_decile_thresholds(frequencies): """ Returns a vector of 9 thresholds delineating 10% of overall token coverage, i.e. the number of tokens that are needed to cover 10%, 20%, 30%, etc. of all tokens. """

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!