Question: Write a function score_document(document,lang_counts=default_lang_counts) which takes as input a document name as a string and a dictionary of dictionaries containing normalised language counts called lang_counts.

Write a function score_document(document,lang_counts=default_lang_counts) which takes as input a document name as a string and a dictionary of dictionaries containing normalised language counts called lang_counts. It should return a dictionary of scores for each language in lang_counts, as obtained by performing a 'dot product' of trigram counts from the document with the normalised language counts. That is, it should multiply the trigram counts from the document with the trigram counts in lang_counts and add the whole lot up. If a trigram from the document is not in the dictionary for a given language, assume the count for the language as zero.

We have provided a stub of code which trains the classifier for you. We have also provided train_classifier(training_set) in a hidden library.

There are also two files included, visible in the tabs at top right. These are en_163083.txt, written in English, and de_1231811.txt, written in German, and can be loaded and used to test your function, which should behave as follows:

>>> test1 = 'en_163083.txt'

>>> d = score_document(test1)

>>> d['Vietnamese']

9.427325768357315

>>> max([(v, n) for (n, v) in d.items()])

(21.428216914833023, 'English')

>>> test2 = 'de_1231811.txt'

>>> d = score_document(test2)

>>> d['Polish']

7.710346556417009

>>> max([(v,n) for (n, v) in d.items()])

(53.12937809633241, 'German')

How to code this in python??

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

python, this code do not work, please help Write a function score_document (document, lang_counts=default_lang_counts) which takes as input a document name as a string and a dictionary of...

Python 3 code. Please 1 from h1dden_11b 1mport count trigrams 2 from h1dden_11b 1mport tra1n class1f1er Write a function score_document (document, lang_counts default_lang_counts) which takes as...

Python 3 code. Please. Problem program.pyen 163083.txt ade 1231811.txt a pl188313.txt 1 from hidden lib import train classifier We are finally ready to put all the pieces together! We can now measure...

Write a Makefile that builds a C program phonebook.c and produces the executable object file phonebook. Your Makefile should use the compiler flags discussed in class, and have the all and clean...

Python 3 w/ PyCharm - Starter file: https://pastebin.com/rPcaHdgi Pokemon are fantastic creatures that people can catch and train. Pokemon are classified into different types such as grass, fire or...

Tasks The goal of the project is to complete the code for the NgramAnalyser, MarkovModel, ModelMatcher and MatcherController classes, as detailed below, and to add test code to a new JUnit test...

Introduction Your task for this assignment is to build a command-line spell-checking program. You may be more familiar with spell checking as a feature inside an editor, but what you will build is a...

Hey can i get the code written in python! i uploaded description of each part for better understanding, Thanks! pokemonLocations Question 2 (23 points): Purpose: To practice the concept of dictionary...

Python code, Python code, Python code. Overview Pokemon are fantastic creatures that people can catch and train. Pokemon are classified into different 'types such as grass, fire or electric. To catch...

Human Resource is every managers business. Why is the management of human resources key to every management job in every organization, regardless of whether there is a human resources department in...

(2 points each) Let lim f(x)=0, lim g(x)=7, and lim_h(x) = -2. Evaluate the follow- 2-1 4-1 ing, if possible. (a) lim (g(x) + 4h(r)] (b) lim h(r) (e) lim [f(r)g(x)) (d) lim (e) g(x) __lim (x 3g(x)]

It information can be used to generate profits, the financial market is not expected to be - form efficient.

Seved Help 14 Wisconsin Snowmobile Corp. is considering a switch to level production Cost efficiencies would occur under level production, and aftertax costs would decline by $31,500, but inventory...

Question Can a stock bonus plan or ESOP hold life insurance or investments other than employer stock?

Question What are the requirements for a safe harbor 401(k) plan?30

Question Can S corporation ownership of employer stock be used as part of a plan to convert nondeductible benefits for corporate shareholders into deductible items?