Question: The function get_most_relevant takes two arguments: docs, documents, in the form of a dictionary (keys and values are both strings) and query in the form

The function get_most_relevant takes two arguments: docs, documents, in the form of a dictionary (keys and values are both strings) and query in the form of a tuple of strings, and returns the list with document(s) having the highest relevance score. The relevance score is based on the number of times the provided query terms (words) appear in the corresponding document. More specifically, we define term frequency (TF) to be the frequency of a term t in a document d divided by the number of words in the document d. As an equation, this is TF=f(t,d)/ len (d) where f(t,d) is the frequency of term t (query word) in the document d and len (d) is the total count of terms (words) in the document d. If some term t does not appear in the document string, its score should be 0 , and it does not contribute to the relevance score. You should assume that sentences do not contain punctuation and words can be identified using white space. Documents should be lowercased. The query words are always provided in lowercase. If docs or query is empty, the function should return None. Some example calls to the function are: > get_most_relevant ({doc1:A cat sat on the mat', ', [ 'doc1', 'doc2'] get_most_relevant ({},()) None get_most_relevant(\{'doc1':'A cat sat on the mat', ', ['doc 2] get_most_relevant ({doc1:A cat sat on the mat', ', [] Provide one line of code in each blank to complete the function! Write your answers by replacing each "BLANK \#" comment with the correct line of code in the editor. def get_most_relevant(docs, query): \# if either is empty return None \# BLANK \#1 - PLEASE FILL HERE return None \# BLANK \#2 - PLEASE FILL HERE \# for each document, compute relevance score for doc_no, doc_text in docs.items(): \# BLANK \#3 - PLEASE FILL HERE relevant_count = len([word for word in doc_words if word in query]) prop_no] = relevant_count / len (doc_words) \# find the max relevance score \# BLANK \#4 - PLEASE FILL HERE \# return all tied relevant docs if non-zero else empty list return sorted(k for k, v in prop_dict.items() if v == top) if top != 0 else

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Write a function get_relevant which takes two arguments: docs, documents, in the form of a dictionary (keys and values are both strings) and query in the form of a list of strings, and return the...

Testcase file with output 1 Overview Wordle is an online word game, where you guess words and the computer tells you how close the word is to the secret word you're trying to find. Like the old game...

Task 1: Distance Map Requires: knowing how to design and implement a class In file distance_map.py, use the Class Design Recipe to define a class called DistanceMap that lets client code store and...

Using Python I can't figure it out the rest of the problem. What am I doing wrong on the one I'm getting errors on? These are the processes I created in the script. This is what I'm supposed to do in...

I can't figure these python problems out. I posted the functions below to solve the problems. I ONLY NEED THE PROBLEMS SOLVED, NOT THE FUNCTIONS...

3 CCPS506 Project Description Preamble You will write a program that plays the game of Boggle (with some minor modifications, described below). You are given an N*N grid of letters, and your task is...

CS 112 Project 5 Dictionaries and File IO Due Date: Sunday, April 23rd, 11:59pm Last chance to use tokens! (P6 won't allow late submissions) The purpose of this assignment is to explore dictionaries...

I need help with Python. These are the functions created in a script. I don't need help with the functions. I need help with the problems that are done after the functions are created. I created the...

Need help with finishing these functions by Python code and the code should be satisfied with these tests import unittest import re def string_splitting ( input_str, delim.): Split the given.string,...

Python : most work for any case not just test cases Function name: football_stars Parameters: Two dictionaries, one with strings of states as keys and lists of NFL team name strings as values, and...

Ariel builds innovative loudspeakers for music and home theatre. Identify the following costs as variable or fixed: a. Depreciation on equipment used to cut wood speaker enclosures b. Wood for...

Continuation of Exercise 5-34. Determine the following: (a) P(X 1.8, 1

Which of the following is cited as a good reason for NOT hedging currency exposures? Group of answer choices Currency risk management through hedging does not increase expected cash flows. All of...

(50 Points) Consider a pump station involving three pumps (Model 3756) where two pumps are connected in parallel and another pump is in series with other two. These pumps are delivering oil (specific...

How do Dimensional Database Models differ from Relational Models?

What type of processing do Relational Databases support?

Describe several aggregation operators.