Question: I already wrote a program to find the matching scores, but I don't know how to get the ranking results. Can you please explain how to find the ranking results based on the question provided?

1. In an information retrieval (IR) system (e.g., a web search engine or a library search system), documents (e.g., webpages) are organized as texts composed of different terms. Given a query, the IR system searches the documents for the terms contained in the query and counts the term frequency (tf). The system then evaluates the relevance of each document to the query and returns a list of documents ordered by their ranking. The ranking is calculated from a matching score based on the term frequency. A typical formula is as follows:

(a) For a term t that appears in both the query (q) and the document (d), its weight is calculated from the term frequency $tf_{t,d}$. The log frequency weight of term t in document d is:

$$w_{t,d} = \begin{cases} 1 + \log_{10} tf_{t,d}, & \text{if } tf_{t,d} > 0 \\ 0, & \text{otherwise} \end{cases}$$

E.g., tf 0 -> 0, 1 -> 1, 2 -> 1.3, 10 -> 2, 1000 -> 4, etc.

(b) Sum the weights over the terms t that appear in both q and d:

$$\text{matching score}(q, d) = \sum_{t \in q \cap d} w_{t,d}$$

For example, given the query "apple computer", suppose there are three documents in the IR system: (1) "apple computer is excellent"; (2) "apple cider, apple jam, apple tree"; (3) "computer science and computer engineering".

To calculate the matching scores of the documents: document (1) contains both terms "apple" and "computer", and each term's frequency is 1. Thus, the weight for "apple" is 1 + log10(1) = 1, the weight for "computer" is also 1 + log10(1) = 1, and the total matching score is 1 + 1 = 2. Document (2) contains only the query term "apple", with frequency 3; thus, the weight for "apple" is 1 + log10(3) ≈ 1.48, and the total matching score is 1.48. Document (3) contains only the query term "computer", with frequency 2; thus, the weight for "computer" is 1 + log10(2) ≈ 1.30, which is also the total matching score. Thus, the ranking of the three documents will be (1), (2), and (3).
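The worked example above can be checked with a short sketch. It assumes that terms are matched case-insensitively and that punctuation is ignored, neither of which the problem states explicitly. The names `tokenize`, `log_tf_weight`, and `matching_score` are illustrative and not part of the assignment.

```python
import math
import re


def tokenize(text):
    # Assumption: lowercase everything and keep only runs of letters/digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def log_tf_weight(tf):
    # Log frequency weight: 1 + log10(tf) when tf > 0, otherwise 0.
    return 1 + math.log10(tf) if tf > 0 else 0.0


def matching_score(query, document):
    doc_terms = tokenize(document)
    # Use a set so a term repeated in the query is only counted once.
    query_terms = set(tokenize(query))
    return sum(log_tf_weight(doc_terms.count(term)) for term in query_terms)


example_docs = [
    "apple computer is excellent",
    "apple cider, apple jam, apple tree",
    "computer science and computer engineering",
]
example_scores = [matching_score("apple computer", doc) for doc in example_docs]
for number, score in enumerate(example_scores, start=1):
    print(f"document ({number}): {score:.2f}")
```

Under these assumptions the three scores come out as 2.00, 1.48, and 1.30, in line with the ranking (1), (2), (3) given in the example.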
In the real world (e.g., web search engines), the higher a document (or linked web page) is ranked, the earlier it is displayed. Lower-ranked documents are often dropped (i.e., treated as not relevant). Following the above example, given the query "machine learning" and four documents in the IR system, (1) "learning English, learning French, learning Spanish"; (2) "machine learning, statistical learning, and deep learning"; (3) "MACHINE LEARNING FOR FUN"; (4) "Computing Machine and Sewing Machine", write a Python program that implements term searching, frequency counting, and calculation of the matching scores.
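Once each document has a matching score, the ranking is the list of documents sorted by score from highest to lowest. A minimal sketch of that step follows. It assumes case-insensitive matching (so "MACHINE" and "Machine" both count as "machine") and punctuation stripping, which the problem does not spell out. `matching_score` and `rank_documents` are hypothetical helper names.

```python
import math
import re
from collections import Counter


def matching_score(query, document):
    """Sum of log frequency weights over query terms found in the document."""
    # Assumption: lowercase the text and ignore punctuation when counting terms.
    tf = Counter(re.findall(r"[a-z0-9]+", document.lower()))
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    return sum(1 + math.log10(tf[t]) for t in query_terms if tf[t] > 0)


def rank_documents(query, documents):
    """Return (document number, score) pairs, highest score first."""
    scored = [(number, matching_score(query, doc))
              for number, doc in enumerate(documents, start=1)]
    # sorted() is stable, so documents with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


documents = [
    "learning English, learning French, learning Spanish",
    "machine learning, statistical learning, and deep learning",
    "MACHINE LEARNING FOR FUN",
    "Computing Machine and Sewing Machine",
]

ranking = rank_documents("machine learning", documents)
for rank, (number, score) in enumerate(ranking, start=1):
    print(f"rank {rank}: document ({number}), score = {score:.2f}")
```

With these assumptions the scores are 1.48, 2.48, 2.00, and 1.30 for documents (1) to (4), so the ranking is (2), (3), (1), (4). If matching were case-sensitive instead, documents (3) and (4) would both score 0, so it is worth confirming which behavior the assignment expects.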
