Question: Consider ranking of documents using a Language Modeling-based IR algorithm.

Consider ranking of documents using a Language Modeling-based IR algorithm.
$P(q \mid M_d) = \prod_{\text{distinct term } t \text{ in } q} P(t \mid M_d)^{tf_{t,q}}$
where $tf_{t,q}$ is the term frequency, i.e. the number of occurrences of $t$ in the query $q$.
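As a minimal sketch of the query-likelihood product, the snippet below multiplies P(t | M_d) raised to tf_{t,q} over the distinct query terms. The function name `query_likelihood`, the `term_prob` callback, and the toy probability table are illustrative assumptions and are not part of the question.

```python
from collections import Counter
from math import prod


def query_likelihood(query_tokens, term_prob):
    """Product over distinct query terms t of term_prob(t) ** tf_{t,q}."""
    tf_q = Counter(query_tokens)  # tf_{t,q} for each distinct term t
    return prod(term_prob(t) ** tf for t, tf in tf_q.items())


# Hypothetical toy model: a fixed probability table, 0.0 for unseen terms.
toy_model = {"a": 0.5, "b": 0.25}
score = query_likelihood(["a", "b", "a"], lambda t: toy_model.get(t, 0.0))
print(score)  # 0.5 ** 2 * 0.25
```

For long queries the product can underflow, so summing log-probabilities is a common alternative; that only works once every term has a non-zero probability.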
We estimate the parameters $P(t \mid M_d)$ using the Maximum Likelihood Estimate (MLE):
$P(t \mid M_d) = \frac{tf_{t,d}}{|d|}$
where
$|d|$ is the length of document $d$ (its number of tokens).
$tf_{t,d}$ is the term frequency, i.e. the number of occurrences of $t$ in document $d$.
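The MLE can be sketched in a few lines. The helper name `mle_model` and the toy document are assumptions made for illustration. Note that a term absent from the document gets probability exactly zero, which would zero out the whole query-likelihood product.

```python
from collections import Counter


def mle_model(doc_tokens):
    """Return a function t -> tf_{t,d} / |d| for the given tokenized document."""
    tf_d = Counter(doc_tokens)   # tf_{t,d}
    length = len(doc_tokens)     # |d|
    return lambda t: tf_d[t] / length


# Hypothetical toy document of six tokens.
p = mle_model("to be or not to be".split())
print(p("be"))        # tf = 2, |d| = 6
print(p("question"))  # unseen term: probability 0.0
```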
To avoid problems with zero probabilities, we smooth the estimates. First, we define:
$P(t \mid M_c) = \frac{cf_t}{T}$
where
$M_c$ is the collection model.
$cf_t$ is the number of occurrences of $t$ in the collection.
$T = \sum_t cf_t$ is the total number of tokens in the collection.
We use $P(t \mid M_c)$ to smooth $P(t \mid d)$ using Jelinek-Mercer smoothing as:
$P(t \mid d) = \lambda \, P(t \mid M_d) + (1 - \lambda) \, P(t \mid M_c)$
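Jelinek-Mercer smoothing is a linear interpolation between the document model and the collection model. The sketch below assumes the common textbook convention in which λ weights the document model and (1 - λ) weights the collection model. The formula in the question is garbled, so that placement is an assumption, as is the function name `jelinek_mercer`.

```python
def jelinek_mercer(p_doc, p_coll, lam=0.75):
    """Interpolate a document-model probability with a collection-model probability.

    Assumption: lam weights the document model, (1 - lam) the collection model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_doc + (1.0 - lam) * p_coll


# A term missing from the document (p_doc = 0) still gets non-zero mass
# as long as it occurs somewhere in the collection.
smoothed = jelinek_mercer(p_doc=0.0, p_coll=0.1, lam=0.75)
print(smoothed)
```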
Consider the following documents (d1 and d2) and query (q):
d1= epistemological considerations should also address learning design.
d2= epistemological considerations such as what is being measured.
q : epistemological design
Rank d1 and d2 with respect to q using Jelinek-Mercer smoothing. Use λ = 0.75.
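One way to check the ranking is to compute both scores directly. The sketch below makes three assumptions that the question does not state explicitly: the collection consists of d1 and d2 only, tokens are lowercased words with the trailing period stripped (so |d1| = 7, |d2| = 8, T = 15), and λ weights the document model. The helper names `tokenize` and `score` are illustrative.

```python
from collections import Counter
from math import prod
import string


def tokenize(text):
    # Assumption: lowercase, split on whitespace, strip surrounding punctuation.
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w]


docs = {
    "d1": "epistemological considerations should also address learning design.",
    "d2": "epistemological considerations such as what is being measured.",
}
query = "epistemological design"
LAMBDA = 0.75  # assumed to weight the document model

doc_tokens = {name: tokenize(text) for name, text in docs.items()}
cf = Counter(t for tokens in doc_tokens.values() for t in tokens)  # cf_t
T = sum(cf.values())                                               # total tokens


def score(name):
    """Jelinek-Mercer smoothed query likelihood of `query` under document `name`."""
    tf_d = Counter(doc_tokens[name])
    length = len(doc_tokens[name])
    tf_q = Counter(tokenize(query))
    return prod(
        (LAMBDA * tf_d[t] / length + (1 - LAMBDA) * cf[t] / T) ** n
        for t, n in tf_q.items()
    )


scores = {name: score(name) for name in docs}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Under these assumptions the scores come out to about 0.0174 for d1 and 0.0021 for d2, so d1 ranks above d2. The order is the same if λ is instead taken to weight the collection model, because only d1 contains "design".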