Question: tf-idf example (lnc.ltc), Introduction to Information Retrieval, Sec. 6.4

From Introduction to Information Retrieval, Sec. 6.4. tf-idf example: lnc.ltc

Document: car insurance auto insurance
Query: best car insurance

| Term | Query tf-raw | Query tf-wt | df | idf | Query wt | Query n'lize | Doc tf-raw | Doc tf-wt | Doc wt | Doc n'lize | Prod |
|---|---|---|---|---|---|---|---|---|---|---|---|
| auto | 0 | 0 | 5000 | 2.3 | 0 | 0 | 1 | 1 | 1 | 0.52 | 0 |
| best | 1 | 1 | 50000 | 1.3 | 1.3 | 0.34 | 0 | 0 | 0 | 0 | 0 |
| car | 1 | 1 | 10000 | 2.0 | 2.0 | 0.52 | 1 | 1 | 1 | 0.52 | 0.27 |
| insurance | 1 | 1 | 1000 | 3.0 | 3.0 | 0.78 | 2 | 1.3 | 1.3 | 0.68 | 0.53 |

Exercise: what is N, the number of docs?

Doc length = $\sqrt{1^2 + 0^2 + 1^2 + 1.3^2} \approx 1.92$
Score = 0 + 0 + 0.27 + 0.53 = 0.8

1) Repeat the calculation for the following 2 documents (leave the query info the same):
d1 (already done): car insurance auto insurance
d2 (new doc): car auto insurance auto
d3 (new doc): car car auto insurance car
Compare the scores between the three documents, then normalize the results.

2) Repeat for all three docs for the Jaccard coefficient, then normalize the results.
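
The calculation asked for here can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a reference solution: the function names are made up for illustration, idf is assumed to be log10(N/df), and N = 1,000,000 is inferred from the idf column (for example log10(1,000,000 / 5000) ≈ 2.3) rather than stated in the question. "Normalize the results" is read here as dividing each score by the sum of the scores, which is only one possible interpretation.

```python
import math
from collections import Counter

# Assumption: N is inferred from the idf column via idf = log10(N / df).
N = 1_000_000
DF = {"auto": 5000, "best": 50000, "car": 10000, "insurance": 1000}


def log_tf(tf):
    """Logarithmic term frequency (the 'l' in lnc / ltc)."""
    return 1 + math.log10(tf) if tf > 0 else 0.0


def cosine_normalize(weights):
    """Divide every weight by the vector's Euclidean length (the 'c')."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: (w / norm if norm else 0.0) for t, w in weights.items()}


def doc_vector(doc):
    """lnc: log tf, no idf, cosine normalization."""
    tf = Counter(doc.split())
    return cosine_normalize({t: log_tf(tf[t]) for t in DF})


def query_vector(query):
    """ltc: log tf, idf, cosine normalization."""
    tf = Counter(query.split())
    return cosine_normalize(
        {t: log_tf(tf[t]) * math.log10(N / DF[t]) for t in DF}
    )


def score(query, doc):
    """Dot product of the normalized query and document vectors."""
    q, d = query_vector(query), doc_vector(doc)
    return sum(q[t] * d[t] for t in DF)


def jaccard(query, doc):
    """Jaccard coefficient on the sets of distinct terms."""
    a, b = set(query.split()), set(doc.split())
    return len(a & b) / len(a | b)


def normalize_to_sum(scores):
    """One possible reading of 'normalize': make the scores sum to 1."""
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}


QUERY = "best car insurance"
DOCS = {
    "d1": "car insurance auto insurance",
    "d2": "car auto insurance auto",
    "d3": "car car auto insurance car",
}

tfidf_scores = {name: score(QUERY, text) for name, text in DOCS.items()}
jaccard_scores = {name: jaccard(QUERY, text) for name, text in DOCS.items()}

for name in DOCS:
    print(name, round(tfidf_scores[name], 2), jaccard_scores[name])
print(normalize_to_sum(tfidf_scores))
print(normalize_to_sum(jaccard_scores))
```

Under these assumptions the lnc.ltc scores come out at roughly 0.80 for d1, 0.68 for d2 and 0.76 for d3, so d1 ranks first. The Jaccard coefficient is 0.5 for all three documents, because each one shares the two terms car and insurance with the query out of four distinct terms in total, and it ignores how often a term occurs.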
