A library management system is used to store information about books. Each book can
be associated with a set of tags. The system allows users to search for books by tag.
Consider the following book collection C which has 5 books: C={B1,B2,B3,B4,B5}.
Let Dict be a dictionary which consists of 3 tags: Dict = {t1 = Science Fiction, t2 =
Adventure, t3 = Non-Fiction}.
a) Denote by tf(t,B) the tag frequency (TF) of the tag t in the book B. Please fill out the
blank cells in the following table, i.e., give the values tf(ti,Bj). The value tf(ti,Bj) should
be put into the cell specified by ti and Bj. (Note: you can copy the form to your answer
sheet and then fill it out.)
b) Recall that the inverse document frequency (IDF) is defined as idf(t,C) = ln(|C| / |Ct|). Here |C|
denotes the number of books in the collection C, and |Ct| denotes the number of books from
C that contain tag t. Please compute idf(t1,C) and idf(t3,C).
c) Tag frequency inverse document frequency (TF-IDF) takes both tag frequency (TF) and inverse document frequency (IDF) into consideration. For the book collection C, the TF-IDF value of tag t on book B is defined as tf-idf(t,B,C) = tf(t,B) · idf(t,C), i.e., the product of t's TF value on B and t's IDF value. Please compute … and ….
d) Using TF-IDF, a book can be represented by a multi-dimension vector of TF-IDF values of all tags in the dictionary. Compute the vector for each book.
e) Similarly, any given query can be represented by a multi-dimension vector. Compute the vector for the query "Science Fiction".
f) Using document vectors, we can compute the relevance score of each book for a given query as the cosine similarity between the book vector and the query vector. The retrieval result of a query is the ranking of the books in decreasing order of relevance score. What is the result of the query "Science Fiction"?
g) What is the result of the query "Science"? Explain why. What solution would you suggest so that B2 and B3 are included in the result?
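
Parts a) to f) describe one pipeline: tag frequencies, IDF, TF-IDF vectors, then cosine ranking. The sketch below walks through it in Python. The tag-frequency table from the question is not reproduced here, so the counts in `TF` are made-up placeholders and the printed ranking is not the answer to the exercise. The helper names (`idf`, `tfidf_vector`, `cosine`, `rank`) and the choice to give weight 0 to a tag that occurs in no book are assumptions for illustration.

```python
import math

# Dictionary of tags from the question.
TAGS = ["Science Fiction", "Adventure", "Non-Fiction"]

# HYPOTHETICAL tag frequencies tf(t, B). These are placeholders,
# NOT the values from the question's table.
TF = {
    "B1": {"Science Fiction": 2, "Adventure": 1, "Non-Fiction": 0},
    "B2": {"Science Fiction": 0, "Adventure": 3, "Non-Fiction": 0},
    "B3": {"Science Fiction": 1, "Adventure": 0, "Non-Fiction": 0},
    "B4": {"Science Fiction": 0, "Adventure": 0, "Non-Fiction": 2},
    "B5": {"Science Fiction": 0, "Adventure": 1, "Non-Fiction": 1},
}


def idf(tag, tf_table):
    """idf(t, C) = ln(|C| / |Ct|), with Ct the books that contain tag t."""
    containing = sum(1 for counts in tf_table.values() if counts.get(tag, 0) > 0)
    if containing == 0:
        return 0.0  # assumption: a tag that occurs in no book gets weight 0
    return math.log(len(tf_table) / containing)


def tfidf_vector(counts, tf_table, tags=TAGS):
    """One TF-IDF value per dictionary tag: tf(t, B) * idf(t, C)."""
    return [counts.get(t, 0) * idf(t, tf_table) for t in tags]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_counts, tf_table):
    """Books sorted by decreasing cosine similarity to the query vector."""
    q = tfidf_vector(query_counts, tf_table)
    scores = {b: cosine(tfidf_vector(c, tf_table), q) for b, c in tf_table.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# The query "Science Fiction" contains tag t1 once.
ranking = rank({"Science Fiction": 1}, TF)
for book, score in ranking:
    print(book, round(score, 3))
```

To work the actual exercise, replace the placeholder counts in `TF` with the values from the table in part a); the rest of the sketch follows the definitions given in parts b) to f).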
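
For part g), one plausible reading is that queries are matched against whole tags, so "Science" matches no entry in Dict and the query vector is all zeros. The sketch below contrasts that with word-level matching, which is only one possible fix and is an assumption here, not the expected answer. The function names are hypothetical, and which books end up in the result still depends on the tag-frequency table, which is not reproduced here.

```python
import re

# Dictionary of tags from the question.
TAGS = ["Science Fiction", "Adventure", "Non-Fiction"]


def words(text):
    """Lower-cased alphabetic words, so "Non-Fiction" gives {"non", "fiction"}."""
    return set(re.findall(r"[a-z]+", text.lower()))


def exact_query_counts(query, tags=TAGS):
    """Tag-level matching: a tag counts only if the query equals the whole tag."""
    return {t: int(t.lower() == query.lower()) for t in tags}


def token_query_counts(query, tags=TAGS):
    """Word-level matching (assumed fix): a tag counts if it shares a word with the query."""
    q = words(query)
    return {t: int(bool(q & words(t))) for t in tags}


print(exact_query_counts("Science"))  # every tag gets 0: nothing to rank
print(token_query_counts("Science"))  # only "Science Fiction" gets 1
```

With word-level matching, the query "Science" produces the same query vector as "Science Fiction", so any book tagged Science Fiction gets a non-zero score. Other options, such as stemming or query expansion, would serve the same purpose.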
