Question: Term-by-Document Matrix

Term-by-Document Matrix
A 'term-by-document' matrix is a matrix that relates search terms to documents that contain
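relevant search terms. Consider a simple example with nine search terms and seven document titles.
For this example, a 9 × 7 term-by-document matrix can be constructed as
$$
A = \begin{bmatrix}
0 & 1 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}
$$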
where the row index indicates the term number and the column index indicates the document
number. An entry of one indicates that the term indexed by the row can be found in the document
title indexed by the column. For example, row 5 indicates that T5 can be found in documents D2
and D3. A given column in the matrix indicates which terms are contained within the associated
document title. For example, column 3 indicates that terms T2, T5 and T8 can be found in
document title D3.
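As a quick check of the row and column readings above, the matrix can be entered in NumPy and queried directly. This is a minimal sketch; the helper names `docs_with_term` and `terms_in_doc` are made up for illustration and are not part of the assignment.

```python
import numpy as np

# 9 x 7 term-by-document matrix: rows are terms T1..T9, columns are documents D1..D7.
A = np.array([
    [0, 1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
])


def docs_with_term(A, t):
    """Hypothetical helper: documents whose title contains term T<t> (1-based)."""
    return [f"D{j + 1}" for j in np.flatnonzero(A[t - 1])]


def terms_in_doc(A, d):
    """Hypothetical helper: terms contained in document title D<d> (1-based)."""
    return [f"T{i + 1}" for i in np.flatnonzero(A[:, d - 1])]


print(docs_with_term(A, 5))  # ['D2', 'D3']
print(terms_in_doc(A, 3))    # ['T2', 'T5', 'T8']
```

Note that NumPy indexes from zero, so term T5 lives in `A[4]` and document D3 in `A[:, 2]`.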
Search Query
Consider searching the above database represented by the term-by-document matrix for books on
'child proofing'. In this case we would search using a term query vector
$$
q = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix}^{T}
$$
which shows that terms T2 and T7 must be found in the desired document title. The goal then
would be to 'compare' the query vector against all column vectors in the term-by-document matrix
and order the comparison results with some kind of rating system.
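One simple way to 'compare' the query against every column is a plain dot product, which counts how many query terms each title contains. The sketch below is only an illustration of that idea, not a prescribed rating system; the variable names are arbitrary, and the matrix is rebuilt from row strings so the snippet runs on its own.

```python
import numpy as np

# Rows of the 9 x 7 term-by-document matrix, one string per term T1..T9.
rows = [
    "0101101", "0110000", "0000011",
    "0001000", "0110000", "1001000",
    "0000110", "0011000", "1001000",
]
A = np.array([[int(c) for c in r] for r in rows])

# Query vector for 'child proofing': ones at T2 and T7 (zero-based indices 1 and 6).
q = np.zeros(9, dtype=int)
q[[1, 6]] = 1

# q @ A takes the dot product of q with each column of A,
# i.e. the number of query terms found in each document title.
matches = q @ A

for j, m in enumerate(matches, start=1):
    print(f"D{j}: {m} query term(s) matched")
```

With this data, D2, D3, D5 and D6 each match exactly one query term and no title matches both, so raw counts alone cannot rank them. Scaling by vector length is one way to break such ties.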
Given the term-by-document matrix A and the query vector q shown above,
use Python and NumPy to perform the following computations.
1. Construct the matrix A and the query vector q.
2. Construct a new matrix $\hat{A}$ by normalizing the columns of A so that each column in $\hat{A}$ has
unit norm.
3. Normalize the query vector q to form a unit vector $\hat{q}$.
4. Compute the dot product between $\hat{q}$ and every column vector of $\hat{A}$. Interpret your calculation
and determine which document titles are most related to the query.
5. Upload your well-documented Python script (.py file) to the Search query matrix application
Assignment in Canvas. Make sure to create a comments section that provides the
results of your computation and explains your results within your Python script,
or you will not receive credit.
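The computations in steps 1 through 4 can be sketched as follows. This is one possible approach under stated assumptions: the Euclidean (2-)norm is used for normalization, the names `A_hat`, `q_hat`, `scores` and `ranking` are chosen here for illustration, and ties are left in document order.

```python
import numpy as np

# Step 1: term-by-document matrix A (9 terms x 7 documents) and query vector q.
A = np.array([
    [0, 1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
], dtype=float)
q = np.array([0, 1, 0, 0, 0, 0, 1, 0, 0], dtype=float)

# Step 2: divide each column by its Euclidean norm (assumed norm choice).
# No column of this A is all zeros, so the division is safe here.
col_norms = np.linalg.norm(A, axis=0)
A_hat = A / col_norms  # broadcasting scales column j by 1 / col_norms[j]

# Step 3: normalize the query to unit length.
q_hat = q / np.linalg.norm(q)

# Step 4: dot product of q_hat with every column of A_hat.
# For unit vectors this is the cosine of the angle between query and document.
scores = q_hat @ A_hat

# Rank documents from most to least similar; a stable sort keeps ties in document order.
ranking = np.argsort(-scores, kind="stable")
for j in ranking:
    print(f"D{j + 1}: {scores[j]:.4f}")
```

With this data the scores come out to about 0.5 for D5 and D6, about 0.408 (that is, 1/√6) for D2 and D3, and 0 for D1, D4 and D7. Each of the four nonzero titles contains exactly one of the two query terms; D5 and D6 rank higher because their titles contain only two terms in total, so the matching term carries more weight than in the three-term titles D2 and D3.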