Question 1
In information retrieval, extremely common words that appear to be of little value in helping select documents, and that are therefore excluded from the index vocabulary, are called:
Select one:
a. Stop Words
b. Tokens
c. Lemmatized Words
d. Stemmed Terms
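To make the exclusion described in Question 1 concrete, here is a minimal sketch. The word list `COMMON_WORDS` and the function name `index_terms` are hypothetical and chosen only for illustration; real systems use much longer, collection-specific lists.

```python
# Hypothetical, deliberately tiny list of very common English words.
COMMON_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}

def index_terms(text):
    """Lowercase the text, split on whitespace, and drop the common words."""
    return [t for t in text.lower().split() if t not in COMMON_WORDS]

print(index_terms("The history of the city"))  # prints ['history', 'city']
```

Only the remaining terms would be added to the index vocabulary.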
Question 2
A process that reduces the size of a vocabulary by reducing words to their 'root' form is called:
Select one:
a. Stemming
b. Lemmatizing
c. Removal of stop words
d. Posting
e. Pruning
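As a rough illustration of how reducing words to a root shrinks a vocabulary, the sketch below strips a few suffixes. It is a toy under stated assumptions, not an established algorithm: the suffix list `SUFFIXES` and the function `reduce_word` are invented for this example, and production systems use far more careful rules.

```python
# Hypothetical suffix list, checked in this order; not a real stemmer's rule set.
SUFFIXES = ("ing", "ion", "ed", "s")

def reduce_word(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = ["connect", "connected", "connecting", "connection", "connects"]
vocabulary = {reduce_word(w) for w in words}
print(vocabulary)  # prints {'connect'}
```

Five distinct surface forms collapse to a single vocabulary entry.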
Question 3
True/False: Given two strings s1 and s2, the edit distance between them is sometimes known as the Levenshtein distance.
Select one:
True
False
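For reference, the edit distance between two strings s1 and s2 can be computed with the standard dynamic-programming recurrence. The sketch below assumes unit cost for each insertion, deletion, and substitution; the function name `edit_distance` is chosen for this example.

```python
def edit_distance(s1, s2):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn s1 into s2 (unit costs assumed)."""
    # prev[j] holds the distance between the processed prefix of s1 and s2[:j].
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        curr = [i]  # turning s1[:i] into the empty string takes i deletions
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            curr.append(min(prev[j] + 1,          # delete c1
                            curr[j - 1] + 1,      # insert c2
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # prints 3
```

Keeping only two rows of the table keeps memory proportional to the length of s2.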
Question 4
True/False: In the bag of words model, the exact ordering of terms within the document is not relevant to processing.
Select one:
True
False
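A small sketch of a bag of words representation may help when reasoning about Question 4. It assumes simple lowercasing and whitespace tokenization, and the function name `bag_of_words` is chosen for this example.

```python
from collections import Counter

def bag_of_words(text):
    """Represent a document as a multiset of its terms (term -> count)."""
    return Counter(text.lower().split())

a = bag_of_words("Mary is quicker than John")
b = bag_of_words("John is quicker than Mary")
print(a == b)  # prints True
```

The two sentences differ in meaning, but their term counts are identical.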
Question 5
An approach to dictionary compression that exploits the redundancy created by the common prefixes shared by consecutive terms in a sorted dictionary is called:
Select one:
a. Front Coding
b. Blocked storage
c. Prefix Coding
d. Variable byte encoding
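The redundancy described in Question 5 can be illustrated with a simplified sketch in which each term is stored as the length of the prefix it shares with the previous term plus the remaining suffix. This is an assumption-laden variant for illustration (textbook schemes typically work on blocks of terms), and the function names are hypothetical.

```python
import os

def compress_sorted_terms(sorted_terms):
    """Encode each term as (shared prefix length with previous term, suffix)."""
    encoded = []
    prev = ""
    for term in sorted_terms:
        # os.path.commonprefix compares strings character by character.
        shared = len(os.path.commonprefix([prev, term]))
        encoded.append((shared, term[shared:]))
        prev = term
    return encoded

def decompress_terms(encoded):
    """Rebuild the original terms from (shared length, suffix) pairs."""
    terms = []
    prev = ""
    for shared, suffix in encoded:
        prev = prev[:shared] + suffix
        terms.append(prev)
    return terms

terms = ["automata", "automate", "automatic", "automation"]
encoded = compress_sorted_terms(terms)
print(encoded)
# prints [(0, 'automata'), (7, 'e'), (7, 'ic'), (8, 'on')]
```

Because the terms are sorted, neighbours tend to share long prefixes, so most entries store only a short suffix.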
