Question:

/**
 * Inserts a document into the search engine for later analysis and retrieval.
 *
 * The document is uniquely identified by a documentId; attempts to re-insert the same
 * document are ignored.
 *
 * The document is supplied as a Reader; this method stores the document contents for
 * later analysis and retrieval.
 *
 * @param documentId
 * @param reader
 * @throws IOException iff the reader throws an exception
 */
public void addDocument(DocumentId documentId, Reader reader) throws IOException {
    String s = "";
    BufferedReader br = new BufferedReader(reader);
    while ((s = br.readLine()) != null) {
        list = Arrays.asList(s.toLowerCase().split("\\W+"));
        for (int i = 0; i < list.size(); i++) {
            if (!map.containsKey(list.get(i))) {
                Set newset = new HashSet<>();
                newset.add(documentId);
                map.put(list.get(i), newset);
            } else {
                Set set = map.get(list.get(i));
                set.add(documentId);
                map.put(list.get(i), set);
            }
        }
    }
}

/**
 * Returns the set of DocumentIds contained within the search engine that contain a given term.
 *
 * @param term
 * @return the set of DocumentIds that contain a given term
 */
public Set indexLookup(String term) {
    Set t = new HashSet();
    String k = term.toLowerCase();
    for (String doc : google.keySet()) {
        if (doc.contains(k)) {
            t.add(google.get(doc));
        }
    }
    return t;
}

/**
 * Returns the term frequency of a term in a particular document.
 *
 * The term frequency is the number of times the term appears in a document.
 *
 * See
 * @param documentId
 * @param term
 * @return the term frequency of a term in a particular document
 * @throws IllegalArgumentException if the documentId has not been added to the engine
 */
public int termFrequency(DocumentId documentId, String term) throws IllegalArgumentException {
}

/**
 * Returns the inverse document frequency of a term across all documents in the index.
 *
 * For our purposes, IDF is defined as log ((1 + N) / (1 + M)) where
 * N is the number of documents in total, and M
 * is the number of documents where the term appears.
 *
 * @param term
 * @return the inverse document frequency of term
 */
public double inverseDocumentFrequency(String term) {
}

/**
 * Returns a sorted list of documents, most relevant to least relevant, for the given term.
 *
 * A document with a larger tfidf score is more relevant than a document with a lower tfidf score.
 *
 * Each document in the returned list must contain the term.
 *
 * @param term
 * @return a list of documents sorted in descending order by tfidf
 */
public List relevanceLookup(String term) {
}

I have this code, and I want each method to do what the comment above it says. I have the following declared and imported:

import comparators.TfIdfComparator;
import documents.DocumentId;

public Map<String, Set<DocumentId>> map = new HashMap<>();

public List<String> list;

Please help me with the rest of the methods, in accordance with the map variable. Also check whether I have the first two (addDocument and indexLookup) implemented correctly.
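
A few observations on the two methods that are already written, followed by one possible way to fill in the rest. In indexLookup, the loop reads a variable named google rather than map, doc.contains(k) is a substring test (so "cat" would also match "category"), and t.add(google.get(doc)) adds a whole Set as a single element instead of its DocumentIds. In addDocument, the index is built correctly, but the map.put in the else branch is redundant, an empty string can be indexed when a line is blank or starts with punctuation, and nothing records how often a term occurs in each document, which termFrequency needs.

The sketch below is a minimal version under stated assumptions, not the expected solution: String stands in for DocumentId so the example compiles on its own, counts is an extra field that is not in the original code, "log" is taken to be the natural logarithm, and a lambda replaces TfIdfComparator because its interface is not shown in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch only: String stands in for DocumentId so the example is self-contained.
class SearchEngineSketch {
    // term -> ids of the documents that contain it (the question's `map`)
    public Map<String, Set<String>> map = new HashMap<>();
    // Assumed extra field: document id -> (term -> occurrences in that document)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    public void addDocument(String documentId, Reader reader) throws IOException {
        if (counts.containsKey(documentId)) {
            return; // re-inserting a known document is ignored
        }
        Map<String, Integer> docCounts = new HashMap<>();
        BufferedReader br = new BufferedReader(reader);
        String line;
        while ((line = br.readLine()) != null) {
            for (String token : line.toLowerCase().split("\\W+")) {
                if (token.isEmpty()) {
                    continue; // split yields "" for blank lines or leading punctuation
                }
                docCounts.merge(token, 1, Integer::sum);
            }
        }
        // Register the document only after the reader was consumed without error.
        counts.put(documentId, docCounts);
        for (String term : docCounts.keySet()) {
            map.computeIfAbsent(term, k -> new HashSet<>()).add(documentId);
        }
    }

    public Set<String> indexLookup(String term) {
        Set<String> result = new HashSet<>();
        Set<String> ids = map.get(term.toLowerCase()); // exact match on the key
        if (ids != null) {
            result.addAll(ids); // copy, so callers cannot modify the index
        }
        return result;
    }

    public int termFrequency(String documentId, String term) {
        Map<String, Integer> docCounts = counts.get(documentId);
        if (docCounts == null) {
            throw new IllegalArgumentException("Unknown document: " + documentId);
        }
        return docCounts.getOrDefault(term.toLowerCase(), 0);
    }

    public double inverseDocumentFrequency(String term) {
        double n = counts.size();             // N: documents in total
        double m = indexLookup(term).size();  // M: documents containing the term
        return Math.log((1 + n) / (1 + m));   // natural log is an assumption
    }

    public List<String> relevanceLookup(String term) {
        List<String> docs = new ArrayList<>(indexLookup(term));
        double idf = inverseDocumentFrequency(term);
        // Descending tf-idf: compare b against a.
        docs.sort((a, b) -> Double.compare(
                termFrequency(b, term) * idf, termFrequency(a, term) * idf));
        return docs;
    }
}
```

Keeping a per-document count map next to the term index is one design choice among several. The Javadoc also allows storing the raw document text and counting on demand, which uses less bookkeeping but makes every termFrequency call rescan the document.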
