Question: Implement the following three components: AppInterface, ProcessingEngine and IndexStore.
The IndexStore component is responsible for storing the DocumentMap and TermInvertedIndex indexes, and it exposes four services: putDocument, getDocument, updateIndex and lookupIndex.
The DocumentMap index stores a mapping between the relative path of each document and a unique document number. The document number can be generated incrementally as documents get indexed by the File Retrieval Engine; this technique is used to minimize the amount of memory used by the TermInvertedIndex. The DocumentMap can be updated through the putDocument method, which receives a document path as input and returns a unique document number, and it can be read through the getDocument method, which receives a document number as input and returns the document path.
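A minimal sketch of the DocumentMap described above may help. The class name DocumentMap, the 1-based numbering, and the choice to return the same number when a path is added twice are assumptions; the assignment only names the putDocument and getDocument services.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class: the assignment names only the two methods.
class DocumentMap {
    private final Map<String, Integer> numberByPath = new HashMap<>();
    private final List<String> pathByNumber = new ArrayList<>();

    // Returns the existing number for a known path, otherwise assigns the next one.
    // Assumption: numbering starts at 1 and re-adding a path is idempotent.
    public synchronized int putDocument(String documentPath) {
        Integer existing = numberByPath.get(documentPath);
        if (existing != null) {
            return existing;
        }
        pathByNumber.add(documentPath);
        int documentNumber = pathByNumber.size();
        numberByPath.put(documentPath, documentNumber);
        return documentNumber;
    }

    // Returns the path for a document number, or null if the number is unknown.
    public synchronized String getDocument(int documentNumber) {
        if (documentNumber < 1 || documentNumber > pathByNumber.size()) {
            return null;
        }
        return pathByNumber.get(documentNumber - 1);
    }
}
```

Keeping a list for the reverse direction makes getDocument a constant-time lookup, which suits the final step of a search where up to 10 document numbers are turned back into paths.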
The TermInvertedIndex index stores a dictionary where the keys are the words/terms extracted from the documents and each value is a list of pairs of numbers. The first number in each pair is the document number and the second is the number of times the word/term appeared in that document. The TermInvertedIndex can be updated through the updateIndex method, which receives as arguments a document number and a list of pairs of terms and term frequencies, and it can be queried through the lookupIndex method, which receives a term as input and returns a list of pairs of document numbers and term frequencies.
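One possible shape for the TermInvertedIndex is sketched below. The Posting class and the use of a Map<String, Integer> for the incoming (term, frequency) pairs are assumptions made for illustration; the assignment does not prescribe concrete types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pair type: (document number, term frequency).
class Posting {
    final int documentNumber;
    final int frequency;

    Posting(int documentNumber, int frequency) {
        this.documentNumber = documentNumber;
        this.frequency = frequency;
    }
}

// Hypothetical class name; only updateIndex and lookupIndex come from the assignment.
class TermInvertedIndex {
    private final Map<String, List<Posting>> postingsByTerm = new HashMap<>();

    // Appends one posting per term for the given document.
    // Assumption: the (term, frequency) pairs arrive as a Map<String, Integer>.
    public synchronized void updateIndex(int documentNumber, Map<String, Integer> termFrequencies) {
        for (Map.Entry<String, Integer> entry : termFrequencies.entrySet()) {
            postingsByTerm
                .computeIfAbsent(entry.getKey(), term -> new ArrayList<>())
                .add(new Posting(documentNumber, entry.getValue()));
        }
    }

    // Returns a copy of the postings for a term, or an empty list if the term is not indexed.
    public synchronized List<Posting> lookupIndex(String term) {
        List<Posting> postings = postingsByTerm.get(term);
        if (postings == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(postings);
    }
}
```

Returning a copy from lookupIndex keeps callers from modifying the stored lists, at the cost of one allocation per lookup.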
The ProcessingEngine component is responsible for indexing the documents read from an input folder and for processing search commands. The ProcessingEngine uses the services provided by the IndexStore to store and access the document mapping and the term inverted index. The ProcessingEngine exposes two services: indexFolder and search.
The indexFolder method receives an input folder path as an argument and builds an index from all of the documents found in the folder. To do this, it first needs to crawl the folder, create a list of document paths, and call putDocument to receive a document number for each document. Then it must extract all alphanumeric words [a-zA-Z0-9] that have a length greater than 2. Any non-alphanumeric character is considered a delimiter. While extracting the words/terms, the program must count the number of occurrences (the frequency) of each unique word/term in the document. For each document, after it has extracted all the unique alphanumeric words and computed their frequencies, the ProcessingEngine updates the IndexStore by calling updateIndex.
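The crawling and term-extraction steps above could be sketched as follows. The class name TermCounter and both method names are hypothetical, and treating terms as case-sensitive is an assumption, since the assignment does not mention case folding.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper used by indexFolder.
class TermCounter {
    // Recursively lists every regular file under the folder, in sorted order.
    static List<Path> listDocuments(Path folder) throws IOException {
        try (Stream<Path> paths = Files.walk(folder)) {
            return paths.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
    }

    // Splits on runs of characters outside [a-zA-Z0-9] and counts terms longer than 2 characters.
    // Assumption: terms are case-sensitive ("Cats" and "cats" are counted separately).
    static Map<String, Integer> countTerms(String text) {
        Map<String, Integer> frequencies = new HashMap<>();
        for (String token : text.split("[^a-zA-Z0-9]+")) {
            if (token.length() > 2) {
                frequencies.merge(token, 1, Integer::sum);
            }
        }
        return frequencies;
    }
}
```

An indexFolder implementation would then loop over listDocuments, call putDocument for each path, read the file contents, and pass the result of countTerms to updateIndex. Note that the underscore counts as a delimiter here, because it is outside [a-zA-Z0-9].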
The search method receives as an argument a list of terms from an AND query and returns the list of documents that contain all of the terms, together with the combined number of occurrences of all terms per document. Your program needs to support queries with at least 3 terms. To calculate the result of a search query, the ProcessingEngine needs to call lookupIndex for each input term. Then it must combine the results for each term by implementing an intersection mechanism with the following rules: if a document number exists in all lookup results, include it in the final result, and calculate its final frequency as the sum of its frequencies across all lookup results. Then the ProcessingEngine needs to sort the final list of pairs of document numbers and frequencies in descending order by frequency and keep only the top 10 results. Finally, the ProcessingEngine must call getDocument to get the document paths for the final results, which are returned together with the frequencies.
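The intersection, summing, and ranking rules above could look roughly like this. The class and method names are hypothetical, each pair is modeled as an int[] holding {documentNumber, frequency}, and breaking frequency ties by ascending document number is an assumption. The sketch also assumes each document appears at most once in a single lookup result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper used by search.
class AndQuery {
    // lookupResults holds one lookupIndex result per query term.
    // Returns at most `limit` pairs {documentNumber, summedFrequency}, highest frequency first.
    static List<int[]> intersectAndRank(List<List<int[]>> lookupResults, int limit) {
        Map<Integer, Integer> totals = null;
        for (List<int[]> postings : lookupResults) {
            Map<Integer, Integer> next = new HashMap<>();
            for (int[] posting : postings) {
                int documentNumber = posting[0];
                int frequency = posting[1];
                if (totals == null) {
                    // First term: every document is a candidate.
                    next.put(documentNumber, frequency);
                } else if (totals.containsKey(documentNumber)) {
                    // Later terms: keep only documents seen for all previous terms.
                    next.put(documentNumber, totals.get(documentNumber) + frequency);
                }
            }
            totals = next;
        }

        List<int[]> ranked = new ArrayList<>();
        if (totals != null) {
            for (Map.Entry<Integer, Integer> entry : totals.entrySet()) {
                ranked.add(new int[] {entry.getKey(), entry.getValue()});
            }
        }
        // Descending by frequency; ties broken by ascending document number (assumption).
        ranked.sort((a, b) -> a[1] != b[1]
                ? Integer.compare(b[1], a[1])
                : Integer.compare(a[0], b[0]));
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }
}
```

Calling intersectAndRank with a limit of 10 gives the top 10 results; each returned document number would then be passed to getDocument to recover its path.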
The AppInterface component is responsible for implementing a command line interface that the user can use to interact with the File Retrieval Engine. The command line interface must interpret the indexing and search commands submitted by the user, forward them to the ProcessingEngine, and print the results of the commands on the screen.
The File Retrieval Engine must support the following commands:
quit: this command closes the application gracefully.
index : this command tells the File Retrieval Engine to crawl the given folder path, find all the documents in it, and build an index from those documents. Sequences of alphanumeric characters [a-zA-Z0-9] that are separated by any non-alphanumeric character(s) and are longer than 2 characters are considered terms that need to be indexed.
search : for example, if the user inputs the query cats AND dogs, the File Retrieval Engine must return all the documents that contain both cats and dogs, sort the returned documents by the total number of occurrences of cats and dogs in each document, and return the top 10 documents.
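The command handling in the AppInterface could start from a small parser like the one below. The class and method names are hypothetical, and the sketch assumes that the command word is followed by a space and then its argument (a folder path for index, an AND query for search), since the exact command syntax is not spelled out above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for the command line interface.
class CommandParser {
    // Returns the first word of the line in lowercase, or "" for a blank line.
    static String commandName(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        return trimmed.split("\\s+", 2)[0].toLowerCase(Locale.ROOT);
    }

    // Returns everything after the command word, or "" if there is no argument.
    static String argument(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        return parts.length < 2 ? "" : parts[1].trim();
    }

    // Splits an AND query such as "cats AND dogs" into its terms.
    static List<String> searchTerms(String query) {
        List<String> terms = new ArrayList<>();
        for (String part : query.trim().split("\\s+AND\\s+")) {
            String term = part.trim();
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }
}
```

A read loop in AppInterface would read one line at a time, switch on commandName, pass argument to indexFolder for index, pass searchTerms(argument) to search for search, and exit the loop on quit.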
Give me code in Java with a screenshot of the output.
