Question:
Your inverter will take exactly one argument: a file that contains a list of filenames. Each filename will appear on a separate line.
Each of the files described in the first file will contain text that you will build your index from. For example:
inputs.txt:
    foo1.txt
    foo2.txt
foo1.txt:
    this is a test. cool.
foo2.txt:
    this is also a test. boring.
Output
Your inverter should output a string of all of the words from all of the inputs, in alphabetical order, followed by the document numbers in which they appear, in order. For example (note: your program must produce exactly this output):
a: 0 1
also: 1
boring: 1
cool: 0
is: 0 1
test: 0 1
this: 0 1
Alphabetical order is defined as ASCII order, so The and the are separate words, and The comes first. Only certain words should be indexed: a word is any run of alphabetic characters only, with no digits, spaces, punctuation, etc. For example, Th3e is two words, Th and e.
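The word rule can be illustrated with a small sketch. The helper name splitWords is hypothetical (it is not part of the assignment), and the code assumes the default "C" locale, where std::isalpha accepts only A-Z and a-z:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split text into maximal runs of alphabetic characters.
// Any non-alphabetic character (digit, space, punctuation) ends the current word.
std::vector<std::string> splitWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            current += static_cast<char>(ch);
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    // Flush a word that runs up to the end of the text.
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

With this sketch, splitWords("Th3e") yields the two words Th and e. Ordinary std::string comparison already puts The before the, because uppercase letters have smaller ASCII codes than lowercase ones.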
Files are incrementally numbered, starting with 0. Only valid, openable files should be included in the count. (is_open comes in handy here)
Your program should absolutely not produce any other output. Extraneous output, or output formatted incorrectly (extra spaces etc.) will make the autograder mark your solution as incorrect. Please leave yourself extra days to work these problems out.
Implementation Hints
Implement the data structure using the C++ Standard Template Library (STL) as a map of sets, as in:
map<string, set<int>> invertedIndex;
Use C++ strings
#include <string>
and file streams:
#include <fstream>
Remember, your program needs to be robust to errors. Files may be empty, etc. Please handle these cases gracefully (silently) and with no extra output.
The noskipws manipulator may be useful in parsing the input, since it stops the stream from discarding whitespace when reading characters:
input >> noskipws >> c;
