Question: Word & Line Concordance Application (Python)
Use Python programming.

WORD & LINE CONCORDANCE APPLICATION

The goal of this assignment is to process a textual data file (WarAndPeace.txt) to generate a word concordance with line numbers for each main word. A dictionary ADT is perfect for storing the word concordance, with the word being the dictionary key and a Python list of its line numbers being the value associated with the key. Since the concordance should only keep track of the "main" words, there will be a second stop-words file (stop_words.txt). The stop-words file will contain a list of stop words (e.g., "a", "the", etc.); these words will not be included in the concordance even if they do appear in the data file.

Sample files might be:

Sample stop_words_small.txt file (one word per line, as far as the listing can be recovered):
about be by can do in on the this to

Sample hw6small.txt file (the exact break between lines 1 and 2 is not recoverable; it falls after "file" and before "processed"):
This is a sample data (text) file to
be processed by your word-concordance program

The real data file is much bigger

Sample output file:
bigger: 4
concordance: 2
data: 1 4
file: 1 4
much: 4
processed: 2
program: 2
real: 4
sample: 1
text: 1
word: 2
your: 2

Notes:
1) Words are defined to be sequences of letters delimited by any non-letter (e.g., white space, punctuation, parentheses, dashes, double quotes, etc.).
2) There is to be no distinction made between upper and lower case letters (e.g., "CAT" is the same word as "cat").
3) Blank lines are to be counted in the line numbering (e.g., line 3 above is blank).

The general algorithm for the word-concordance program is:
1) Read the stop_words_small.txt (or stop_words.txt) file into a dictionary (use the same type of dictionary that you're timing) containing only stop words, called stopWordDict. (WARNING: Strip the newline ('\n') character from the end of the stop word before adding it to stopWordDict.)
2) Process the hw6small.txt (or WarAndPeace.txt) file one line at a time to build the word-concordance dictionary (called wordConcordanceDict) containing "main" words for the keys with a list of their associated line numbers as their values.
The main loop is something like:

    lineCounter = 1
    for each line in the data file do
        processLine(lineCounter, line, wordConcordanceDict)
        lineCounter += 1

3) Traverse the wordConcordanceDict alphabetically by key to generate a text file containing the concordance words printed out in alphabetical order along with their corresponding line numbers.

The general algorithm for the processLine(lineCounter, line, wordConcordanceDict) function is:

    wordList = createWordList(line)
    for each word in the wordList do
        if the word is not in the stopWordDict then
            if the word is in the wordConcordanceDict then
                look up the line-#-list value associated with the word in the wordConcordanceDict
                append the lineCounter to the end of the line-#-list
            else
                add the word with an associated [lineCounter] list value to the wordConcordanceDict

(Note: It is strongly suggested that the logic for reading words and assigning line numbers to them be developed and tested separately from other aspects of the program. This could be accomplished by reading a sample file and printing out the words recognized with their corresponding line numbers without any other word processing.)
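The algorithm described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the assignment's reference solution: it uses the built-in dict for both stopWordDict and wordConcordanceDict (the assignment asks for whichever dictionary type you are timing), it treats only the ASCII letters a-z as letters, and the snake_case function names and the helper write_concordance are illustrative choices that the assignment does not prescribe.

```python
import re


def create_word_list(line):
    """Return the words on a line: lowercase runs of letters.

    Assumption: only ASCII letters count as letters, so every other
    character (spaces, digits, punctuation, dashes) acts as a delimiter.
    """
    return re.findall(r"[a-z]+", line.lower())


def read_stop_words(stop_path):
    """Read one stop word per line into a dictionary keyed by the word."""
    stop_word_dict = {}
    with open(stop_path) as stop_file:
        for raw_line in stop_file:
            word = raw_line.strip().lower()  # strips the trailing newline
            if word:
                stop_word_dict[word] = True
    return stop_word_dict


def process_line(line_counter, line, word_concordance_dict, stop_word_dict):
    """Record line_counter for every non-stop word found on the line.

    This follows the pseudocode literally, so a word that occurs twice
    on one line gets that line number appended twice.
    """
    for word in create_word_list(line):
        if word not in stop_word_dict:
            if word in word_concordance_dict:
                word_concordance_dict[word].append(line_counter)
            else:
                word_concordance_dict[word] = [line_counter]


def build_concordance(data_path, stop_path):
    """Build the concordance dictionary, counting blank lines too."""
    stop_word_dict = read_stop_words(stop_path)
    word_concordance_dict = {}
    with open(data_path) as data_file:
        for line_counter, line in enumerate(data_file, start=1):
            process_line(line_counter, line,
                         word_concordance_dict, stop_word_dict)
    return word_concordance_dict


def write_concordance(word_concordance_dict, out_path):
    """Write 'word: n1 n2 ...' lines in alphabetical order by word."""
    with open(out_path, "w") as out_file:
        for word in sorted(word_concordance_dict):
            numbers = " ".join(str(n) for n in word_concordance_dict[word])
            out_file.write(f"{word}: {numbers}\n")
```

With the sample files, a call such as write_concordance(build_concordance("hw6small.txt", "stop_words_small.txt"), "concordance.txt") would produce the concordance file; the output file name here is hypothetical. Keeping create_word_list and process_line separate from the file handling also makes it easy to test word recognition and line numbering on their own, as the closing note of the assignment suggests.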
