Currently working on the basics of Python with Zelle's book and doing some extra exercises. The question is based on chapter 11 and covers list application and dictionary basics.

Question: Write a Python program that reads a text from standard input and subsequently keeps track of all the bigrams in the text.

All bigrams with their frequency must be written to standard output, in order of frequency.

The dictionary you need has bigrams as its key, which in Python can be represented best as a tuple of the two words.

A bigram only counts if the words are on the same line.

The text has already been tokenized:
- Punctuation marks/special characters have already been separated from the words.
- Therefore the punctuation marks themselves will also be considered as words.
- Every line contains exactly 1 sentence.
- Capital letters are irrelevant.

example.txt:

This sentence contains 5 bigrams .
This sentence

Output:

The command 'cat example.txt | python3 bigrams.py' should show:

This sentence 2
sentence contains 1
contains 5 1
5 bigrams 1
bigrams . 1

What I have so far (likely to be completely wrong):

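import sys

def main():
    # Collect bigrams from every line in one list.
    bigrams_list = []
    for line in sys.stdin:
        words = line.split()
        # Pair each word with its successor; pairs never span two lines.
        for i in range(len(words) - 1):
            bigrams_list.append((words[i], words[i + 1]))

    # Count how often each bigram occurs.
    mydict = {}
    for bigram in bigrams_list:
        if bigram in mydict:
            mydict[bigram] = mydict[bigram] + 1
        else:
            mydict[bigram] = 1

    # Prints the raw dictionary, not yet ordered by frequency.
    print(mydict)

main()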

Can anyone help with this?
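
One possible approach, as a minimal sketch rather than a definitive answer: keep a single dictionary for the whole input, build the bigrams line by line so that no pair crosses a line break, and sort the counts before printing. The helper names count_bigrams and format_bigrams are hypothetical, chosen only for this illustration. The sketch also assumes that "capital letters are irrelevant" means the text should be lowercased, so its output is all lowercase; dropping the .lower() call keeps the original casing shown in the sample output.

```python
import sys


def count_bigrams(lines):
    """Count adjacent word pairs, never pairing words across lines."""
    counts = {}
    for line in lines:
        # Assumption: "capital letters are irrelevant" means case-folding.
        words = line.lower().split()
        # zip pairs each word with the word that follows it on the same line.
        for first, second in zip(words, words[1:]):
            bigram = (first, second)
            counts[bigram] = counts.get(bigram, 0) + 1
    return counts


def format_bigrams(counts):
    """Return the output lines, most frequent bigram first."""
    # sorted() is stable, so bigrams with equal counts keep the order in
    # which they were first seen (dicts preserve insertion order in 3.7+).
    ordered = sorted(counts.items(), key=lambda item: -item[1])
    return [f"{first} {second} {n}" for (first, second), n in ordered]


def main():
    # sys.stdin yields one line at a time, which matches the rule that a
    # bigram only counts within a single line.
    for out_line in format_bigrams(count_bigrams(sys.stdin)):
        print(out_line)
```

To use this as bigrams.py with the pipe from the question, add a call to main() at the bottom of the file. Splitting the counting and the formatting into separate functions is a design choice that makes each step easy to check on a small list of strings without piping a file.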
