Question: IN PYTHON
(Project: Visualizing Word Frequencies with a Word Cloud) A word cloud visualizes words, displaying more frequently occurring words in larger fonts. In this exercise, you'll create a word cloud that visualizes the top 200 words in Pride and Prejudice. You'll use the open-source wordcloud module's WordCloud class to generate a word cloud with just a few lines of code.
To install wordcloud, open your Anaconda Prompt (Windows), Terminal (macOS/Linux) or shell (Linux) and enter the command:
    conda install -c conda-forge wordcloud
You create and configure a WordCloud object as follows:
    from wordcloud import WordCloud
    wordcloud = WordCloud(colormap='prism', background_color='white')
Using the techniques from the previous exercise, create a frequencies dictionary containing the frequencies of the top-200 words in Pride and Prejudice. Then execute the following statements to generate a rectangular word cloud and save its image to a file on disk:
    wordcloud = wordcloud.fit_words(frequencies)
    wordcloud = wordcloud.to_file('PrideAndPrejudice.png')
You can then double-click the PrideAndPrejudice.png image file on your system to view it. In the Natural Language Processing chapter, we'll show you how to place your word clouds into shapes. For example, we placed our Romeo and Juliet word cloud into a heart.
Previous exercise:
def main():
    print("This program analyzes word frequency in a file")
    print("and prints a report on the n most frequent words. ")

    # get the sequence of words from the file
    fname = input("File to analyze: ")
    with open(fname, 'r') as f:
        text = f.read()
    text = text.lower()
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~':
        text = text.replace(ch, ' ')
    words = text.split()

    # create a set of stopwords from stopWordsEng.txt
    with open("stopWordsEng.txt", "r") as f:
        stopwords = set(f.read().split())

    # count words, skipping stopwords
    counts = {}
    for w in words:
        if w not in stopwords:
            counts[w] = counts.get(w, 0) + 1

    # sort by count, highest first, and print the top n
    n = int(input("Output analysis of how many words? "))
    items = list(counts.items())
    items.sort(key=lambda x: x[1], reverse=True)
    # slicing avoids an IndexError when n exceeds the number of distinct words
    for word, count in items[:n]:
        print("{}:\t{}".format(word, count))

main()
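One way to adapt the previous exercise to this one is to return the counts as a dictionary instead of printing them, then hand that dictionary to fit_words. The sketch below is only an illustration under a few assumptions: the helper names top_frequencies and save_word_cloud are made up for this example, collections.Counter is swapped in for the hand-written counting loop, and the WordCloud settings are the ones given in the exercise.

```python
from collections import Counter

# Same punctuation set as the previous exercise (the apostrophe is left alone).
PUNCTUATION = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~'


def top_frequencies(text, stopwords, n=200):
    """Return a dict mapping the n most frequent non-stop words to their counts."""
    text = text.lower()
    for ch in PUNCTUATION:
        text = text.replace(ch, ' ')
    counts = Counter(w for w in text.split() if w not in stopwords)
    # most_common(n) returns at most n (word, count) pairs, highest count first.
    return dict(counts.most_common(n))


def save_word_cloud(frequencies, path):
    """Render the frequencies with the exercise's WordCloud settings and save to path."""
    # Imported here so top_frequencies stays usable without wordcloud installed.
    from wordcloud import WordCloud
    wordcloud = WordCloud(colormap='prism', background_color='white')
    wordcloud.fit_words(frequencies)
    wordcloud.to_file(path)
```

To use it, read the novel's text from a file of your own (the file name is up to you), build the stopwords set from stopWordsEng.txt as in the previous exercise, and call save_word_cloud(top_frequencies(text, stopwords), 'PrideAndPrejudice.png').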
