Question:

For this project, you'll create a "word cloud" from a text by writing a script. The script needs to process the text, remove punctuation, ignore case, skip words that contain non-alphabetic characters, count word frequencies, and ignore uninteresting or irrelevant words. The output of the calculate_frequencies function is a dictionary. The wordcloud module will then generate the image from your dictionary.

For the input text of your script, you will need to provide a file that contains text only. For the text itself, you can copy and paste the contents of a website you like, or you can use a site like Project Gutenberg to find books that are available online. You could see what word clouds you get from famous books, like a Shakespeare play or a novel by Jane Austen. Save the text as a .txt file somewhere on your computer.

Now you will need to upload your input file here so that your script can process it. To do the upload, you will need an uploader widget. Run the following cell to perform all the installs and imports for your word cloud script and uploader widget. It may take a minute for all of this to run, and there will be a lot of output messages, so please be patient. Once the final line of output appears, the code has finished executing. Then you can continue with the rest of the instructions for this notebook.

# Here are all the installs and imports you will need for your word cloud script and uploader widget
!pip install wordcloud
!pip install fileupload
!pip install ipywidgets
!jupyter nbextension install --py --user fileupload
!jupyter nbextension enable --py fileupload

import wordcloud
import numpy as np
from matplotlib import pyplot as plt
from IPython.display import display
import fileupload
import io
import sys

# This is the uploader widget
def _upload():

    _upload_widget = fileupload.FileUploadWidget()

    def _cb(change):
        global file_contents
        decoded = io.StringIO(change['owner'].data.decode('utf-8'))
        filename = change['owner'].filename
        print('Uploaded `{}` ({:.2f} kB)'.format(
            filename, len(decoded.read()) / 2 ** 10))
        file_contents = decoded.getvalue()

    _upload_widget.observe(_cb, names='data')
    display(_upload_widget)

_upload()
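
If the uploader widget cannot be used (for example, in an environment where the jupyter nbextension commands are not supported), one possible workaround is to read the saved .txt file directly from disk. The sketch below is an assumption and not part of the original notebook: read_text_file is a hypothetical helper and my_book.txt is a placeholder file name. The result would be stored in the same file_contents variable that the rest of the script expects.

```python
from pathlib import Path


def read_text_file(path, encoding="utf-8"):
    """Return the full contents of a text file as one string.

    Hypothetical helper, not part of the original notebook.
    """
    return Path(path).read_text(encoding=encoding)


# Example usage (the file name is a placeholder; substitute your own):
# file_contents = read_text_file("my_book.txt")
```

With file_contents set this way, the remaining cells can run without the widget.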

def calculate_frequencies(file_contents):
    # Here is a list of punctuations and uninteresting words you can use to process your text
    punctuations = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''
    uninteresting_words = ["the", "a", "to", "if", "is", "it", "of", "and", "or", "an", "as", "i", "me", "my",
        "we", "our", "ours", "you", "your", "yours", "he", "she", "him", "his", "her", "hers", "its", "they", "them",
        "their", "what", "which", "who", "whom", "this", "that", "am", "are", "was", "were", "be", "been", "being",
        "have", "has", "had", "do", "does", "did", "but", "at", "by", "with", "from", "here", "when", "where", "how",
        "all", "any", "both", "each", "few", "more", "some", "such", "no", "nor", "too", "very", "can", "will", "just"]

    # LEARNER CODE START HERE

    # wordcloud
    cloud = wordcloud.WordCloud()
    cloud.generate_from_frequencies()
    return cloud.to_array()

# Display your wordcloud image
myimage = calculate_frequencies(file_contents)
plt.imshow(myimage, interpolation='nearest')
plt.axis('off')
plt.show()
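
The learner section of calculate_frequencies is left empty in the question, and generate_from_frequencies is called without the dictionary it needs. One possible way to build that dictionary is sketched below. This is a minimal illustration under stated assumptions, not the course's reference solution: count_words is a hypothetical helper, words are split on whitespace, and the sample text and shortened lists are made up for the demonstration.

```python
def count_words(text, punctuations, uninteresting_words):
    """Return a dict mapping each interesting word to its count.

    Hypothetical helper: one way to fill in the learner section.
    """
    skip = set(uninteresting_words)
    frequencies = {}
    for raw_word in text.split():
        # Strip punctuation characters and ignore case.
        word = "".join(ch for ch in raw_word if ch not in punctuations).lower()
        # Keep only purely alphabetic words that are not in the skip list.
        if word.isalpha() and word not in skip:
            frequencies[word] = frequencies.get(word, 0) + 1
    return frequencies


# Made-up sample text with shortened punctuation and skip lists.
sample = "The cat and the hat. A cat, a hat: 2 cats!"
print(count_words(sample, ".,:!", ["the", "a", "and"]))
# prints {'cat': 2, 'hat': 2, 'cats': 1}
```

Inside calculate_frequencies, the same loop could use the punctuations and uninteresting_words already defined there, and the resulting dictionary could then be passed as cloud.generate_from_frequencies(frequencies) before returning cloud.to_array(). Note that stripping the hyphen joins hyphenated words (well-known becomes wellknown), which may or may not be what you want.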
