Question:
1. Read the data (all the files in the data directory) into an RDD using the function textFile.
2. Take only the text part of each file and count the frequency of all the words (convert the text to lowercase).
3. Remove (filter out) any word whose frequency is less than
4. Report the following:
   a. The total size (the word count) of the output data after filtering.
   b. The five most frequent words in all files.
   c. The word with the maximum frequency for each file individually.
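
The steps in the question could be sketched in PySpark roughly as follows. This is a minimal sketch under several assumptions, not a definitive solution: `tokenize`, `word_report`, `data_dir` and `min_freq` are hypothetical names introduced here; a "word" is assumed to be a run of the letters a to z; the whole content of each file is treated as its text part, since the file layout is not specified; and the frequency threshold is left as a parameter because its value is not given. Because `textFile` does not keep file names, the per-file report falls back on `wholeTextFiles`, which yields (path, content) pairs.

```python
import re
from operator import add

WORD_RE = re.compile(r"[a-z]+")  # assumption: a word is a run of letters


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return WORD_RE.findall(text.lower())


def word_report(sc, data_dir, min_freq):
    """Build the three reports from the question.

    sc       -- an existing pyspark SparkContext supplied by the caller
    data_dir -- hypothetical path of the data directory
    min_freq -- hypothetical threshold; words with a lower count are dropped
    """
    # textFile accepts a directory and reads every file in it, line by line.
    lines = sc.textFile(data_dir)

    # Classic word count: (word, 1) pairs summed per word.
    counts = lines.flatMap(tokenize).map(lambda w: (w, 1)).reduceByKey(add)

    # Keep only words that occur at least min_freq times.
    kept = counts.filter(lambda kv: kv[1] >= min_freq).cache()

    # "Total size" is ambiguous, so both readings are computed.
    distinct_words = kept.count()             # number of distinct words kept
    total_occurrences = kept.values().sum()   # sum of their frequencies

    # Five most frequent words across all files (descending by count).
    top_five = kept.takeOrdered(5, key=lambda kv: -kv[1])

    # Per-file maximum: count ((path, word), n), then keep the best word per path.
    per_file_top = (
        sc.wholeTextFiles(data_dir)
        .flatMap(lambda pc: [((pc[0], w), 1) for w in tokenize(pc[1])])
        .reduceByKey(add)
        .map(lambda kv: (kv[0][0], (kv[0][1], kv[1])))
        .reduceByKey(lambda a, b: a if a[1] >= b[1] else b)
        .collect()
    )
    return distinct_words, total_occurrences, top_five, per_file_top
```

With a live context, for example one obtained from `SparkContext.getOrCreate()`, a call such as `word_report(sc, "data", 5)` would return the four results as a tuple; the path `"data"` and the threshold `5` are placeholders only. Ties for the most frequent word in a file are broken arbitrarily in this sketch.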
