Question:
a. Use the function system.time to measure how long it takes to remove the stop words using the method from the slides. Run the line of code 100 times to get a good estimate. Provide the time it takes to run these 100 replicates, in seconds. The code is as follows:
system.time(for(i in 1:100)
unlist(x)[!(unlist(x) %in% stopwords('en'))])
b. Working from the method from the slides for removing stop words, modify the lines of code so that the stop words are not removed before calling table. Instead, write code so that the stop words are removed from the table (an R list) after computing the counts of each word. Provide the code.
c. Time this new code, and provide the time (in seconds) required to run this new code 100 times. Which method is faster?
The method of removing the stop words from the slides:
Thank you.
In NLP, words that do not convey semantics are known as stop words: "the, a, in, ..."
Remove stop words:
x = strsplit(tolower(s), '\\s+')
x = unlist(x)[!(unlist(x) %in% stopwords('en'))]
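One possible way to compare the slide method with the table-first variant asked for in part (b) is sketched below. It is not the course's reference answer. The sample string s and the short stop_words vector are hypothetical stand-ins (the question does not show s, and stopwords('en') presumably comes from the tm package), chosen so the sketch runs without extra packages.

```r
# Hypothetical sample text; the question does not show the original string s.
s <- "the cat sat on the mat and the dog sat in the sun"

# Stand-in for stopwords('en') so the sketch runs without extra packages.
# With the real list available, use: stop_words <- stopwords('en')
stop_words <- c("the", "a", "in", "on", "and")

x <- strsplit(tolower(s), "\\s+")
words <- unlist(x)

# Slide method: drop stop words from the word vector, then count.
counts_slides <- table(words[!(words %in% stop_words)])

# Table-first variant: count every word, then drop stop words by name.
counts_all <- table(words)
counts_table_first <- counts_all[!(names(counts_all) %in% stop_words)]

# Time 100 replicates of each; "elapsed" is wall-clock time in seconds.
t_slides <- system.time(for (i in 1:100)
  table(words[!(words %in% stop_words)]))
t_table_first <- system.time(for (i in 1:100) {
  counts_all <- table(words)
  counts_all[!(names(counts_all) %in% stop_words)]
})
print(t_slides[["elapsed"]])
print(t_table_first[["elapsed"]])
```

Both variants should give the same counts. In the table-first variant, %in% runs over the distinct words rather than every token, but table has to process the full vector, so which one is faster likely depends on the text and should be measured on the actual data.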
