Question: gastext.csv using RStudio / Rscript

gastext.csv using RStudio/Rscript. A fuel company has 250+ gas stations in the US. It captures customers' comments via phone, which are merged with numeric variables by matching them on the company's loyalty card number. All data were provided in the gastext.csv data file. Some of the text comments, variable names, and descriptions were disguised to protect the identity of the client company.
The target variable is identified by the column name.
Cust_ID and Loyal_Status are nominal variables, and all other variables are binary.
The Comment column contains the text information.
Variable and model naming requirements:
Please include your initials in the data frame names as well as the model names in your R coding.
For instance, in my coding, I would name the data frames dfKZ, dfKZ.train, and dfKZ.valid. I would also name the models treeKZ, etc.
Canvas submission. You need to submit two separate documents via the Canvas HW1 submission link:
Word document: please provide your answers in the Word document, and copy/paste your R codes at the end of the document.
R coding file.
Grading Criteria:
5 pts for minor errors
10 pts for major errors
Questions:
Provide the word cloud after all necessary pre-processing. (15 pts)
What are the top 5 terms that are most related to "price"? Please specify your similarity measurement method and detailed results. (10 pts)
What are the top 5 terms that are most related to "service"? Please specify your similarity measurement method and detailed results. (10 pts)
Perform topic modeling with 4 topics.
Further remove some common words, such as "shower" & "point".
You might encounter an issue with all-zero rows, in which case you need to remove those rows. Here are some sample codes for your reference.
Provide the term/beta plots for the four topics. (10 pts)
Please summarize those four topics based on your best effort. (10 pts)
Please run two decision tree models
Do we need to remove any column from predictive modeling? (5 pts)
Model 1 only uses non-text information (i.e., all other columns except the Comment column)
Please show the tree plot. (10 pts)
Model 2 combines both non-text and text information
Text mine the Comment column
Apply SVD to extract text information from the Comment column
Keep the number of SVD dimensions as 8
Combine the 8 SVD dimensions with all other columns except the Comment column
Please show the tree plot. (20 pts)
Please compare the performance of the two models based on the confusion matrix of the validation dataset. (5 pts)
Please copy and paste your R codes in your WORD submission. (5 pts)
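The similarity and SVD questions above can be illustrated with a minimal base-R sketch. It is not the assignment's solution: the five comments are made-up stand-ins for the Comment column, the tiny stop-word list is arbitrary, cosine similarity is only one possible similarity measure, and k is set to 2 because the toy matrix is too small for the 8 dimensions the assignment asks for.

```r
# Toy comments standing in for the Comment column (hypothetical data).
comments <- c(
  "price too high and service slow",
  "great price and cheap fuel",
  "friendly service and clean store",
  "cheap price great fuel",
  "slow service dirty store"
)

# Tokenize, drop a tiny illustrative stop-word list, and build the vocabulary.
tokens <- strsplit(tolower(comments), "[^a-z]+")
vocab  <- setdiff(sort(unique(unlist(tokens))), c("and", "too"))

# Document-term matrix: rows = comments, columns = terms.
dtm <- t(sapply(tokens, function(tok) table(factor(tok, levels = vocab))))

# Remove all-zero rows (comments left empty after pre-processing).
dtm <- dtm[rowSums(dtm) > 0, , drop = FALSE]

# Cosine similarity between one term's column and every other term's column.
cosine_to <- function(term, m) {
  target <- m[, term]
  sims <- apply(m, 2, function(col) {
    sum(col * target) / sqrt(sum(col^2) * sum(target^2))
  })
  sort(sims[names(sims) != term], decreasing = TRUE)
}

top_price <- head(cosine_to("price", dtm), 5)
print(top_price)

# Truncated SVD: keep the first k dimensions as per-document text features.
k  <- 2  # the assignment asks for 8; 2 is used only because the toy data is tiny
sv <- svd(dtm)
svd_features <- sv$u[, 1:k] %*% diag(sv$d[1:k])
print(dim(svd_features))
```

On the real data, the document-term matrix would normally come from a text-mining package after stemming and stop-word removal, and the resulting SVD columns could be bound to the non-text columns with cbind() before fitting the second tree.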
Hints:
Sample code to convert multiple columns into factors:
df[, 3:13] <- lapply(df[, 3:13], factor)
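As a rough illustration of how the factor-conversion hint feeds into a decision tree and a validation confusion matrix, here is a small sketch on simulated data. Everything in it is an assumption made for illustration: the columns Flag_A, Flag_B, and Target are invented, "XX" stands in for your initials, the 60/40 split is arbitrary, and dropping Cust_ID reflects the usual practice of excluding identifiers rather than a confirmed answer to the question above. It uses rpart, which ships with standard R installations as a recommended package.

```r
library(rpart)

set.seed(1)
n <- 200

# Simulated binary columns; the target mostly follows Flag_A (hypothetical data).
flag_a <- rbinom(n, 1, 0.5)
flag_b <- rbinom(n, 1, 0.5)
target <- ifelse(runif(n) < 0.9, flag_a, 1 - flag_a)

dfXX <- data.frame(Cust_ID = seq_len(n), Target = target,
                   Flag_A = flag_a, Flag_B = flag_b)

# Convert the binary columns to factors, following the hint.
dfXX[, 2:4] <- lapply(dfXX[, 2:4], factor)

# Assumed here: the customer identifier is not used as a predictor.
dfXX$Cust_ID <- NULL

# Split into training and validation sets (60/40, an arbitrary choice).
train_idx  <- sample(n, size = 0.6 * n)
dfXX.train <- dfXX[train_idx, ]
dfXX.valid <- dfXX[-train_idx, ]

# Fit a classification tree and score the validation set.
treeXX <- rpart(Target ~ ., data = dfXX.train, method = "class")
pred   <- predict(treeXX, newdata = dfXX.valid, type = "class")

# Confusion matrix and overall accuracy on the validation data.
conf_mat <- table(Predicted = pred, Actual = dfXX.valid$Target)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

A basic tree plot can be drawn with plot(treeXX) followed by text(treeXX). Comparing two models then amounts to building the same confusion matrix for each on the same validation rows.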