Question: In this problem, you will use the data and scenario described in this chapter’s example, in which the task is to develop a model to classify documents as either auto-related or electronics-related.

a. Using the process shown in Figure 21.6, store the data as an ExampleSet. Then, load the data in a new process and create a label vector.

b. Following the example in this chapter, preprocess the documents. Explain what would be different if you did not perform the “stemming” step.

c. Use the LSA to create 10 concepts. Explain what is different about the concept matrix, as opposed to the TF-IDF matrix.

d. Using this matrix, fit a predictive model (different from the model presented in the chapter illustration) to classify documents as autos or electronics. Compare its performance with that of the model presented in the chapter illustration.
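
As a rough illustration of the ideas behind parts b and c, the following Python sketch builds a TF-IDF matrix with and without stemming and then reduces it to a small number of LSA concepts. It is not the chapter's RapidMiner process. The four-document corpus, the `crude_stem` suffix stripper, the particular IDF formula, and the choice of 2 concepts (instead of the 10 the exercise asks for) are all assumptions made to keep the example small.

```python
import numpy as np

# Hypothetical toy corpus standing in for the auto/electronics documents.
docs = [
    "engine repairs and car engines",
    "repairing the car brakes",
    "circuit boards and circuits",
    "soldering a circuit board",
]


def crude_stem(token):
    """Illustrative suffix stripper; a stand-in for a real stemmer such as Porter."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tfidf_matrix(token_lists):
    """Return a documents-by-terms TF-IDF matrix and its sorted vocabulary.

    Uses raw counts times (log(N / df) + 1), one common IDF variant (an assumption).
    """
    vocab = sorted({t for tokens in token_lists for t in tokens})
    index = {t: j for j, t in enumerate(vocab)}
    counts = np.zeros((len(token_lists), len(vocab)))
    for i, tokens in enumerate(token_lists):
        for t in tokens:
            counts[i, index[t]] += 1
    doc_freq = (counts > 0).sum(axis=0)
    idf = np.log(len(token_lists) / doc_freq) + 1.0
    return counts * idf, vocab


def lsa(matrix, k):
    """Project documents onto the top-k singular directions (LSA concepts)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


raw_tokens = [d.split() for d in docs]
stem_tokens = [[crude_stem(t) for t in tokens] for tokens in raw_tokens]

raw_tfidf, raw_vocab = tfidf_matrix(raw_tokens)
stem_tfidf, stem_vocab = tfidf_matrix(stem_tokens)

# Without stemming, inflected forms ("engine"/"engines") get separate columns.
print(len(raw_vocab), len(stem_vocab))

# The concept matrix has one dense column per concept instead of one per term.
concepts = lsa(stem_tfidf, k=2)
print(stem_tfidf.shape, concepts.shape)
```

In this sketch the unstemmed vocabulary is larger because variants of the same word occupy separate columns, and the concept matrix is dense with far fewer columns than the sparse TF-IDF matrix. Each concept column is a weighted combination of terms rather than a single term. For part d, the rows of a concept matrix like `concepts` would serve as the predictors for whichever classifier is chosen, with the label vector from part a as the target.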
