Question:

Assume you are given a data set in the form of an n × m term-by-document matrix X corresponding to a large collection of news articles. Precisely, the (i, j) entry of X is the frequency of word i in document j. We would like to visualize this data set on a two-dimensional plot. Explain how you would do the following (describe your steps carefully in terms of the SVD of an appropriately centered version of X).

1. Plot the different news sources as points in word space, with maximal variance of the points.

2. Plot the different words as points in news-source space, with maximal variance of the points.
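
One possible way to approach both parts is sketched below in Python with NumPy. This is only an illustration under stated assumptions, not the expert answer referenced on this page: the helper name `top2_projection` and the random toy matrix are hypothetical, and the sketch assumes n and m are both at least 2. For part 1 the columns of X (documents) are treated as points and the average column is subtracted; for part 2 the rows of X (words) are treated as points and the average row is subtracted. In each case the SVD of the centered matrix gives the two directions of largest variance, and the plotted coordinates are the projections onto those directions.

```python
import numpy as np

def top2_projection(points):
    """Project data points (one per row) onto their two directions of maximal variance.

    Hypothetical helper for illustration. Returns:
      coords     - (num_points, 2) coordinates to plot
      directions - (2, dim) orthonormal directions (top two right singular vectors)
      s          - singular values of the centered matrix
    """
    # Center: subtract the average point so the cloud has zero mean.
    centered = points - points.mean(axis=0, keepdims=True)
    # Thin SVD: centered = U @ diag(s) @ Vt, singular values in decreasing order.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # Projection onto the top two directions: centered @ Vt[:2].T == U[:, :2] * s[:2].
    coords = U[:, :2] * s[:2]
    return coords, Vt[:2], s

# Toy term-by-document matrix (assumption: 50 words, 8 documents, random counts).
rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(50, 8)).astype(float)

# Part 1: documents as points in word space (columns of X are the points).
doc_coords, word_space_dirs, _ = top2_projection(X.T)   # shape (m, 2)

# Part 2: words as points in document space (rows of X are the points).
word_coords, doc_space_dirs, _ = top2_projection(X)     # shape (n, 2)
```

Each row of `doc_coords` or `word_coords` can then be drawn as one point on a scatter plot. The first axis carries the largest variance and the second the next largest, since the singular values are returned in decreasing order.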

Step by Step Solution


Denote by d the jth column of A, which corresponds to a particular news source. Thus A ...
