Question:

Run a word embedding visualization tool such as projector.tensorflow.org/ and try to get a feel for how the embeddings work: what words are near each other? Are there surprises for unrelated words that are near or related words that are far apart? Report on your findings. Consider: 

a. Common concrete nouns like “people” or “year” or “dog.” 

b. The most common words, which carry little meaning: “the,” “be,” “of,” “to,” etc. 

c. Abstract concepts like “configuration” or “topological.” 

d. Words that are part of sequential series such as numbers, months, presidents, days of the week (is “Monday” closer to “Sunday” or “Tuesday” or “Friday”?). 

e. Words that are part of unordered groups such as “France” or “Maine” or “lion.” 

f. Ambiguous words (is “Turkey” grouped with countries or birds?). 

g. UMAP versus t-SNE versus PCA visualizations. 

h. What words are at the periphery of the embedding? Near the center? Is that significant? Try changing the principal components that are mapped to each of the XYZ axes in PCA and see what difference that makes. 

i. What else can you explore?
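The explorations above can also be scripted offline. The sketch below is a minimal, hypothetical illustration: it uses a tiny made-up embedding table (not real word2vec/GloVe weights, which a tool like projector.tensorflow.org loads for you) to show the two core operations the projector performs — ranking nearest neighbors by cosine similarity, and projecting high-dimensional vectors onto principal components for display.

```python
import numpy as np

# Hypothetical toy embeddings for illustration only -- a real experiment
# would load pretrained vectors (e.g. GloVe or word2vec) instead.
emb = {
    "monday":  np.array([0.9, 0.1, 0.0, 0.2]),
    "tuesday": np.array([0.8, 0.2, 0.1, 0.2]),
    "sunday":  np.array([0.7, 0.0, 0.3, 0.1]),
    "france":  np.array([0.0, 0.9, 0.8, 0.1]),
    "turkey":  np.array([0.1, 0.8, 0.7, 0.6]),  # made-up vector; real data mixes country/bird senses
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest(word, k=2):
    """Rank the other words by cosine similarity to `word`."""
    sims = [(w, cosine(emb[word], v)) for w, v in emb.items() if w != word]
    return sorted(sims, key=lambda t: -t[1])[:k]

def pca_project(vectors, n_components=2):
    """Project row vectors onto their top principal components via SVD
    (changing which components map to which axis changes the picture,
    as in part h of the question)."""
    X = vectors - vectors.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T

words = list(emb)
coords = pca_project(np.stack([emb[w] for w in words]))
```

With these toy vectors, `nearest("monday")` ranks "tuesday" above "sunday", and the 2-D `coords` array is what a scatter-plot visualization would draw; swapping `Vt[:n_components]` for other component rows reproduces the axis-remapping experiment in part h.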
