Question:
Show your work. Include any code snippets you used to generate an answer, using comments in the code to clearly indicate which problem corresponds to which code.

Consider the following data matrix:

      X1      X2   X3
x1    red     yes  North
x2    blue    no   South
x3    yellow  no   East
x4    yellow  no   West
x5    red     yes  North
x6    yellow  yes  North
x7    blue    no   West

1. [4 points] Use one-hot encoding to transform all the categorical attributes to numerical values. Write down the transformed data matrix. Call this new matrix Y.
2. [2 points] What is the Euclidean distance between data instance x2 (second row) and data instance x7 (seventh row) after applying one-hot encoding?
3. [2 points] What is the cosine similarity (cosine of the angle) between data instance x2 and data instance x7 after applying one-hot encoding?
4. [2 points] What is the Hamming distance between data instance x2 and data instance x7?
5. [2 points] What is the Jaccard coefficient between data instance x2 and data instance x7 after applying one-hot encoding?
6. [2 points] What is the (multivariate) mean of Y?
7. [2 points] What is the sample variance of the first column of Y (using the matrix written in the answer to (1))?
8. [4 points] Write down the resulting matrix after applying standard (z-score) normalization to the matrix Y. Call this matrix Z.
9. [2 points] What is the (multivariate) mean of Z?
10. [2 points] Let z_i be the i-th row of Z. What is the Euclidean distance between z2 and z7?
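Since the question asks for code alongside the answers, the following is a minimal sketch of how the ten parts might be computed in plain Python. Several choices here are assumptions, not given in the question: one-hot columns are ordered by first appearance of each value, the Hamming distance in problem 4 is taken on the original categorical attributes (the question does not say "after one-hot encoding" there), and z-score normalization uses the sample standard deviation (dividing by n - 1) to match problem 7. The pair of rows compared in problems 3, 4, and 10 is read as the second and seventh rows, by analogy with problems 2 and 5. All function names are illustrative.

```python
import math

# Data matrix from the question: rows x1..x7, three categorical attributes.
X = [
    ("red", "yes", "North"),
    ("blue", "no", "South"),
    ("yellow", "no", "East"),
    ("yellow", "no", "West"),
    ("red", "yes", "North"),
    ("yellow", "yes", "North"),
    ("blue", "no", "West"),
]

def one_hot(rows):
    # Problem 1: one 0/1 column per (attribute, value) pair.
    # Assumption: values are ordered by first appearance within each attribute.
    columns = []
    for j in range(len(rows[0])):
        seen = []
        for row in rows:
            if row[j] not in seen:
                seen.append(row[j])
        columns.extend((j, v) for v in seen)
    encoded = [[1 if row[j] == v else 0 for j, v in columns] for row in rows]
    return encoded, columns

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)

def hamming(a, b):
    # Number of positions at which the two instances differ.
    return sum(p != q for p, q in zip(a, b))

def jaccard(a, b):
    # Shared 1s divided by positions where at least one vector has a 1.
    both = sum(1 for p, q in zip(a, b) if p == 1 and q == 1)
    either = sum(1 for p, q in zip(a, b) if p == 1 or q == 1)
    return both / either

def column_means(M):
    return [sum(col) / len(M) for col in zip(*M)]

def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def zscore(M):
    # Assumption: sample standard deviation (n - 1 in the denominator).
    means = column_means(M)
    stds = [math.sqrt(sample_variance(col)) for col in zip(*M)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in M]

Y, columns = one_hot(X)                            # Problem 1
d_euclid = euclidean(Y[1], Y[6])                   # Problem 2
cos_sim = cosine(Y[1], Y[6])                       # Problem 3
d_hamming = hamming(X[1], X[6])                    # Problem 4 (categorical rows)
jac = jaccard(Y[1], Y[6])                          # Problem 5
mean_Y = column_means(Y)                           # Problem 6
var_col1 = sample_variance([r[0] for r in Y])      # Problem 7
Z = zscore(Y)                                      # Problem 8
mean_Z = column_means(Z)                           # Problem 9
d_z = euclidean(Z[1], Z[6])                        # Problem 10

for name, value in [("Euclidean", d_euclid), ("cosine", cos_sim),
                    ("Hamming", d_hamming), ("Jaccard", jac),
                    ("var(col 1)", var_col1), ("dist(z2, z7)", d_z)]:
    print(name, value)
```

If the Hamming distance is instead intended on the one-hot vectors, calling hamming(Y[1], Y[6]) gives that variant. Using the population standard deviation in zscore would change Z and the distance in problem 10, but not the mean of Z.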
