Question:

(a) Let X be an (n × p) data matrix in which each row corresponds to a p-variate measurement on one of n individuals. Assuming that the p variates are continuous variables, describe three possible measures of dissimilarity of pairs of individuals. Comment on their relative advantages and disadvantages. [3]

(b) What four properties must be satisfied for a dissimilarity function to be a metric dissimilarity coefficient? [2]

The values of four binary variables are measured for each of four individuals as follows:

              Variable
Individual    1    2    3    4
    1         1    1    1    0
    2         0    0    1    1
    3         1    1    1    1
    4         0    1    0    1

Construct a dissimilarity matrix for the four individuals using (i) the simple matching coefficient and (ii) Jaccard's coefficient. [4]

If S_rt denotes the simple matching coefficient, show that d_rt = 1 - S_rt is a metric dissimilarity coefficient. [4]

(c) Five subjects were each given three psychological tests. The scores for each subject on each test were recorded and the Euclidean distances between each pair of subjects were calculated as follows:

Subject    A      B      C      D      E
   A       0      -      -      -      -
   B       4.2    0      -      -      -
   C       5.9    7.6    0      -      -
   D       1.2    7.0    10.3   0      -
   E       6.1    2.6    5.4    7.8    0

Using single-link clustering, cluster the five subjects. Sketch the dendrogram and interpret the results. [4]

How would your dendrogram change if you used a complete-link clustering algorithm? [3]
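The calculations asked for in parts (b) and (c) can be checked with a short script. What follows is only a sketch, not the expert's hidden solution. It assumes that the rows of the binary table are individuals and the columns are variables, and that each dissimilarity is taken as one minus the corresponding similarity coefficient. The function names (simple_matching_dissimilarity, jaccard_dissimilarity, agglomerate) are made up for illustration.

```python
from itertools import combinations

# Binary data as read from the question's table.
# ASSUMPTION: rows are individuals, columns are variables.
binary = {
    1: (1, 1, 1, 0),
    2: (0, 0, 1, 1),
    3: (1, 1, 1, 1),
    4: (0, 1, 0, 1),
}

def simple_matching_dissimilarity(x, y):
    """1 - S, where S is the proportion of variables on which x and y agree."""
    matches = sum(a == b for a, b in zip(x, y))
    return 1 - matches / len(x)

def jaccard_dissimilarity(x, y):
    """1 - (joint presences) / (variables present in at least one individual)."""
    both = sum(a == 1 and b == 1 for a, b in zip(x, y))
    either = sum(a == 1 or b == 1 for a, b in zip(x, y))
    return 1 - both / either if either else 0.0

# Euclidean distances between subjects, copied from part (c).
pairs = {
    ("A", "B"): 4.2, ("A", "C"): 5.9, ("B", "C"): 7.6, ("A", "D"): 1.2,
    ("B", "D"): 7.0, ("C", "D"): 10.3, ("A", "E"): 6.1, ("B", "E"): 2.6,
    ("C", "E"): 5.4, ("D", "E"): 7.8,
}
dist = {frozenset(k): v for k, v in pairs.items()}

def agglomerate(dist, labels, linkage=min):
    """Naive agglomerative clustering.

    linkage=min gives single link, linkage=max gives complete link.
    Returns a list of (merged cluster members, merge height).
    """
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            d = linkage(dist[frozenset((a, b))] for a in c1 for b in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1 | c2), d))
    return merges

if __name__ == "__main__":
    for r, t in combinations(binary, 2):
        sm = simple_matching_dissimilarity(binary[r], binary[t])
        jc = jaccard_dissimilarity(binary[r], binary[t])
        print(f"individuals {r},{t}: simple matching {sm:.3f}, Jaccard {jc:.3f}")
    print("single link:  ", agglomerate(dist, "ABCDE", min))
    print("complete link:", agglomerate(dist, "ABCDE", max))
```

Under these assumptions both linkages first join A with D at 1.2 and then B with E at 2.6. They differ after that: single link joins {A, D} with {B, E} at 4.2 and adds C last at 5.4, whereas complete link joins C to {B, E} at 7.6 and merges everything at 10.3. The merge heights are the levels at which the dendrogram branches would be drawn.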
