Question: Consider two samples of data:

$$X_1 = \begin{bmatrix} 3 & 7 \\ 2 & 4 \\ 4 & 7 \end{bmatrix}, \qquad X_2 = \begin{bmatrix} 6 & 9 \\ 5 & 7 \\ 4 & 8 \end{bmatrix}$$

Find the discriminant function and use it (according to page 543) to classify a new data point $x_{new} = [2 \;\; 7]'$.

Page 543 below (the excerpt starts on page 542 of A Course in Statistics with R):

15.2.1 Discrimination Analysis

Suppose that there are two groups characterized by two multivariate normal distributions: $N_p(\mu_1, \Sigma)$ and $N_p(\mu_2, \Sigma)$. It is assumed that the variance-covariance matrix $\Sigma$ is the same for both the groups. Assume that we have $n_1$ observations $x_{11}, x_{12}, \ldots, x_{1n_1}$ from $N_p(\mu_1, \Sigma)$ and $n_2$ observations $x_{21}, x_{22}, \ldots, x_{2n_2}$ from $N_p(\mu_2, \Sigma)$. The discriminant function is a linear combination of the $p$ variables, which will maximize the distance between the two groups' mean vectors. Thus, we are seeking a vector $a$ which achieves the required objective. As a first step, the $n_1 + n_2$ vectors are transformed to $n_1 + n_2$ scalars through $a$ as below:

$$z_{1i} = a' x_{1i}, \; i = 1, \ldots, n_1, \qquad z_{2i} = a' x_{2i}, \; i = 1, \ldots, n_2. \tag{15.1}$$

Define the means of the transformed scalars and the pooled variance as below:

$$\bar{z}_1 = \frac{\sum_{i=1}^{n_1} a' x_{1i}}{n_1} = a' \bar{x}_1, \qquad \bar{z}_2 = \frac{\sum_{i=1}^{n_2} a' x_{2i}}{n_2} = a' \bar{x}_2, \qquad S_{pl} = \frac{(n_1 - 1) S_1 + (n_2 - 1) S_2}{n_1 + n_2 - 2}, \tag{15.2}$$

where $S_{pl}^{-1}$ exists iff $n_1 + n_2 - 2 > p$.

Since the goal is to find that $a$ which maximizes the distance between the group means, the problem is to maximize the squared distance:

$$\frac{\{a'(\bar{x}_1 - \bar{x}_2)\}^2}{a' S_{pl} a}. \tag{15.3}$$

The maximum of the squared distance occurs at $a$ given by

$$a = S_{pl}^{-1} (\bar{x}_1 - \bar{x}_2). \tag{15.4}$$

An illustration of the discriminant analysis steps is done through the next example.

Example 15.2.1. Discriminant Function for the "setosa" species in Iris Data. Suppose that based on the four variables of sepal length and width, and petal length and width, we need to find $a$ which will maximize the distance between the two groups: "setosa" and "not a setosa" species. The formulas are clearly illustrated in the following R program.

    > data(iris)
    > x1bar <- colMeans(iris[iris$Species == "setosa", 1:4])
    > x2bar <- colMeans(iris[iris$Species != "setosa", 1:4])
    > table(iris$Species)
        setosa versicolor  virginica
            50         50         50
    > S_pl <- (49 * var(iris[iris$Species == "setosa", 1:4]) +
    +          99 * var(iris[iris$Species != "setosa", 1:4])) / 148
    > x1bar; x2bar; S_pl
    Sepal.Length  Sepal.Width Petal.Length  Petal.Width
           5.006        3.428        1.462        0.246
    Sepal.Length  Sepal.Width Petal.Length  Petal.Width
           6.262        2.872        4.906        1.676
                 Sepal.Length Sepal.Width Petal.Length Petal.Width
    Sepal.Length    0.3350257  0.11456216   0.30867703  0.11523649
    Sepal.Width     0.1145622  0.12163784   0.09939189  0.05661081
    Petal.Length    0.3086770  0.09939189   0.46590676  0.19514730
    Petal.Width     0.1152365  0.05661081   0.19514730  0.12436892
    > solve(S_pl) %*% (x1bar - x2bar)
                       [,1]
    Sepal.Length   3.186486
    Sepal.Width   11.719430
    Petal.Length -10.841575
    Petal.Width   -2.773537

Thus, the discriminant function is given by $z = 3.186486 x_1 + 11.719430 x_2 - 10.841575 x_3 - 2.773537 x_4$.

The use of the discriminant function for classification is considered next.

15.2.2 Classification

Let $x_{new}$ be a new vector of observation. The goal is to classify it into one of the groups by using the discriminant function. The simple, and fairly obvious, technique is to first obtain the discriminant score by

$$z_{new} = a' x_{new} = (\bar{x}_1 - \bar{x}_2)' S_{pl}^{-1} x_{new}.$$

Next, classify $x_{new}$ to group 1 or 2 according as $z_{new}$ is closer to $\bar{z}_1$ or $\bar{z}_2$. A simple illustration is done next.

Example 15.2.2. Classification for Iris Data. The above description is captured in the next R program. We simply verify if the original observations are correctly identified by the discriminant function or not.

    > a <- solve(S_pl) %*% (x1bar - x2bar)
    > z1bar <- sum(a * x1bar); z2bar <- sum(a * x2bar)
    > pred_gr <- NULL
    > for (i in 1:150) {
    +   mynew <- sum(a * as.numeric(iris[i, 1:4]))
    +   pred_gr[i] <- ifelse(abs(mynew - z1bar) > abs(mynew - z2bar),
    +                        "not setosa", "setosa")
    + }
    > pred_gr
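The procedure in the excerpt can be applied to the question's two samples along the following lines. This is a minimal sketch, not the locked expert answer: it assumes the garbled matrices read as X1 = [3 7; 2 4; 4 7] and X2 = [6 9; 5 7; 4 8], that the new point is (2, 7), and that a tie in distance goes to group 1. The variable names are chosen for illustration.

```r
# Two-group linear discriminant: a = S_pl^{-1} (x1bar - x2bar).
# ASSUMPTION: the matrices below are a reading of the question's garbled data.
X1 <- matrix(c(3, 7,
               2, 4,
               4, 7), ncol = 2, byrow = TRUE)
X2 <- matrix(c(6, 9,
               5, 7,
               4, 8), ncol = 2, byrow = TRUE)
xnew <- c(2, 7)

n1 <- nrow(X1); n2 <- nrow(X2)
x1bar <- colMeans(X1)
x2bar <- colMeans(X2)

# Pooled covariance matrix, as in (15.2); here n1 + n2 - 2 = 4 > p = 2
S_pl <- ((n1 - 1) * cov(X1) + (n2 - 1) * cov(X2)) / (n1 + n2 - 2)

# Discriminant coefficients, as in (15.4)
a <- drop(solve(S_pl, x1bar - x2bar))

# Projected group means and the discriminant score of the new point
z1bar <- sum(a * x1bar)
z2bar <- sum(a * x2bar)
znew  <- sum(a * xnew)

# Assign to the group whose projected mean is closer (ties go to group 1)
group <- if (abs(znew - z1bar) <= abs(znew - z2bar)) 1 else 2

print(S_pl)
print(a)
cat("z1bar =", z1bar, " z2bar =", z2bar, " znew =", znew, " group =", group, "\n")
```

Under that reading of the data, the pooled matrix is [1 1; 1 2] and a = (-2, 0)', so the discriminant function is z = -2 x1. The projected means are -6 and -10, the new point scores -4, and it is therefore assigned to group 1.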
