Question:
To initialize the EM algorithm in Figure 10.8 (page 480) consider two alternatives:
(a) allow P to return a random distribution the first time through the loop;
(b) initialize cc and fc to random values.
By running the algorithm on some datasets, determine which, if any, of these alternatives is better in terms of log loss (page 276) of the training data, as a function of the number of loops through the dataset. Does it matter if cc and fc are not consistent with the semantics (counts that should be equal are not)?
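One possible way to set up the comparison is sketched below. It is not the book's code: it assumes the algorithm in Figure 10.8 is EM for a naive Bayes model with a hidden class, where cc[c] holds expected class counts, fc[i, v, c] holds expected feature counts, and P computes P(c | e) proportional to cc[c] times the product over features of fc[i, x_i, c] / cc[c]. The names make_data, joint and run_em, the synthetic binary dataset, and the choice of NumPy are all hypothetical. No smoothing is applied, so a class whose count collapses to zero would break the division.

```python
import numpy as np

def make_data(n=200, seed=1):
    """Hypothetical synthetic data: two hidden clusters over 6 binary features."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)
    probs = np.where(z[:, None] == 0, 0.9, 0.2)   # P(feature = 1) per cluster
    return (rng.random((n, 6)) < probs).astype(int)

def joint(data, cc, fc):
    """Unnormalised score cc[c] * prod_i fc[i, x_i, c] / cc[c] for each example."""
    m = data.shape[1]
    like = fc[np.arange(m), data, :] / cc         # shape (n, m, k)
    return cc * like.prod(axis=1)                 # shape (n, k)

def run_em(data, k, n_iter, init, seed=0):
    """Return the average log loss (bits per example) after each pass.

    init="random_P":      alternative (a), random P on the first pass only.
    init="random_counts": alternative (b), random cc and fc that are
                          deliberately not consistent with each other.
    """
    rng = np.random.default_rng(seed)
    n, m = data.shape
    if init == "random_counts":
        cc = rng.random(k) + 0.1
        fc = rng.random((m, 2, k)) + 0.1
    losses = []
    for it in range(n_iter):
        if init == "random_P" and it == 0:
            resp = rng.dirichlet(np.ones(k), size=n)      # random P(c | e)
        else:
            j = joint(data, cc, fc)
            resp = j / j.sum(axis=1, keepdims=True)       # P(c | e)
        # Re-estimate the counts; after this step they are consistent:
        # sum_v fc[i, v, c] == cc[c] and sum_c cc[c] == n.
        cc = resp.sum(axis=0)
        fc = np.stack([(data == v).T.astype(float) @ resp for v in (0, 1)],
                      axis=1)
        p_e = joint(data, cc, fc).sum(axis=1) / n         # P(e) under new counts
        losses.append(float(-np.mean(np.log2(p_e))))
    return losses

if __name__ == "__main__":
    data = make_data()
    for init in ("random_P", "random_counts"):
        print(init, [round(x, 4) for x in run_em(data, k=2, n_iter=8, init=init)])
```

In this sketch, inconsistent random counts in (b) only affect the first computation of P(c | e); the counts rebuilt from those probabilities are consistent again, so any effect on the log loss curve comes from the starting point they induce. Averaging the curves over several seeds and datasets would give a fairer comparison than a single run.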
