Question: 9. Mixture model

Instead of modeling documents with multinomial distributions, we may also model documents with multiple multivariate Bernoulli distributions, where we represent each document D as a bit vector indicating whether each word occurs or does not occur in the document. Specifically, suppose our vocabulary set is V = {w1, ..., wN} with N words. A document D is represented as D = (d1, d2, ..., dN), where di ∈ {0, 1} indicates whether word wi is observed in D (di = 1) or not (di = 0). Suppose we have a collection of documents C = {D1, ..., DM}, and we would like to model all the documents with a mixture model of two multivariate Bernoulli distributions θ1 and θ2. Each of them has N parameters corresponding to the probability that each word shows up in a document. For example, p(wi = 1 | θ1) is the probability that word wi shows up when θ1 is used to generate a document. Similarly, p(wi = 0 | θ1) is the probability that word wi does NOT show up when θ1 is used to generate a document. Thus, p(wi = 0 | θ1) + p(wi = 1 | θ1) = 1. Suppose we choose θ1 with probability λ1 and θ2 with probability λ2 (thus λ1 + λ2 = 1).

(a) Write down the log-likelihood function for p(D) given such a two-component mixture model.

(b) Suppose λ1 and λ2 are fixed constants. Write down the E-step and M-step formulas for estimating θ1 and θ2 using the Maximum Likelihood estimator.
Step by Step Solution
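As a sketch of the quantities the two parts ask for, write d_ji for the i-th bit of document Dj and assume documents are generated independently; the symbols λk, θk, and d_ji follow the question's definitions:

```latex
% (a) Log-likelihood of the collection C = {D_1, ..., D_M}
\log p(C) = \sum_{j=1}^{M} \log \sum_{k=1}^{2} \lambda_k
    \prod_{i=1}^{N} p(w_i = 1 \mid \theta_k)^{d_{ji}}\,
                    p(w_i = 0 \mid \theta_k)^{1 - d_{ji}}

% (b) E-step: posterior probability that component k generated D_j
z_{jk} = \frac{\lambda_k \prod_{i=1}^{N}
               p(w_i = 1 \mid \theta_k)^{d_{ji}}\,
               p(w_i = 0 \mid \theta_k)^{1 - d_{ji}}}
              {\sum_{k'=1}^{2} \lambda_{k'} \prod_{i=1}^{N}
               p(w_i = 1 \mid \theta_{k'})^{d_{ji}}\,
               p(w_i = 0 \mid \theta_{k'})^{1 - d_{ji}}}

% (b) M-step: re-estimate each word probability from the responsibilities
p(w_i = 1 \mid \theta_k) = \frac{\sum_{j=1}^{M} z_{jk}\, d_{ji}}
                                 {\sum_{j=1}^{M} z_{jk}}
```

The E-step holds λ1 and λ2 fixed as the question specifies; only the θ parameters are re-estimated in the M-step.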
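The E-step/M-step updates above can be sketched in NumPy. This is a minimal illustration, not a definitive implementation: the toy 6-document, 4-word corpus, the random initialization, and the iteration count are all invented for demonstration, and the mixing weights λ are held fixed as the question requires.

```python
import numpy as np

# Hypothetical toy corpus: M=6 documents over N=4 vocabulary words,
# each row is the bit vector (d_1, ..., d_N) for one document.
D = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
], dtype=float)

lam = np.array([0.5, 0.5])  # fixed mixing weights lambda_1, lambda_2
rng = np.random.default_rng(0)
# theta[k, i] = p(w_i = 1 | theta_k), initialized randomly away from 0/1
theta = rng.uniform(0.25, 0.75, size=(2, D.shape[1]))

for _ in range(50):
    # E-step: responsibilities z[j, k] = p(theta_k | D_j), computed in
    # log space since p(D_j | theta_k) = prod_i theta[k,i]^d_ji * (1-theta[k,i])^(1-d_ji)
    log_lik = D @ np.log(theta.T) + (1 - D) @ np.log(1 - theta.T)  # (M, 2)
    log_post = np.log(lam) + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    z = np.exp(log_post)
    z /= z.sum(axis=1, keepdims=True)

    # M-step: theta[k, i] = sum_j z[j,k] * d_ji / sum_j z[j,k]
    counts = z.sum(axis=0)                  # effective docs per component
    theta = (z.T @ D) / counts[:, None]
    theta = np.clip(theta, 1e-6, 1 - 1e-6)  # keep log() well-defined

print(np.round(theta, 3))
```

With this data, EM tends to separate the documents dominated by the first two words from those dominated by the last two, so the two rows of `theta` drift toward opposite halves of the vocabulary.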
