Question:

Exercise 1 [5 points]. This problem reviews basic concepts from probability.

a) [1 point]. A biased die has the following probabilities of landing on each face:

face     1    2    3    4    5    6
P(face)  .1   .1   .2   .2   .4   0

I win if the die shows even. What is the probability that I win? Is this better or worse than a fair die (i.e., a die with equal probabilities for each face)?

b) [1 point]. Recall that the expected value E[X] for a random variable X is

E[X] = \sum_{x \in \mathcal{X}} P(X = x)\, x,

where \mathcal{X} is the set of values X may take on. Similarly, the expected value of any function f of random variable X is

E[f(X)] = \sum_{x \in \mathcal{X}} P(X = x)\, f(x).

Now consider the function below, which we call the "indicator function":

I[X = a] = \begin{cases} 1 & \text{if } X = a \\ 0 & \text{if } X \neq a \end{cases}

Let X be a random variable which takes on the values 3, 8, or 9 with probabilities p_3, p_8, and p_9 respectively. Calculate E[I[X = 8]].

c) [2 points]. Recall the following definitions:

- Entropy: H(X) = -\sum_{x \in \mathcal{X}} p(X = x) \log_2 p(X = x) = -E[\log_2 p(X)]
- Joint entropy: H(X, Y) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(X = x, Y = y) \log_2 p(X = x, Y = y) = -E[\log_2 p(X, Y)]
- Conditional entropy: H(Y | X) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(X = x, Y = y) \log_2 p(Y = y | X = x) = -E[\log_2 p(Y | X)]
- Mutual information: I(X; Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(X = x, Y = y) \log_2 \frac{p(X = x, Y = y)}{p(X = x)\, p(Y = y)}

Using the definitions of the entropy, joint entropy, and conditional entropy, prove the following chain rule for the entropy: H(X, Y) = H(Y) + H(X | Y).

d) [1 point]. Recall that two random variables X and Y are independent if for all x \in \mathcal{X} and all y \in \mathcal{Y}, p(X = x, Y = y) = p(X = x)\, p(Y = y). If variables X and Y are independent, is I(X; Y) = 0? If yes, prove it. If no, give a counterexample.

Exercise 2 [5 points]. Given a training set D = \{(x^{(i)}, y^{(i)}),\ i = 1, \ldots, M\}, where x^{(i)} \in \mathbb{R}^N and y^{(i)} \in \{1, 2, \ldots, C\}, derive the maximum likelihood estimates of the naive Bayes for real-valued x_j^{(i)} modeled with a Laplacian distribution, i.e.,

p(x_j \mid y = c) = \frac{1}{2\sigma_{jc}} \exp\!\left(-\frac{|x_j - \mu_{jc}|}{\sigma_{jc}}\right).
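For Exercise 2 the answer is an analytic derivation, but the resulting estimators can be sanity-checked numerically. Under the Laplacian model above, maximizing the likelihood gives the per-class, per-feature sample median for \mu_{jc}, the mean absolute deviation from that median for \sigma_{jc}, and the class frequencies for the priors p(y = c). The sketch below assumes that standard closed form and uses invented function and variable names; it is a check on the derivation under those assumptions, not the derivation itself.

```python
import numpy as np

def laplace_naive_bayes_mle(X, y, num_classes):
    """MLE for naive Bayes with Laplacian class-conditionals (assumed closed form).

    X: (M, N) array of real-valued features; y: (M,) array of labels in {0, ..., C-1}.
    Returns class priors p(y=c), locations mu[c, j], and scales sigma[c, j].
    """
    M, N = X.shape
    priors = np.zeros(num_classes)
    mu = np.zeros((num_classes, N))
    sigma = np.zeros((num_classes, N))
    for c in range(num_classes):
        Xc = X[y == c]                                   # examples with label c
        priors[c] = len(Xc) / M                          # MLE of p(y = c): class frequency
        mu[c] = np.median(Xc, axis=0)                    # MLE of mu_jc: sample median
        sigma[c] = np.mean(np.abs(Xc - mu[c]), axis=0)   # MLE of sigma_jc: mean |x_j - median|
    return priors, mu, sigma

# Quick check on synthetic data drawn from the assumed model (2 classes, 3 features)
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=5000)
loc = np.where(y[:, None] == 0, -1.0, 2.0)               # class 0 centered at -1, class 1 at 2
X = rng.laplace(loc=loc, scale=0.5, size=(5000, 3))
priors, mu, sigma = laplace_naive_bayes_mle(X, y, num_classes=2)
print(priors.round(2), mu.round(2), sigma.round(2))      # expect ~[0.5 0.5], mu near -1 / 2, sigma near 0.5
```

The recovered estimates converging to the parameters used to generate the data is consistent with the derivation being correct; it is not itself a proof.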

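The arithmetic in Exercise 1 a) and b) and the identities in c) and d) can likewise be verified on small examples. In the sketch below, the die probabilities are those given in part a); the values chosen for p_3, p_8, p_9 and for the joint distribution in the entropy check are arbitrary assumptions of mine, since the exercise leaves them symbolic.

```python
import math

# Exercise 1 a): biased die with P(face) = .1, .1, .2, .2, .4, 0 for faces 1..6
p_face = {1: 0.1, 2: 0.1, 3: 0.2, 4: 0.2, 5: 0.4, 6: 0.0}
p_win = sum(p for face, p in p_face.items() if face % 2 == 0)  # P(die shows even)
print("P(win) =", p_win, "| fair die:", 0.5)

# Exercise 1 b): E[I[X = 8]] with arbitrary (assumed) p_3, p_8, p_9 summing to 1
p3, p8, p9 = 0.2, 0.5, 0.3
e_indicator = sum(p * (1 if x == 8 else 0) for x, p in zip([3, 8, 9], [p3, p8, p9]))
assert abs(e_indicator - p8) < 1e-12  # E[I[X = 8]] equals P(X = 8)

# Exercise 1 c), d): check H(X, Y) = H(Y) + H(X|Y) and I(X; Y) = 0 when
# p(x, y) = p(x) p(y), on an arbitrary 2x2 independent joint distribution
px, py = [0.3, 0.7], [0.6, 0.4]
pxy = [[px[i] * py[j] for j in range(2)] for i in range(2)]

H_XY = -sum(pxy[i][j] * math.log2(pxy[i][j]) for i in range(2) for j in range(2))
H_Y = -sum(p * math.log2(p) for p in py)
H_X_given_Y = -sum(pxy[i][j] * math.log2(pxy[i][j] / py[j])
                   for i in range(2) for j in range(2))
I_XY = sum(pxy[i][j] * math.log2(pxy[i][j] / (px[i] * py[j]))
           for i in range(2) for j in range(2))

assert abs(H_XY - (H_Y + H_X_given_Y)) < 1e-12  # chain rule for entropy
assert abs(I_XY) < 1e-12                        # independence implies zero mutual information
```

These numerical checks confirm the identities on one example distribution; they do not replace the proofs asked for in parts c) and d).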