Consider the two-class classification problem where the class label y ∈ {0, 1} and each training...

Question:

Consider the two-class classification problem where the class label y ∈ {0, 1} and each training example X has 2 binary attributes X₁, X₂ ∈ {0, 1}. Let the class prior be P(Y = 1) = 0.5, and let

P(X₁ = 1 | Y = 1) = 0.8, P(X₁ = 0 | Y = 0) = 0.7,
P(X₂ = 1 | Y = 1) = 0.5, P(X₂ = 0 | Y = 0) = 0.9.

So attribute X₁ provides slightly stronger evidence about the class label than X₂.

a. Assume X₁ and X₂ are truly independent given Y. Given X₁ = x₁ and X₂ = x₂, write down the Naive Bayes decision rule. Fill out the following table of predictions, f(X₁, X₂) ∈ {0, 1}, based on the Naive Bayes decision rule for each of the 4 settings of X₁, X₂. Please show your calculations.

X₁   X₂   f(X₁, X₂)
0    0
0    1
1    0
1    1

b. For the Naive Bayes decision function f(X₁, X₂), the error rate is

Σ_{X₁, X₂, Y} I(Y ≠ f(X₁, X₂)) P(X₁, X₂, Y),

where I(Y ≠ f(X₁, X₂)) = 1 if Y ≠ f(X₁, X₂) and 0 otherwise. For this question, we will assume that the true data distribution is the same as the Naive Bayes distribution, so P(X₁, X₂, Y) can be written as P(Y) P(X₁ | Y) P(X₂ | Y). Show that the error rate of the Naive Bayes classifier is 0.235.

c. Now suppose that we create a new attribute X₃, which is an exact copy of X₂. So, for every training example, attributes X₂ and X₃ have the same value, X₂ = X₃.

(a) What is the error rate of Naive Bayes now, using X₁, X₂, and X₃? The predicted Y should be computed using the assumption of conditional independence, and the error rate should be computed using the true probabilities.

(b) Why does Naive Bayes perform worse with the addition of X₃? Hint: Does the key assumption of Naive Bayes still hold?

Now consider a logistic regression model M with weight vector w = [w₁, w₂] that is used to predict Y using X₁ and X₂, and another model M' with weight vector w' = [w₁', w₂', w₃'] that is used to predict Y using X₁, X₂, and X₃. What is the relation between w and w' after training both models? Will the trained logistic regression model M' suffer from the same problem as the Naive Bayes model in part (a)? Explain why or why not.
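
The arithmetic behind the error-rate parts can be checked numerically. The following is a minimal Python sketch, not an official solution. It assumes the conditional probabilities are P(X₁ = 1 | Y = 1) = 0.8, P(X₁ = 0 | Y = 0) = 0.7, P(X₂ = 1 | Y = 1) = 0.5 and P(X₂ = 0 | Y = 0) = 0.9, which is the reading of the question that reproduces the stated 0.235. The names `bernoulli`, `true_joint`, `nb_predict`, `error_rate` and the `copies_of_x2` parameter are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Assumed values, read off the question text (see the note above).
PRIOR = {1: 0.5, 0: 0.5}        # P(Y = y)
P_X1_IS_1 = {1: 0.8, 0: 0.3}    # P(X1 = 1 | Y = y); 0.3 = 1 - 0.7
P_X2_IS_1 = {1: 0.5, 0: 0.1}    # P(X2 = 1 | Y = y); 0.1 = 1 - 0.9


def bernoulli(p_one, x):
    """Probability that a binary attribute equals x, given P(attribute = 1)."""
    return p_one if x == 1 else 1.0 - p_one


def true_joint(x1, x2, y):
    """True P(X1 = x1, X2 = x2, Y = y); a copied X3 adds no new outcomes."""
    return PRIOR[y] * bernoulli(P_X1_IS_1[y], x1) * bernoulli(P_X2_IS_1[y], x2)


def nb_predict(x1, x2, copies_of_x2=1):
    """Naive Bayes prediction when X2 is counted copies_of_x2 times.

    copies_of_x2=2 models the duplicated attribute X3 = X2: the classifier
    multiplies in P(X2 | Y) twice, as if the copies were independent.
    """
    scores = {
        y: PRIOR[y]
        * bernoulli(P_X1_IS_1[y], x1)
        * bernoulli(P_X2_IS_1[y], x2) ** copies_of_x2
        for y in (0, 1)
    }
    # No ties occur for these numbers, so the tie-breaking rule is irrelevant.
    return max(scores, key=scores.get)


def error_rate(copies_of_x2=1):
    """Sum the true joint probability over every (x1, x2, y) that is misclassified."""
    return sum(
        true_joint(x1, x2, y)
        for x1, x2, y in product((0, 1), repeat=3)
        if nb_predict(x1, x2, copies_of_x2) != y
    )


if __name__ == "__main__":
    for x1, x2 in product((0, 1), repeat=2):
        print(x1, x2, nb_predict(x1, x2), nb_predict(x1, x2, copies_of_x2=2))
    print(round(error_rate(1), 3))  # 0.235
    print(round(error_rate(2), 3))  # 0.3
```

Under these assumptions the only prediction that changes when X₂ is double-counted is the one for X₁ = 1, X₂ = 0, which is where the extra error comes from. The predictions use the (incorrect) independence assumption, while `error_rate` always sums the true joint probabilities, as the question requires.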

Posted Date: Feb 22, 2022 02:08 AM