Question: please answer it in python ERM classification attempt using incorrect knowledge of data distribution (Naive Bayesian Classifier, which assumes features are independent given each class

please answer it in python

please answer it in python ERM classification attempt using incorrect knowledge of

ERM classification attempt using incorrect knowledge of data distribution (Naive Bayesian Classifier, which assumes features are independent given each class label)... For this part, assume that you know the true class prior probabilities, but for some reason you think that the class conditional pdfs are both Gaussian with the true means, but (incorrectly) with covariance matrices that are diagonal (with diagonal entries equal to true variances, off-diagonal entries equal to zeros). Analyze the impact of this model mismatch by implementing the ERM classifier using this data distribution model and repeating the same steps in Part A on the same 10K sample data set you generated earlier. Report the same results, answer the same questions. Did this model mismatch negatively impact your ROC curve and minimum achievable probability of error?

The probability density function (pdf) for a 4-dimensional real-valued random vector X is as follows: p(x) = p(xL = 0)P(L = 0)+p(xL = 1)P(L = 1). Here L is the true class label that indicates which class-label-conditioned pdf generates the data. The class priors are P(L = 0) = 0.7 and P(L = 1) = 0.3. The class-conditional pdfs are p(x|L=0) = g(xmo, Co) and p(xL= 1) = g(xm ,Ci), where g(x|m,C) is a multivariate Gaus- sian probability density function with mean vector m and covariance matrix C. The parameters of the class-conditional Gaussian pdfs are: 2 -0.5 0.3 0] 1 0.3 -0.2 0] -0.5 1 -0.5 0 0.3 2 0.3 0 mo Co 0.3 C= mi -0.5 1 0 -0.2 0.3 1 0 0 0 0 2 0 0 0 3 For numerical results requested below, generate 10000 samples according to this data distribu- tion, keep track of the true class labels for each sample. Save the data and use the same data set in all cases. The probability density function (pdf) for a 4-dimensional real-valued random vector X is as follows: p(x) = p(xL = 0)P(L = 0)+p(xL = 1)P(L = 1). Here L is the true class label that indicates which class-label-conditioned pdf generates the data. The class priors are P(L = 0) = 0.7 and P(L = 1) = 0.3. The class-conditional pdfs are p(x|L=0) = g(xmo, Co) and p(xL= 1) = g(xm ,Ci), where g(x|m,C) is a multivariate Gaus- sian probability density function with mean vector m and covariance matrix C. The parameters of the class-conditional Gaussian pdfs are: 2 -0.5 0.3 0] 1 0.3 -0.2 0] -0.5 1 -0.5 0 0.3 2 0.3 0 mo Co 0.3 C= mi -0.5 1 0 -0.2 0.3 1 0 0 0 0 2 0 0 0 3 For numerical results requested below, generate 10000 samples according to this data distribu- tion, keep track of the true class labels for each sample. Save the data and use the same data set in all cases

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!