Question: Consider a two-class classification problem where the data in each class is modeled by a 2D Gaussian density, N(μ1, Σ1) and N(μ2, Σ2).

Data Generation (set A): using the parameters shown below, generate 60,000 random samples from N(μ1, Σ1) and 140,000 samples from N(μ2, Σ2) (i.e., 200,000 samples total). We will be referring to this data set as "data set A."

μ1, Σ1, μ2, Σ2: [parameter values given in the original handout; not legible in this copy]

Notation: μ = [μx, μy]^T, Σ = [σx^2 0; 0 σy^2].

Note: you will be using the Box-Muller transformation to generate the samples from each distribution; please review "Generating Gaussian Random Numbers" for more information (posted on the course's webpage). A link to the C code is provided on the webpage. Since that code generates samples from a 1D Gaussian distribution, you will need to call the Box-Muller function twice to generate a 2D sample (x, y): use (μx, σx) to generate the x sample and (μy, σy) to generate the y sample. (A sketch of this sampling step follows the problem statement.)

Note: ranf() is not defined in the standard library; you could use this simple implementation:

    /* ranf - return a random double in the [0, m] range. */
    double ranf(double m)
    {
        return (m * rand() / (double)RAND_MAX);
    }

1. This experiment involves the samples from set A.
   a. Design a Bayes classifier for minimum error to classify the samples from set A. Which discriminant (i.e., case I, II, or III) would you use in this experiment and why? How would you set the prior probabilities P(ω1) and P(ω2)? (A sketch of the decision rule follows the problem statement.)
   b. Plot both the Bayes decision boundary and the generated samples on the same plot to better visualize how the Bayes rule would classify the data.
   c. Classify all 200,000 samples and report (i) the misclassification rate for each class separately (i.e., the percentage of misclassified samples for each class) and (ii) the total misclassification rate (i.e., the percentage of misclassified samples overall).
   d. Calculate the theoretical probability of error (e.g., the Bhattacharyya bound) and compare it with the misclassification rate from part (c). (A sketch of the bound computation follows the problem statement.)

Data Generation (set B): using the parameters shown below, generate 40,000 random samples from N(μ1, Σ1) and 160,000 samples from N(μ2, Σ2) (i.e., 200,000 samples total). We will be referring to this data set as "data set B."

μ1, Σ1, μ2, Σ2: [parameter values given in the original handout; only partially legible here, with the legible entries consistent with Σ1 = [1 0; 0 1] and Σ2 = [4 0; 0 8]]

2. Repeat the part (1) experiments using the samples from set B. How do your results from this part compare with your results from part (1), and why?
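For reference, here is a minimal sketch of the sampling step described in the Box-Muller note above. It assumes the basic trigonometric form of the Box-Muller transform and independent x and y coordinates (i.e., diagonal covariances); the function names box_muller and generate_point are illustrative and are not taken from the course's C code, which may differ in detail.

    #include <math.h>
    #include <stdlib.h>

    #define PI 3.14159265358979323846

    /* ranf - return a random double in the [0, m] range. */
    double ranf(double m)
    {
        return (m * rand() / (double)RAND_MAX);
    }

    /* box_muller - draw one sample from N(mu, sigma) using the basic
       Box-Muller transform (sketch only; the course code may differ). */
    double box_muller(double mu, double sigma)
    {
        double u1, u2, z;
        do {
            u1 = ranf(1.0);
        } while (u1 <= 0.0);              /* avoid log(0) */
        u2 = ranf(1.0);
        z = sqrt(-2.0 * log(u1)) * cos(2.0 * PI * u2);
        return mu + sigma * z;
    }

    /* generate_point - one 2D sample: call Box-Muller once per coordinate,
       using (mu_x, sigma_x) for x and (mu_y, sigma_y) for y. */
    void generate_point(double mu_x, double sigma_x,
                        double mu_y, double sigma_y,
                        double *x, double *y)
    {
        *x = box_muller(mu_x, sigma_x);
        *y = box_muller(mu_y, sigma_y);
    }

Calling generate_point 60,000 times with the class 1 parameters and 140,000 times with the class 2 parameters produces data set A; the same loop with the set B counts and parameters produces data set B.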

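For parts 1(a) and 1(c), here is a minimal sketch of the minimum-error decision rule, under two assumptions that the garbled parameter block does not confirm: the covariance matrices are diagonal (consistent with the notation above), and the priors are set from the sample proportions, i.e., P(ω1) = 60,000/200,000 = 0.3 and P(ω2) = 0.7 for set A (0.2 and 0.8 for set B). The general quadratic (case III) discriminant is used because it reduces to the linear case I/II form when the covariances are equal; class_params, discriminant, and classify are illustrative names, not handout code.

    #include <math.h>

    /* Per-class parameters: mean, diagonal covariance entries, and prior. */
    typedef struct {
        double mu_x, mu_y;      /* mean vector */
        double var_x, var_y;    /* sigma_x^2 and sigma_y^2 */
        double prior;           /* P(w_i), e.g. 0.3 and 0.7 for set A */
    } class_params;

    /* Quadratic (case III) discriminant for a diagonal covariance matrix:
       g_i(x) = -1/2 (x - mu_i)' Sigma_i^-1 (x - mu_i)
                - 1/2 ln|Sigma_i| + ln P(w_i).                        */
    double discriminant(double x, double y, const class_params *c)
    {
        double dx = x - c->mu_x, dy = y - c->mu_y;
        return -0.5 * (dx * dx / c->var_x + dy * dy / c->var_y)
               - 0.5 * log(c->var_x * c->var_y)
               + log(c->prior);
    }

    /* Decide class 1 if g1(x) >= g2(x), otherwise class 2. */
    int classify(double x, double y,
                 const class_params *c1, const class_params *c2)
    {
        return (discriminant(x, y, c1) >= discriminant(x, y, c2)) ? 1 : 2;
    }

For part 1(c), run classify over every generated sample, count the class 1 samples labeled 2 and the class 2 samples labeled 1, then divide the two counts by 60,000 and 140,000 for the per-class rates and their sum by 200,000 for the total misclassification rate.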
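For part 1(d), here is a sketch of the Bhattacharyya bound under the same diagonal-covariance assumption, reusing the class_params struct from the sketch above. The bound is P(error) <= sqrt(P(ω1) P(ω2)) * exp(-k(1/2)), where k(1/2) = (1/8) (μ2 - μ1)' [(Σ1 + Σ2)/2]^-1 (μ2 - μ1) + (1/2) ln( |(Σ1 + Σ2)/2| / sqrt(|Σ1| |Σ2|) ).

    /* Bhattacharyya bound on P(error) for two Gaussians with diagonal
       covariances; uses the class_params struct defined in the sketch above. */
    double bhattacharyya_bound(const class_params *c1, const class_params *c2)
    {
        /* Diagonal entries of the averaged covariance (Sigma1 + Sigma2) / 2. */
        double avg_x = 0.5 * (c1->var_x + c2->var_x);
        double avg_y = 0.5 * (c1->var_y + c2->var_y);
        double dx = c2->mu_x - c1->mu_x, dy = c2->mu_y - c1->mu_y;

        /* k(1/2): squared-distance term plus the log-determinant term. */
        double k = 0.125 * (dx * dx / avg_x + dy * dy / avg_y)
                 + 0.5 * log((avg_x * avg_y)
                             / sqrt(c1->var_x * c1->var_y
                                    * c2->var_x * c2->var_y));

        return sqrt(c1->prior * c2->prior) * exp(-k);
    }

Since this is an upper bound on the Bayes error, the empirical total misclassification rate from part 1(c) would normally be expected to fall at or below it; that comparison is what part 1(d) asks for.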