Consider a binary classification problem where there is a single feature X ER and the depen-...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
Consider a binary classification problem where there is a single feature X ER and the depen- dent variable Y = {0, 1}. Let Px,y denote the joint distribution over pairs (X,Y), and let h: R {0, 1} denote a generic classifier. We define the error rate of h as R(h) := Pr(Y #h(X)), where the probability is computed from the joint distribution Px,y. Suppose that we collect data (x1, y₁),..., (xn, Yn), which are assumed to be independent and identically distributed from the distribution Px,y, and that we train a classifier ĥ using this data. Below we consider two precise specifications of the joint distribution Px,y. In both cases, derive the largest numerical value * that you can for which it holds that R(h) ≥ e*. Carefully explain how you arrived at your specific value of ɛ* in both cases. a) (10 points) X is restricted to the set {0,1} (i.e., is categorical) and Px,y is given by the distribution in Table 1. b) (10 points) The marginal distribution of Y is given by Pr(Y = 0) = 0.3 and Pr(Y = 1) = 0.7. Given that Y = 0, the distribution of X is normal with mean 5 and variance 2. Given that Y = 1, the distribution of X is normal with mean −3 and variance 2. (It is OK for the final answer to be written in terms of the CDF of a normal distribution and/or to numerically approximate this number up to 5 digits) Table 1 Outcome (X=0, Y=0) (X=0, Y = 1) (X= 1, Y = 0) (X = 1, Y = 1) Probability 0.1 0.2 0.4 0.3 Consider a binary classification problem where there is a single feature X ER and the depen- dent variable Y = {0, 1}. Let Px,y denote the joint distribution over pairs (X,Y), and let h: R {0, 1} denote a generic classifier. We define the error rate of h as R(h) := Pr(Y #h(X)), where the probability is computed from the joint distribution Px,y. Suppose that we collect data (x1, y₁),..., (xn, Yn), which are assumed to be independent and identically distributed from the distribution Px,y, and that we train a classifier ĥ using this data. Below we consider two precise specifications of the joint distribution Px,y. In both cases, derive the largest numerical value * that you can for which it holds that R(h) ≥ e*. Carefully explain how you arrived at your specific value of ɛ* in both cases. a) (10 points) X is restricted to the set {0,1} (i.e., is categorical) and Px,y is given by the distribution in Table 1. b) (10 points) The marginal distribution of Y is given by Pr(Y = 0) = 0.3 and Pr(Y = 1) = 0.7. Given that Y = 0, the distribution of X is normal with mean 5 and variance 2. Given that Y = 1, the distribution of X is normal with mean −3 and variance 2. (It is OK for the final answer to be written in terms of the CDF of a normal distribution and/or to numerically approximate this number up to 5 digits) Table 1 Outcome (X=0, Y=0) (X=0, Y = 1) (X= 1, Y = 0) (X = 1, Y = 1) Probability 0.1 0.2 0.4 0.3 Consider a binary classification problem where there is a single feature X ER and the depen- dent variable Y = {0, 1}. Let Px,y denote the joint distribution over pairs (X,Y), and let h: R {0, 1} denote a generic classifier. We define the error rate of h as R(h) := Pr(Y #h(X)), where the probability is computed from the joint distribution Px,y. Suppose that we collect data (x1, y₁),..., (xn, Yn), which are assumed to be independent and identically distributed from the distribution Px,y, and that we train a classifier ĥ using this data. Below we consider two precise specifications of the joint distribution Px,y. In both cases, derive the largest numerical value * that you can for which it holds that R(h) ≥ e*. Carefully explain how you arrived at your specific value of ɛ* in both cases. a) (10 points) X is restricted to the set {0,1} (i.e., is categorical) and Px,y is given by the distribution in Table 1. b) (10 points) The marginal distribution of Y is given by Pr(Y = 0) = 0.3 and Pr(Y = 1) = 0.7. Given that Y = 0, the distribution of X is normal with mean 5 and variance 2. Given that Y = 1, the distribution of X is normal with mean −3 and variance 2. (It is OK for the final answer to be written in terms of the CDF of a normal distribution and/or to numerically approximate this number up to 5 digits) Table 1 Outcome (X=0, Y=0) (X=0, Y = 1) (X= 1, Y = 0) (X = 1, Y = 1) Probability 0.1 0.2 0.4 0.3 Consider a binary classification problem where there is a single feature X ER and the depen- dent variable Y = {0, 1}. Let Px,y denote the joint distribution over pairs (X,Y), and let h: R {0, 1} denote a generic classifier. We define the error rate of h as R(h) := Pr(Y #h(X)), where the probability is computed from the joint distribution Px,y. Suppose that we collect data (x1, y₁),..., (xn, Yn), which are assumed to be independent and identically distributed from the distribution Px,y, and that we train a classifier ĥ using this data. Below we consider two precise specifications of the joint distribution Px,y. In both cases, derive the largest numerical value * that you can for which it holds that R(h) ≥ e*. Carefully explain how you arrived at your specific value of ɛ* in both cases. a) (10 points) X is restricted to the set {0,1} (i.e., is categorical) and Px,y is given by the distribution in Table 1. b) (10 points) The marginal distribution of Y is given by Pr(Y = 0) = 0.3 and Pr(Y = 1) = 0.7. Given that Y = 0, the distribution of X is normal with mean 5 and variance 2. Given that Y = 1, the distribution of X is normal with mean −3 and variance 2. (It is OK for the final answer to be written in terms of the CDF of a normal distribution and/or to numerically approximate this number up to 5 digits) Table 1 Outcome (X=0, Y=0) (X=0, Y = 1) (X= 1, Y = 0) (X = 1, Y = 1) Probability 0.1 0.2 0.4 0.3 Consider a binary classification problem where there is a single feature X ER and the depen- dent variable Y = {0, 1}. Let Px,y denote the joint distribution over pairs (X,Y), and let h: R {0, 1} denote a generic classifier. We define the error rate of h as R(h) := Pr(Y #h(X)), where the probability is computed from the joint distribution Px,y. Suppose that we collect data (x1, y₁),..., (xn, Yn), which are assumed to be independent and identically distributed from the distribution Px,y, and that we train a classifier ĥ using this data. Below we consider two precise specifications of the joint distribution Px,y. In both cases, derive the largest numerical value * that you can for which it holds that R(h) ≥ e*. Carefully explain how you arrived at your specific value of ɛ* in both cases. a) (10 points) X is restricted to the set {0,1} (i.e., is categorical) and Px,y is given by the distribution in Table 1. b) (10 points) The marginal distribution of Y is given by Pr(Y = 0) = 0.3 and Pr(Y = 1) = 0.7. Given that Y = 0, the distribution of X is normal with mean 5 and variance 2. Given that Y = 1, the distribution of X is normal with mean −3 and variance 2. (It is OK for the final answer to be written in terms of the CDF of a normal distribution and/or to numerically approximate this number up to 5 digits) Table 1 Outcome (X=0, Y=0) (X=0, Y = 1) (X= 1, Y = 0) (X = 1, Y = 1) Probability 0.1 0.2 0.4 0.3 Consider a binary classification problem where there is a single feature X ER and the depen- dent variable Y = {0, 1}. Let Px,y denote the joint distribution over pairs (X,Y), and let h: R {0, 1} denote a generic classifier. We define the error rate of h as R(h) := Pr(Y #h(X)), where the probability is computed from the joint distribution Px,y. Suppose that we collect data (x1, y₁),..., (xn, Yn), which are assumed to be independent and identically distributed from the distribution Px,y, and that we train a classifier ĥ using this data. Below we consider two precise specifications of the joint distribution Px,y. In both cases, derive the largest numerical value * that you can for which it holds that R(h) ≥ e*. Carefully explain how you arrived at your specific value of ɛ* in both cases. a) (10 points) X is restricted to the set {0,1} (i.e., is categorical) and Px,y is given by the distribution in Table 1. b) (10 points) The marginal distribution of Y is given by Pr(Y = 0) = 0.3 and Pr(Y = 1) = 0.7. Given that Y = 0, the distribution of X is normal with mean 5 and variance 2. Given that Y = 1, the distribution of X is normal with mean −3 and variance 2. (It is OK for the final answer to be written in terms of the CDF of a normal distribution and/or to numerically approximate this number up to 5 digits) Table 1 Outcome (X=0, Y=0) (X=0, Y = 1) (X= 1, Y = 0) (X = 1, Y = 1) Probability 0.1 0.2 0.4 0.3
Expert Answer:
Answer rating: 100% (QA)
To find the largest numerical value for which the error rate of the classifier is greater than we need to calculate the error rate under two different ... View the full answer
Related Book For
Understandable Statistics Concepts And Methods
ISBN: 9781337119917
12th Edition
Authors: Charles Henry Brase, Corrinne Pellillo Brase
Posted Date:
Students also viewed these programming questions
-
A random sample of nine pairs of measurements is shown in the following table (saved in the LM14_40 file). a. Use the Wilcoxon signed rank test to determine wheth er the data provide sufficient...
-
Human Development/Life Span Group Presentation Each Group will choose a segment of the human life span that is particularly interesting to them: Adolescence (13 years through about 17 years). Based...
-
QUESTION 21 Which of the following is not a wrapper class? A. String B. Integer C. Character D. Double QUESTION 22 The conversion of an object of a wrapper class to a value of its associated...
-
You deposit $12,000 annually into a life insurance fund for the next 10 years, at which time you plan to retire. Instead of a lump sum, you wish to receive annuities for the next 20 years. What is...
-
(a) How is the graph of y = 2 sin x related to the graph of y = sin x ? Use your answer and Figure 6 to sketch the graph of y = 2 sin x. (b) How is the graph of y = 1 + x related to the graph of? Use...
-
eBay stock is currently at $54. It is expected to rise 9% or fall 7% in each of two 3-month periods. Suppose it rises 9% during both periods. The risk-free interest rate is 0.5% per 3 month period....
-
The case of flow past a cylinder of infinite length normal to the axis was also studied by Stokes. In view of the 2D nature of the problem, it is more convenient to work in \(r, \theta\) coordinates....
-
Ekman Company issued $1,000,000, 10-year bonds and agreed to make annual sinking fund deposits of $78,000. The deposits are made at the end of each year into an account paying 5% annual interest....
-
Consider a reservoir filled with water of uniform density po and subject to the gravi- tational force. One side of the reservoir is confined by a dam wall of height h and width W, as shown in the...
-
Q1. Study the Statement of Stockholders Equity for the five years presented. a. During fiscal year ended (FYE) 6/30/2003 stockholders equity increased primarily as a result of (_______________ /...
-
2 The partial graph of y f(z) is shown below, where (y Isy2-7, yER) The range of y f(2)-5 is dol22 yER) O(yly S2, yER) O(yly2-12 yR} OtylyS-12 yCR)
-
Roland worked for Sorbonne Company for the first four months of 2017 and earned $40,000 from which his employer withheld $3,060 for payroll taxes. In May, Roland accepted a job with Lyon Company....
-
Janes sister-in-law, a stockbroker at Invest, Inc., is trying to get Jane to buy the stock of HealthWest, a regional HMO. The stock has a current market price of $25, its last dividend (D 0 ) was...
-
Assume the risk-free rate is 6 percent and the market risk premium is 6 percent. The stock of Clinicians Care Alliance (PCA) has a beta of 1.5. The last dividend paid by PCA (D 0 ) was $2 per share....
-
Lauren is single, age 60, and has an annual salary of $68,000. She paid off her mortgage in December 2016 but expects that her annual real estate taxes will continue to be approximately $5,800....
-
An investor is considering buying the stock of two home health companies that are similar in all respects except the proportion of earnings paid out as dividends. Both companies are expected to earn...
-
Question 8: What types of control confirmations are used for tests of controls when auditing accounts receivable?
-
What impact has the Internet had on the globalization of small firms? How do you think small companies will use the Internet for business in the future?
-
Wild irises are beautiful flowers found throughout the United States, Canada, and northern Europe. This problem concerns the length of the sepal (leaf-like part covering the flower) of different...
-
The fan blades on commercial jet engines must be replaced when wear on these parts indicates too much variability to pass inspection. If a single fan blade broke during operation, it could severely...
-
Police are tested for their ability to correctly recognize and identify a suspect based on a witness or victims verbal description of the suspect. Scores on the identification test range from 0 to...
-
A food processor claims that at most \(10 \%\) of her jars of instant coffee contain less coffee than claimed on the label. To test this claim, 16 jars of her instant coffee are randomly selected and...
-
Refer to Exercise 4.2. (a) Determine the cumulative probability distribution \(F(x)\). (b) Graph the probability distribution of \(f(x)\) as a bar chart and below it graph \(F(x)\). Data From...
-
Four emergency radios are available for rescue workers but one does not work properly. Two randomly selected radios are taken on a rescue mission. Let \(X\) be the number that work properly between...
Study smarter with the SolutionInn App