Question: Fix some distribution D on X, and say the examples are labeled by an unknown c drawn from C. For a hypothesis (i.e., a function) h : X -> Y, let err(h) = Pr_{x ~ D}[h(x) ≠ c(x)].
(a) Fix a hypothesis h : X --> Y . If err(h) > epsilon, what is the probability that h gets k random examples all correct? How large does k need to be for this probability to be at most delta_prime?
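For part (a), the standard bound is that a hypothesis with err(h) > epsilon survives k independent examples with probability less than (1 - epsilon)^k <= exp(-epsilon * k), so k >= ln(1/delta_prime) / epsilon suffices. A minimal sketch of that calculation (the function name is illustrative, not from the problem):

```python
import math

def required_block_size(epsilon, delta_prime):
    """Smallest k with (1 - epsilon)**k <= delta_prime.

    If err(h) > epsilon, each i.i.d. example is answered correctly
    with probability < 1 - epsilon, so h gets all k right with
    probability < (1 - epsilon)**k <= exp(-epsilon * k).  Requiring
    exp(-epsilon * k) <= delta_prime gives
    k >= ln(1 / delta_prime) / epsilon.
    """
    return math.ceil(math.log(1.0 / delta_prime) / epsilon)

# e.g. epsilon = 0.1, delta_prime = 0.05 needs k = 30 examples
k = required_block_size(0.1, 0.05)
```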
(b) As we feed examples to A, how many examples do we need to see before we can be sure of getting a block of k examples all correct? (This doesn't mean the hypothesis needs to be perfect; it just needs to get a block of k all correct. Think about dividing the stream of examples into blocks of size k, and exploit the mistake bound. How many different hypotheses could A go through?)
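For part (b), the blocking argument can be made concrete with a wrapper that feeds the stream to a conservative online learner (one that only updates on mistakes) in blocks of size k and returns the hypothesis at the start of the first clean block: at most t mistakes means at most t dirty blocks, so t + 1 blocks suffice. The wrapper, the `HalvingLearner`, and the threshold demo below are illustrative assumptions, not part of the problem.

```python
def run_until_clean_block(learner, stream, k):
    """Process the stream in consecutive blocks of k examples.

    If the learner makes at most t mistakes in total, then at most t
    blocks contain a mistake, so some block among the first t + 1 is
    answered perfectly; return the hypothesis held at its start.
    """
    it = iter(stream)
    while True:
        h = learner.snapshot()     # hypothesis at the block start
        clean = True
        for _ in range(k):
            x, y = next(it)
            if learner.predict(x) != y:
                learner.update(x, y)   # mistake: hypothesis may change
                clean = False
        if clean:
            return h               # unchanged through the whole block

class HalvingLearner:
    """Halving algorithm over a finite class: predict by majority vote
    of the version space, and on a mistake keep only the consistent
    hypotheses.  Mistake bound: floor(log2 |C|)."""
    def __init__(self, hypotheses):
        self.vs = list(hypotheses)
    def _vote(self, vs, x):
        return int(2 * sum(h(x) for h in vs) > len(vs))
    def snapshot(self):
        vs = list(self.vs)                  # freeze the current vote
        return lambda x: self._vote(vs, x)
    def predict(self, x):
        return self._vote(self.vs, x)
    def update(self, x, y):
        self.vs = [h for h in self.vs if h(x) == y]

# Toy demo (assumed setup): thresholds on {0, ..., 15}, target x >= 7.
target = lambda x: int(x >= 7)
learner = HalvingLearner([lambda x, i=i: int(x >= i) for i in range(17)])
stream = [(x, target(x)) for x in range(16)] * 10   # 10 blocks of k = 16
h = run_until_clean_block(learner, stream, k=16)
```

Since each block here covers every point of the domain and the hypothesis cannot change during a clean block, the returned h agrees with the target everywhere.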
(c) Put everything together and fully describe (with proof) a PAC learner that is able, with probability of failure at most delta, to output a hypothesis with error at most epsilon. How many examples does the learner need to use (as a function of epsilon, delta, and t)?
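For part (c), one way the pieces combine (a sketch under the usual union-bound argument, with an illustrative function name): A uses at most t + 1 distinct hypotheses, so setting delta_prime = delta / (t + 1) in part (a) and running t + 1 blocks from part (b) gives a total sample size of (t + 1) * ceil(ln((t + 1) / delta) / epsilon) = O((t / epsilon) * log(t / delta)).

```python
import math

def pac_sample_size(epsilon, delta, t):
    """Examples needed to convert a mistake bound of t into an
    (epsilon, delta)-PAC guarantee via the blocking argument.

    A changes hypothesis at most t times, so at most t + 1 distinct
    hypotheses appear.  A union bound charges each hypothesis with
    err > epsilon a failure budget of delta / (t + 1), giving block
    size k = ceil(ln((t + 1) / delta) / epsilon); t + 1 blocks
    guarantee at least one clean block.
    """
    k = math.ceil(math.log((t + 1) / delta) / epsilon)
    return (t + 1) * k

# e.g. epsilon = 0.1, delta = 0.05, t = 9 gives 10 blocks of 53
m = pac_sample_size(0.1, 0.05, 9)
```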