Question:

A common problem in machine learning is to find hypotheses that explain the data as well as possible. Let's consider a simple but important instance: modelling coin flips. Suppose you flip a coin 20 times and you observe 14 heads and 6 tails. You assume that the coin flips are mutually independent, and that the chance of getting heads on any given toss is some probability p between 0 and 1 (inclusive). Which value of p best explains the data?

For a fixed p, the probability of seeing 14 heads and 6 tails is given by:


$p^{14}(1-p)^{6}$.

Since our goal is to maximize this function, we minimize $f(p) = -\left(p^{14}(1-p)^{6}\right)$. Since the logarithm is monotone increasing, we can also consider the natural logarithm ln(), that is, minimizing the function $l(p) = -\left(14\ln(p) + 6\ln(1-p)\right)$. Use gradient descent to try and find a minimizing value of p. You may do this using either the f or the l function.

a. Write down the gradient (derivative) of the function you chose. (Hint: this is probably easier for l.) (3)

b. Try to get close to the minimum in 5 gradient descent steps. Use as your initial guess p = 1/2 (the coin is fair), and for the first 3 step sizes, use 0.04, 0.02, 0.01. The last 2 step sizes you can choose for yourself. For your answer fill in the table below. (10)

Step    Step size    p
0       -            1/2
1       0.04         Fill in
2       0.02         Fill in
3       0.01         Fill in
4       Fill in      Fill in
5       Fill in      Fill in
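The procedure asked for in part b can be sketched in a few lines of Python. This is a minimal sketch for the l function only, not a model answer. It assumes the derivative l'(p) = -14/p + 6/(1-p), which follows from differentiating l term by term. The function names, the clamping bound `eps`, and the last two step sizes are hypothetical choices made for illustration. The clamping is an addition of this sketch: l is only defined for 0 < p < 1, and the raw update can leave that interval (at p = 1/2 the derivative is -16, so a step of size 0.04 would move p to 1.14).

```python
import math


def neg_log_likelihood(p):
    """l(p) = -(14 ln p + 6 ln(1 - p)), defined for 0 < p < 1."""
    return -(14 * math.log(p) + 6 * math.log(1 - p))


def grad_l(p):
    """Derivative of l: l'(p) = -14/p + 6/(1 - p)."""
    return -14 / p + 6 / (1 - p)


def gradient_descent(p0, step_sizes, eps=0.01):
    """Return [p0, p1, ...], taking one gradient step per step size.

    Clamping each iterate to [eps, 1 - eps] is an assumption of this
    sketch, not part of the exercise; it keeps p inside the domain of l.
    """
    ps = [p0]
    for step in step_sizes:
        p = ps[-1] - step * grad_l(ps[-1])
        ps.append(min(max(p, eps), 1 - eps))
    return ps


if __name__ == "__main__":
    # The first three step sizes come from the exercise; the last two
    # are hypothetical choices.
    steps = [0.04, 0.02, 0.01, 0.0005, 0.0005]
    for k, p in enumerate(gradient_descent(0.5, steps)):
        print(f"step {k}: p = {p:.4f}, l(p) = {neg_log_likelihood(p):.4f}")
```

As a sanity check on any run, setting l'(p) = 0 gives 14(1-p) = 6p, so the exact minimizer is p = 14/20 = 0.7. Iterates that settle near that value indicate the step sizes are working.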
