Question: Consider two coins, and write the random variable for the payoff from each coin as x(1) and x(2).

Consider two coins, and write the random variable for the payoff from each
coin as x(1) and x(2). The ground truth distribution for each coin is P(x(1)=0)=0.2, P(x(1)=1)=0.8,
P(x(2)=0)=0.6, and P(x(2)=1)=0.4. Plot the total reward of T rounds of play,
$J(T) = \sum_{i=1}^{T} x(c_i)$,
where $c_i \in \{1,2\}$ is the coin choice in round i, as the number of plays T goes from 10 to 1000, for each of the following strategies.
Keep in mind that J(T) is a random variable. The x-axis should be T and the y-axis J(T); plot everything on the
same graph so that the curves can be compared.
Explore-then-commit with $N = \lceil \ldots \rceil$ ($\lceil \cdot \rceil$ is the ceiling function, which gives you an integer).
Explore-then-commit with $N = \lceil \tfrac{1}{2} T^{2/3} (\log T)^{1/3} \rceil$, where log is the natural logarithm.
The $\epsilon$-Greedy strategy with $\epsilon = 0.2$. (With probability $1 - \epsilon$ play the currently-best coin, and with probability $\epsilon$
the other.)
The Upper Confidence Bound strategy, which plays the coin $i \in \{1,2\}$ that maximizes $\bar{x}(i) + \sqrt{\frac{2 \log T}{N(i)}}$ in each
round T.
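The strategies listed above can be simulated along the following lines. This is only a minimal sketch under stated assumptions, not a reference solution: the names `simulate`, `etc_policy`, `eps_greedy_policy`, `ucb_policy`, `etc_n`, and `best_coin` are hypothetical; N is read as the number of exploration plays per coin; the UCB index is read as the empirical mean plus sqrt(2 log t / N(i)) with t the current round; unplayed coins get an empirical mean of 0; and ties are broken at random.

```python
import math
import random

P_ONE = {1: 0.8, 2: 0.4}  # P(x(i) = 1) for each coin, from the problem statement


def best_coin(total, count, rng):
    """Coin with the higher empirical mean (unplayed = 0, random ties: assumptions)."""
    mean = {c: total[c] / count[c] if count[c] else 0.0 for c in (1, 2)}
    if mean[1] == mean[2]:
        return rng.choice((1, 2))
    return 1 if mean[1] > mean[2] else 2


def etc_n(T):
    """N = ceil(0.5 * T^(2/3) * (log T)^(1/3)), natural logarithm."""
    return math.ceil(0.5 * T ** (2 / 3) * math.log(T) ** (1 / 3))


def etc_policy(N):
    """Explore-then-commit: N plays per coin (assumption), then fix the empirical best."""
    committed = []  # filled exactly once, when exploration ends

    def choose(t, total, count, rng):
        for c in (1, 2):
            if count[c] < N:
                return c
        if not committed:
            committed.append(best_coin(total, count, rng))
        return committed[0]

    return choose


def eps_greedy_policy(eps=0.2):
    """With probability 1 - eps play the currently-best coin, otherwise the other one."""

    def choose(t, total, count, rng):
        best = best_coin(total, count, rng)
        return best if rng.random() >= eps else 3 - best

    return choose


def ucb_policy():
    """Play the coin maximizing empirical mean + sqrt(2 log t / N(i))."""

    def choose(t, total, count, rng):
        for c in (1, 2):
            if count[c] == 0:
                return c  # play each coin once so that N(i) > 0 (assumption)
        return max((1, 2), key=lambda c: total[c] / count[c]
                   + math.sqrt(2 * math.log(t) / count[c]))

    return choose


def simulate(T, choose, rng):
    """Run T rounds; return the total reward J(T) and the play count of each coin."""
    total = {1: 0, 2: 0}
    count = {1: 0, 2: 0}
    for t in range(1, T + 1):
        c = choose(t, total, count, rng)
        x = 1 if rng.random() < P_ONE[c] else 0
        total[c] += x
        count[c] += 1
    return total[1] + total[2], count


if __name__ == "__main__":
    rng = random.Random(0)
    for T in range(10, 1001, 110):
        print(T,
              simulate(T, etc_policy(etc_n(T)), rng)[0],
              simulate(T, eps_greedy_policy(0.2), rng)[0],
              simulate(T, ucb_policy(), rng)[0])
```

Each value of T is simulated from scratch because the explore-then-commit length depends on T. Since J(T) is random, averaging several independent runs per T before plotting should give smoother curves.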
Design some new plots to show the regret of these strategies and explain.
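The problem does not define regret, so the following is one common choice, stated here as an assumption: the pseudo-regret, which compares the expected payoff of the coins actually played with always playing the better coin. The means follow from the distributions given above; the symbols $\mu(i)$, $\mu^{*}$, $R(T)$, and $N_T(2)$ (the number of times coin 2 is played in T rounds) are introduced only for this sketch.

```latex
\mu(1) = 0.8, \qquad \mu(2) = 0.4, \qquad \mu^{*} = \max_{i} \mu(i) = 0.8

R(T) = T \mu^{*} - \sum_{i=1}^{T} \mu(c_i)
     = \bigl(\mu^{*} - \mu(2)\bigr) \, N_T(2)
     = 0.4 \, N_T(2)
```

Under this definition, a regret plot only needs the count of plays of coin 2 for each strategy, and a strategy that keeps playing coin 2 a constant fraction of the time has regret that grows linearly in T.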
