Question: Q5. Online prediction with combinatorial expert sets (advanced). (25 points)

In this problem, we will explore the computational efficiency of online algorithms applied
to the problem of contextual prediction, i.e. prediction with side information. Consider
predicting a binary sequence $y_1, \dots, y_T \in \{0,1\}$ with side information available in the form
of a $d$-length bit string, i.e. $z_t = (z_{t,1}, \dots, z_{t,d}) \in \{0,1\}^d$, before round $t$. Let the $n$ experts
denote the set of all Boolean functions that map context $z_t$ to a prediction $x_t$, i.e. expert
$f$ is given by a Boolean function $f : \{0,1\}^d \to \{0,1\}$ and expert $f$ will predict $f(z_t)$. Also
denote the loss function of expert $f$ by $\ell_f(t) = \mathbb{I}[f(z_t) \neq y_t]$. Let $\mathcal{F}$ denote the set of all such
Boolean functions.
(a) (5 points) Show that the per-iteration computational cost of the randomized weighted majority
algorithm is at most linear in the number of experts, i.e. if the number
of experts is equal to $n$, the computational complexity per iteration is $O(n)$.
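To make the cost in part (a) concrete, here is a minimal sketch of one round of randomized weighted majority over an explicitly stored list of experts. The function name `rwm_round`, its argument layout, and the use of the multiplicative update exp(-eps * loss) are assumptions made for illustration, not something fixed by the problem statement. Each step is a single pass over the n experts.

```python
import math
import random


def rwm_round(weights, predictions, y, eps, rng=random):
    """One round of randomized weighted majority (illustrative sketch).

    weights[i]     -- current weight of expert i
    predictions[i] -- expert i's prediction in {0, 1} for this round
    y              -- the label revealed after predicting
    eps            -- learning rate (assumed multiplicative update)

    Returns (x_t, new_weights, p_t), where p_t = P[x_t = 1].
    """
    # Normalizer: one pass over all n experts.
    total = sum(weights)
    # Weight mass on experts predicting 1: a second pass over n experts.
    mass_on_one = sum(w for w, p in zip(weights, predictions) if p == 1)
    p_t = mass_on_one / total
    # Sample the randomized prediction.
    x_t = 1 if rng.random() < p_t else 0
    # Penalize every expert that erred: a third pass over n experts.
    new_weights = [w * math.exp(-eps * (p != y))
                   for w, p in zip(weights, predictions)]
    return x_t, new_weights, p_t
```

The three passes each touch every expert once, so the work per round grows linearly with the length of `weights`.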
(b) (5 points) Show that the number of experts is $n = 2^{2^d}$ in this example, implying
prohibitive computational complexity.
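As a sanity check on the count in part (b), the sketch below enumerates every Boolean function on d bits as a truth table for a few small d. The helper name `all_boolean_functions` is hypothetical. A truth table assigns one output bit to each of the 2^d contexts, which is where the double exponential comes from.

```python
from itertools import product


def all_boolean_functions(d):
    """Yield every f: {0,1}^d -> {0,1} as a dict from context to output bit."""
    contexts = list(product((0, 1), repeat=d))  # the 2^d possible contexts
    # One independent output bit per context gives 2^(2^d) truth tables.
    for outputs in product((0, 1), repeat=len(contexts)):
        yield dict(zip(contexts, outputs))


for d in range(4):
    n = sum(1 for _ in all_boolean_functions(d))
    assert n == 2 ** (2 ** d)
    print(d, n)
```

Already at d = 5 the count is 2^32, so an explicit list of experts is out of reach for any realistic context length.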
(c) (5 points) We will now show a better complexity bound for the special case of the
exponential weights algorithm. Recall that the exponential weights update is given by
$$w_f(t+1) = w_f(t) \cdot \exp(-\epsilon \ell_f(t)) = \exp(-\epsilon L_f(t))$$
for any of the experts (Boolean functions) $f$, where we defined $L_f(t) := \sum_{s=1}^{t} \ell_f(s)$ (i.e.
the cumulative loss) and the second equality takes initial weights $w_f(1) = 1$. We will show that we can implement the probability function
$p_t = P[x_t = 1]$ efficiently in this case. Without loss of generality, consider $z_t = \mathbf{1} :=
(1, 1, \dots, 1)$, i.e. the all-ones bitstring. Denote $\mathcal{F}_1$ to be the set of all Boolean functions
that predict $f(z_t) = 1$ (and $\mathcal{F}_0$ to be the set of all Boolean functions that predict
$f(z_t) = 0$). Note that $\mathcal{F} = \mathcal{F}_0 \cup \mathcal{F}_1$ and that $\mathcal{F}_0 \cap \mathcal{F}_1 = \emptyset$. Write an expression for $p_t$
in terms of sums over $\mathcal{F}_1$, $\mathcal{F}_0$, $\epsilon$, and the cumulative losses $\{L_f(t)\}_{f \in \mathcal{F}}$.
(d) (5 points) For any $z \in \{0,1\}^d$, denote $C_{x,z}(t) := \sum_{s=1,\, z_s = z}^{t} \mathbb{I}[y_s \neq x]$. Show that
$$\exp(-\epsilon L_f(t)) = \prod_{z \in \{0,1\}^d} \exp(-\epsilon C_{f(z),z}(t)).$$
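The identity in part (d) can be checked numerically for small d. The sketch below draws a random sequence of contexts and labels, builds the per-context mistake counts, and compares both sides for every Boolean function. The function name `check_factorization`, the default parameters, and the random data are illustrative assumptions; this is a numerical check, not a proof.

```python
import math
import random
from itertools import product


def check_factorization(d=2, T=20, eps=0.3, seed=0):
    """Compare exp(-eps * L_f) with the product over contexts of
    exp(-eps * C_{f(z), z}) for every Boolean function f on d bits."""
    rng = random.Random(seed)
    contexts = list(product((0, 1), repeat=d))
    zs = [rng.choice(contexts) for _ in range(T)]
    ys = [rng.randint(0, 1) for _ in range(T)]

    # C[(x, z)] = number of rounds s with z_s = z and y_s != x.
    C = {(x, z): 0 for x in (0, 1) for z in contexts}
    for z, y in zip(zs, ys):
        C[(1 - y, z)] += 1  # predicting 1 - y on context z is a mistake

    for outputs in product((0, 1), repeat=len(contexts)):
        f = dict(zip(contexts, outputs))
        # Cumulative loss of f over the whole sequence.
        L = sum(f[z] != y for z, y in zip(zs, ys))
        lhs = math.exp(-eps * L)
        rhs = math.prod(math.exp(-eps * C[(f[z], z)]) for z in contexts)
        assert math.isclose(lhs, rhs)
    return True


check_factorization()
```

The check passes because each round contributes to exactly one context's count, so grouping the rounds by context only reorders the sum in the exponent.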
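(e) (5 points) Show that
$$\sum_{f \in \mathcal{F}_1} \prod_{z \in \{0,1\}^d :\, z \neq \mathbf{1}} \exp(-\epsilon C_{f(z),z}(t)) = \sum_{f \in \mathcal{F}_0} \prod_{z \in \{0,1\}^d :\, z \neq \mathbf{1}} \exp(-\epsilon C_{f(z),z}(t)).$$
Plug this into your expression for $p_t$ to show that we can write
$$p_t = \frac{\exp(-\epsilon C_{1,\mathbf{1}}(t))}{\exp(-\epsilon C_{1,\mathbf{1}}(t)) + \exp(-\epsilon C_{0,\mathbf{1}}(t))}.$$
What is the computational complexity per iteration of this update? What about
memory (i.e. storage) complexity?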
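One way to realize the closed form in part (e) is to keep a pair of mistake counters per observed context. The sketch below assumes the counters cover the rounds already observed when the prediction is made, and it stores them in a dictionary keyed by the context tuple. The class `ContextCounterPredictor` and the reference function `brute_force_prob_one` are hypothetical names introduced here; the brute-force version enumerates all Boolean functions and is only feasible for very small d.

```python
import math
from itertools import product


class ContextCounterPredictor:
    """Exponential weights over all Boolean functions via per-context counters."""

    def __init__(self, eps):
        self.eps = eps
        self.counts = {}  # context tuple -> [C_0, C_1] mistake counts

    def prob_one(self, z):
        """P[x_t = 1] for context z, using only the two counters for z."""
        c0, c1 = self.counts.get(z, (0, 0))
        a = math.exp(-self.eps * c1)  # mass of functions with f(z) = 1
        b = math.exp(-self.eps * c0)  # mass of functions with f(z) = 0
        return a / (a + b)

    def update(self, z, y):
        """Record that predicting 1 - y on context z would have been wrong."""
        c = self.counts.setdefault(z, [0, 0])
        c[1 - y] += 1


def brute_force_prob_one(history, z, eps, d):
    """Reference computation: sum exp(-eps * L_f) over all 2^(2^d) functions."""
    contexts = list(product((0, 1), repeat=d))
    mass_one = total = 0.0
    for outputs in product((0, 1), repeat=len(contexts)):
        f = dict(zip(contexts, outputs))
        loss = sum(f[zs] != ys for zs, ys in history)
        w = math.exp(-eps * loss)
        total += w
        if f[z] == 1:
            mass_one += w
    return mass_one / total
```

Under these assumptions each round reads and updates a single counter pair, so the per-round cost is dominated by hashing the d-bit context, and the dictionary holds at most min(T, 2^d) entries. For very long sequences the exponentials can underflow, so a practical version might work with the difference of the two counts instead.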