

Question: Q5. Online prediction with combinatorial expert sets (advanced) (25 points)

In this problem, we will explore the computational efficiency of online algorithms applied to the problem of contextual prediction, i.e. prediction with side information. Consider predicting a binary sequence $y_1, \dots, y_T \in \{0, 1\}$ with side information available in the form of a $d$-length bit string, i.e. $z_t = (z_{t,1}, \dots, z_{t,d}) \in \{0, 1\}^d$, before round $t$. Let the $n$ experts denote the set of all Boolean functions that map context $z_t$ to a prediction $x_t$, i.e. expert $f$ is given by a Boolean function $f : \{0, 1\}^d \to \{0, 1\}$, and expert $f$ will predict $f(z_t)$. Also denote the loss function of expert $f$ by $l_f^{(t)} = \mathbb{I}[f(z_t) \neq y_t]$. Let $F$ denote the set of all such Boolean functions.

(a) (5 points) Show that the computational cost of the randomized weighted majority algorithm is at most linear in the number of experts per iteration, i.e. if the number of experts is equal to $n$, the computational complexity per iteration is $O(n)$.
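To make the per-round cost in part (a) concrete, here is a minimal Python sketch of one round of randomized weighted majority. The function name `rwm_round`, its arguments, and the choice of penalty factor `exp(-epsilon)` for a mistaken expert are illustrative assumptions, not taken from the problem statement. Each step is a single pass over the `n` experts.

```python
import math
import random


def rwm_round(weights, predictions, outcome, epsilon, rng):
    """One round of randomized weighted majority over n experts (a sketch).

    weights:     current weight of each expert (length n)
    predictions: each expert's 0/1 prediction for this round (length n)
    outcome:     the revealed label y_t in {0, 1}
    Returns (probability of predicting 1, sampled prediction, new weights).
    """
    total = sum(weights)  # one pass over the experts
    # Probability that the sampled expert predicts 1.
    prob_one = sum(w for w, p in zip(weights, predictions) if p == 1) / total
    # Sample an expert with probability proportional to its weight.
    chosen = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    prediction = predictions[chosen]
    # Multiplicative update: shrink the weight of every expert that erred.
    new_weights = [
        w * math.exp(-epsilon) if p != outcome else w
        for w, p in zip(weights, predictions)
    ]
    return prob_one, prediction, new_weights
```

Nothing here does more than a constant number of passes over the weight list, which is the structure the question asks you to argue about.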

(b) (5 points) Show that the number of experts is $n = 2^{2^d}$ in this example, implying prohibitive computational complexity.
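For small $d$ the expert set in part (b) can simply be enumerated. The sketch below represents each Boolean function as a truth table; the helper name `all_boolean_functions` is hypothetical. It is only a sanity check on the count, not a proof.

```python
from itertools import product


def all_boolean_functions(d):
    """Yield every Boolean function on {0,1}^d as a dict: context -> bit."""
    contexts = list(product((0, 1), repeat=d))  # the 2^d possible contexts
    # A function is one output bit per context, so enumerate all bit vectors
    # of length 2^d.
    for outputs in product((0, 1), repeat=len(contexts)):
        yield dict(zip(contexts, outputs))


for d in range(1, 4):
    n = sum(1 for _ in all_boolean_functions(d))
    print(f"d={d}: enumerated {n} experts, formula gives {2 ** (2 ** d)}")
```

Already at `d = 5` the enumeration would have to visit more than four billion functions, so it is only practical for very small contexts.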

(c) (5 points) We will now show a better complexity bound for the special case of the exponential weights algorithm. Recall that the exponential weights update is given by

$$w_f^{(t+1)} = w_f^{(t)} \cdot \exp(-\epsilon\, l_f^{(t)}) = \exp(-\epsilon\, L_f^{(t)})$$

for any of the experts (Boolean functions) $f$, where we defined $L_f^{(t)} = \sum_{s=1}^{t} l_f^{(s)}$ (i.e. the cumulative loss); the second equality assumes unit initial weights $w_f^{(1)} = 1$. We will show that we can implement the probability function $p_t = P[x_t = 1]$ efficiently in this case. Without loss of generality, consider $z_t = \mathbf{1} = (1, 1, \dots, 1)$, i.e. the all-ones bitstring. Denote by $F_1$ the set of all Boolean functions that predict $f(z_t) = 1$ (and by $F_0$ the set of all Boolean functions that predict $f(z_t) = 0$). Note that $F = F_0 \cup F_1$ and that $F_0 \cap F_1 = \emptyset$. Write an expression for $p_t$ in terms of sums over $F_1$, $F_0$, $\epsilon$, and the cumulative losses $\{L_f^{(t)}\}_{f \in F}$.

(d) (5 points) For any $z \in \{0, 1\}^d$, denote $C_{x,z}^{(t)} = \sum_{s=1,\, z_s = z}^{t} \mathbb{I}[y_s \neq x]$. Show that

$$\exp(-\epsilon\, L_f^{(t)}) = \prod_{z \in \{0, 1\}^d} \exp(-\epsilon\, C_{f(z), z}^{(t)}).$$
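The identity in part (d) can be checked numerically on a random history before attempting the proof. In this sketch the history is a list of `(context, label)` pairs and an expert is a truth table; the helper names and the random test setup are assumptions made for illustration.

```python
import math
import random
from itertools import product


def cumulative_loss(f, history):
    """L_f: number of rounds on which expert f mispredicted."""
    return sum(1 for z, y in history if f[z] != y)


def mistake_count(x, z, history):
    """C_{x,z}: rounds with context z whose label differs from x."""
    return sum(1 for zs, ys in history if zs == z and ys != x)


def factorization_gap(d, T, epsilon, seed=0):
    """|lhs - rhs| of the part (d) identity for one random expert and history."""
    rng = random.Random(seed)
    contexts = list(product((0, 1), repeat=d))
    history = [(rng.choice(contexts), rng.randint(0, 1)) for _ in range(T)]
    f = {z: rng.randint(0, 1) for z in contexts}  # a random Boolean function
    lhs = math.exp(-epsilon * cumulative_loss(f, history))
    rhs = math.prod(
        math.exp(-epsilon * mistake_count(f[z], z, history)) for z in contexts
    )
    return abs(lhs - rhs)


print(factorization_gap(d=3, T=50, epsilon=0.3))
```

The gap should be zero up to floating-point rounding, because grouping the rounds by their context splits the cumulative loss into one count per context.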

(e) (5 points) Show that

$$\sum_{f \in F_1} \prod_{z \in \{0, 1\}^d :\, z \neq \mathbf{1}} \exp(-\epsilon\, C_{f(z), z}^{(t)}) = \sum_{f \in F_0} \prod_{z \in \{0, 1\}^d :\, z \neq \mathbf{1}} \exp(-\epsilon\, C_{f(z), z}^{(t)}).$$

Plug this into your expression for $p_t$ to show that we can write

$$p_t = \frac{\exp(-\epsilon\, C_{1, \mathbf{1}}^{(t)})}{\exp(-\epsilon\, C_{1, \mathbf{1}}^{(t)}) + \exp(-\epsilon\, C_{0, \mathbf{1}}^{(t)})}.$$

What is the computational complexity per iteration of this update? What about memory (i.e. storage) complexity?
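One way the closed form in part (e) might be implemented is sketched below, next to a brute-force version that sums over all Boolean functions for comparison. The class `CountingPredictor`, the function `brute_force_prob_one`, and the dictionary-of-counts layout are illustrative assumptions. The formula is applied to a general context `z` rather than only the all-ones string, which is what the "without loss of generality" step permits.

```python
import math
from itertools import product


class CountingPredictor:
    """Exponential weights over all Boolean functions via per-context counts."""

    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.counts = {}  # context z -> [C_{0,z}, C_{1,z}]

    def prob_one(self, z):
        """p_t for context z, using only the two counts stored for z."""
        c0, c1 = self.counts.get(z, (0, 0))
        a = math.exp(-self.epsilon * c1)  # weight mass behind predicting 1
        b = math.exp(-self.epsilon * c0)  # weight mass behind predicting 0
        return a / (a + b)

    def update(self, z, y):
        """Observing label y on context z is a mistake for prediction 1 - y."""
        self.counts.setdefault(z, [0, 0])[1 - y] += 1


def brute_force_prob_one(z, history, d, epsilon):
    """Same probability computed by summing over all 2^(2^d) experts."""
    contexts = list(product((0, 1), repeat=d))
    num = den = 0.0
    for outputs in product((0, 1), repeat=len(contexts)):
        f = dict(zip(contexts, outputs))
        w = math.exp(-epsilon * sum(1 for zs, ys in history if f[zs] != ys))
        den += w
        if f[z] == 1:
            num += w
    return num / den
```

In this sketch each round reads or writes a single dictionary entry, and the dictionary holds at most one entry per distinct context seen so far. For very long horizons the two exponentials can underflow, so a log-domain version of `prob_one` would be safer.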
