Question: 7.2 Alternative objective functions. This problem studies boosting-type algorithms defined with objective functions different from that of AdaBoost. We assume that the training data are given as $m$ labeled examples $(x_1, y_1), \ldots, (x_m, y_m) \in \mathcal{X} \times \{-1, +1\}$. We further assume that $\Phi$ is a strictly increasing, convex, and differentiable function over $\mathbb{R}$ such that: $\forall x \geq 0, \Phi(x) \geq 1$ and $\forall x < 0, \Phi(x) > 0$.
(a) Consider the loss function $L(\alpha) = \sum_{i=1}^{m} \Phi(-y_i f(x_i))$, where $f$ is a linear combination of base classifiers, i.e., $f = \sum_{t=1}^{T} \alpha_t h_t$ (as in AdaBoost). Derive a new boosting algorithm using the objective function $L$. In particular, characterize the best base classifier $h_u$ to select at each round of boosting if we use coordinate descent.
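Before the derivation, it helps to write out the quantity coordinate descent actually optimizes at each round. The following is a sketch of the standard calculation from the definitions above, not a claimed solution: writing $e_u$ for the coordinate direction of base classifier $h_u$,

$$
\left.\frac{d}{d\eta}\, L(\alpha + \eta e_u)\right|_{\eta = 0}
= -\sum_{i=1}^{m} y_i\, h_u(x_i)\, \Phi'\!\bigl(-y_i f(x_i)\bigr),
$$

so the direction of steepest decrease is the $h_u$ maximizing $\sum_{i} y_i h_u(x_i)\, \Phi'(-y_i f(x_i))$, i.e., the base classifier with the smallest weighted error under the distribution $D(i) \propto \Phi'(-y_i f(x_i))$, whose weights are nonnegative since $\Phi$ is increasing.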
(b) Consider the following functions: (1) zero-one loss, $\Phi_1(-u) = 1_{u \leq 0}$; (2) least squared loss, $\Phi_2(-u) = (1 - u)^2$; (3) SVM loss, $\Phi_3(-u) = \max\{0, 1 - u\}$; and (4) logistic loss, $\Phi_4(-u) = \log(1 + e^{-u})$. Which functions satisfy the assumptions on $\Phi$ stated earlier in this problem?
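One way to get oriented on part (b) is to sanity-check the four candidates against the stated conditions numerically; the sketch below (names and grid are ours) writes each loss as $\Phi(x)$ with $x = -u$. It assumes the logistic loss is taken with a base-2 logarithm so that $\Phi_4(0) = 1$; with the natural log, the condition $\Phi(x) \geq 1$ for $x \geq 0$ would already fail at $x = 0$. Grid checks are only necessary-condition tests: convexity and differentiability still need an analytic argument (e.g., the hinge is not differentiable at $x = -1$).

```python
import numpy as np

# Each candidate written as Phi(x), i.e. the loss evaluated at x = -u.
candidates = {
    "zero-one": lambda x: (x >= 0).astype(float),    # Phi1(x) = 1_{x >= 0}
    "squared":  lambda x: (1.0 + x) ** 2,            # Phi2(x) = (1 + x)^2
    "hinge":    lambda x: np.maximum(0.0, 1.0 + x),  # Phi3(x) = max{0, 1 + x}
    "logistic": lambda x: np.log2(1.0 + np.exp(x)),  # Phi4(x) = log2(1 + e^x), base-2 assumed
}

xs = np.linspace(-5.0, 5.0, 4001)
for name, Phi in candidates.items():
    v = Phi(xs)
    checks = {
        "strictly increasing": np.all(np.diff(v) > 0),
        "convex (grid test)":  np.all(v[:-2] + v[2:] >= 2 * v[1:-1] - 1e-12),
        "Phi >= 1 on x >= 0":  np.all(v[xs >= 0] >= 1.0),
        "Phi > 0 on x < 0":    np.all(v[xs < 0] > 0.0),
    }
    print(name, {k: bool(b) for k, b in checks.items()})
```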
(c) For each loss function satisfying these assumptions, derive the corresponding boosting algorithm. How do the algorithm(s) differ from AdaBoost?
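A compact way to compare the resulting algorithms is to implement the generic coordinate-descent loop once, with $\Phi$ and $\Phi'$ as parameters. The sketch below is ours, not the book's: the names (`boost`, `phi`, `dphi`, the stump pool) are hypothetical, and the bounded numeric line search stands in for whatever step-size rule a full derivation gives; AdaBoost's closed form $\eta = \frac{1}{2}\log\frac{1-\epsilon}{\epsilon}$ is the special case $\Phi(x) = e^x$.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def boost(X, y, pool, phi, dphi, T=20):
    """Coordinate-descent boosting for L = sum_i Phi(-y_i f(x_i)).
    `pool` is a finite set of base classifiers h: X -> {-1, +1} and
    `phi`, `dphi` are Phi and its derivative.  All names illustrative."""
    f = np.zeros(len(y))                    # current ensemble scores f(x_i)
    H = np.array([h(X) for h in pool])      # precomputed base predictions
    model = []
    for _ in range(T):
        # Weights induced by the loss; nonnegative because Phi is increasing.
        w = dphi(-y * f)
        # Best coordinate: the h maximizing sum_i w_i y_i h(x_i), i.e. the
        # base classifier with smallest weighted error under D(i) ~ w_i.
        u = int(np.argmax(H @ (w * y)))
        # Convex one-dimensional line search for the step along h_u; the
        # bound keeps eta finite if some h separates the data perfectly.
        eta = minimize_scalar(lambda e: phi(-y * (f + e * H[u])).sum(),
                              bounds=(0.0, 10.0), method="bounded").x
        f = f + eta * H[u]
        model.append((eta, u))
    return model

if __name__ == "__main__":
    # Tiny demo: 1-D points with threshold stumps (purely illustrative).
    # Phi(x) = e^x recovers AdaBoost's exponential objective.
    Xd = np.array([-2.0, -1.0, 0.5, 2.0])
    yd = np.array([-1.0, -1.0, 1.0, 1.0])
    pool = [lambda X, s=s: np.where(X > s, 1.0, -1.0) for s in (-1.5, 0.0, 1.0)]
    print(boost(Xd, yd, pool, np.exp, np.exp, T=5))
```

The comparison with AdaBoost then comes down to two ingredients: which weight function $\Phi'$ replaces the exponential weights $e^{-y_i f(x_i)}$, and whether the line search admits a closed form or must be done numerically as above.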
