Question: Problem 6 (ADABOOST-TYPE ALGORITHMS: ALTERNATIVE OBJECTIVE FUNCTIONS)

This problem studies boosting-type algorithms defined with objective functions different from that of AdaBoost. We assume that the training data are given as $m$ labeled examples $(x_1, y_1), \ldots, (x_m, y_m) \in \mathcal{X} \times \{-1, +1\}$. We further assume that $\Phi$ is a strictly increasing, convex, and differentiable function over $\mathbb{R}$ such that $\forall x \geq 0,\ \Phi(x) \geq 1$ and $\forall x < 0,\ \Phi(x) > 0$.

(a) Consider the loss function $L(\alpha) = \frac{1}{m} \sum_{i=1}^{m} \Phi(-y_i f(x_i))$, where $f$ is a linear combination of base classifiers, i.e., $f = \sum_{t=1}^{T} \alpha_t h_t$ (as in AdaBoost). Derive a new boosting algorithm using the objective function $L$. In particular, characterize the best base classifier $h_t$ to select at each round of boosting if we use coordinate descent.

(b) Consider the following functions: (1) zero-one loss $\Phi_1(-u) = 1_{u \leq 0}$; (2) least squared loss $\Phi_2(-u) = (1 - u)^2$; (3) SVM loss $\Phi_3(-u) = \max(0, 1 - u)$; and (4) logistic loss $\Phi_4(-u) = \log_2(1 + e^{-u})$. Which of these functions satisfy the assumptions on $\Phi$ stated earlier in this problem?

(c) For each loss function satisfying these assumptions, derive the corresponding boosting algorithm. How do the algorithms differ from AdaBoost?

(d) Noise-tolerant AdaBoost: AdaBoost may overfit significantly in the presence of noise, in part due to the high penalization of misclassified examples. To reduce this effect, we use the following loss function:

$$\Phi(-u) = \begin{cases} e^{-u} & \text{if } u > 0, \\ -u + 1 & \text{otherwise.} \end{cases}$$

Prove that this function also satisfies the assumptions on $\Phi$ stated earlier in this problem, and derive the corresponding boosting algorithm. Compare the reduction of the empirical error rate of this algorithm with that of AdaBoost.
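As a point of reference, here is a short worked sketch of the coordinate-descent step that part (a) asks to characterize. It uses only the assumptions stated above; the shorthand $f_{t-1} = \sum_{s < t} \alpha_s h_s$ for the combination built after $t-1$ rounds is mine, not part of the problem statement.

$$\left.\frac{d}{d\eta}\,\frac{1}{m} \sum_{i=1}^{m} \Phi\bigl(-y_i (f_{t-1}(x_i) + \eta\, h(x_i))\bigr)\right|_{\eta = 0} = -\frac{1}{m} \sum_{i=1}^{m} y_i h(x_i)\, \Phi'\bigl(-y_i f_{t-1}(x_i)\bigr).$$

Writing $D_t(i) \propto \Phi'(-y_i f_{t-1}(x_i))$ (non-negative since $\Phi$ is increasing) and using $y_i h(x_i) = 1 - 2 \cdot 1_{h(x_i) \neq y_i}$, this directional derivative equals $\frac{Z_t}{m}(2\epsilon_t - 1)$, where $Z_t = \sum_i \Phi'(-y_i f_{t-1}(x_i))$ and $\epsilon_t = \Pr_{i \sim D_t}[h(x_i) \neq y_i]$. The steepest descent coordinate is therefore the base classifier with the smallest weighted error under $D_t$, just as in AdaBoost, except that the weights are defined through $\Phi'$ rather than the exponential function.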
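For the algorithmic parts, a minimal numerical sketch (not the textbook's worked solution) of this coordinate-descent boosting scheme with decision stumps is given below. All helper names (phi_exp, phi_noise_tolerant, fit_stump, boost) are my own, the grid-based line search stands in for the exact choice of the step size $\alpha_t$, and only the exponential loss and the noise-tolerant loss of part (d) are wired in.

import numpy as np

def phi_exp(x):
    # AdaBoost's surrogate, phi(x) = e^x, included for comparison
    return np.exp(x)

def dphi_exp(x):
    return np.exp(x)

def phi_noise_tolerant(x):
    # Part (d) loss written in x = -u: e^x for x <= 0, x + 1 for x > 0
    # (both branches equal 1 at x = 0; the minimum avoids overflow warnings)
    return np.where(x <= 0.0, np.exp(np.minimum(x, 0.0)), x + 1.0)

def dphi_noise_tolerant(x):
    # Derivative is e^x below 0 and capped at 1 above, hence the milder
    # penalization of badly misclassified points compared with AdaBoost
    return np.where(x <= 0.0, np.exp(np.minimum(x, 0.0)), 1.0)

def fit_stump(X, y, D):
    # Exhaustively pick the decision stump (feature, threshold, sign) with the
    # smallest weighted error under the distribution D, as coordinate descent requires.
    best_err, best_stump = np.inf, None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = D[pred != y].sum()
                if err < best_err:
                    best_err, best_stump = err, (j, thr, sign)
    return best_stump

def stump_predict(stump, X):
    j, thr, sign = stump
    return sign * np.where(X[:, j] <= thr, 1, -1)

def boost(X, y, phi, dphi, rounds=50, step_grid=np.linspace(0.0, 2.0, 201)):
    # Coordinate descent on L(alpha) = (1/m) * sum_i phi(-y_i f(x_i))
    m = X.shape[0]
    f = np.zeros(m)                    # current scores f(x_i) on the sample
    ensemble = []
    for _ in range(rounds):
        w = dphi(-y * f)               # weights proportional to phi'(-y_i f(x_i))
        D = w / w.sum()
        stump = fit_stump(X, y, D)     # best base classifier under D
        h = stump_predict(stump, X)
        # crude 1-D line search over the step size along the chosen coordinate
        losses = [phi(-y * (f + a * h)).mean() for a in step_grid]
        alpha = step_grid[int(np.argmin(losses))]
        f = f + alpha * h
        ensemble.append((alpha, stump))
    return ensemble

# Toy usage: train with the noise-tolerant loss and report the training error.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
    model = boost(X, y, phi_noise_tolerant, dphi_noise_tolerant, rounds=20)
    scores = sum(a * stump_predict(s, X) for a, s in model)
    print("training error:", np.mean(np.sign(scores) != y))

Swapping in a different pair (phi, dphi) yields the variant for any loss from part (b) that meets the stated assumptions; with the exponential loss and an exact line search, the loop reduces to AdaBoost.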
