Question: Random Forest Motivation

Ensemble learning is a general technique to combat overfitting by combining the predictions of many varied models into a single prediction based on their average or majority vote.

(a) The motivation of averaging. Consider a set of uncorrelated random variables {Y_i}_{i=1}^n, each with mean μ and variance σ². Calculate the expectation and variance of their average. (In the context of ensemble methods, these Y_i's are analogous to the prediction made by classifier i.)

(b) In part (a), we see that averaging reduces variance for uncorrelated classifiers. Real-world predictions will of course not be completely uncorrelated, but reducing correlation among decision trees will generally reduce the final variance. Now consider a set of correlated random variables {Z_i}_{i=1}^n, each with mean μ and variance σ², where each Z_i ∈ ℝ is a scalar. Suppose that for all i ≠ j, Corr(Z_i, Z_j) = ρ. (If you don't remember the relationship between correlation and covariance from your prerequisite classes, please look it up.) Calculate the variance of the average of the random variables Z_i, written in terms of ρ, σ, and n. What happens when n gets very large, and what does that tell us about the potential effectiveness of averaging?
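Although the exercise asks for a derivation, the two answers it leads to — Var(Ȳ) = σ²/n for the uncorrelated case, and Var(Z̄) = ρσ² + (1 − ρ)σ²/n for the equicorrelated case — can be verified numerically. The sketch below builds the covariance matrix implied by the problem and computes the variance of the average directly; the particular values of n, σ, and ρ are arbitrary illustrations, not part of the problem statement.

```python
import numpy as np

# Illustrative parameters (arbitrary; not specified by the problem).
n, sigma, rho = 50, 2.0, 0.3

# Part (a): uncorrelated case. The average of n uncorrelated variables
# with variance sigma^2 has variance sigma^2 / n.
var_uncorrelated = sigma**2 / n

# Part (b): equicorrelated case. Cov(Z_i, Z_j) = Corr(Z_i, Z_j) * sigma^2,
# so the covariance matrix has sigma^2 on the diagonal and rho * sigma^2
# everywhere off the diagonal.
cov = np.full((n, n), rho * sigma**2)
np.fill_diagonal(cov, sigma**2)

# Variance of the average: w^T Cov w with equal weights w = (1/n, ..., 1/n).
w = np.full(n, 1.0 / n)
var_avg = w @ cov @ w

# Closed form the derivation arrives at: rho*sigma^2 + (1 - rho)*sigma^2 / n.
closed_form = rho * sigma**2 + (1 - rho) * sigma**2 / n
assert np.isclose(var_avg, closed_form)

print(var_uncorrelated, var_avg, rho * sigma**2)
```

As n grows, the (1 − ρ)σ²/n term vanishes and the variance approaches ρσ²: correlation between classifiers puts a floor on how much averaging can help, which is exactly why random forests inject randomness (bagging, random feature subsets) to decorrelate their trees.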
