Why 100 Mediocre Trees Beat One Brilliant One
Random Forests achieve superior accuracy by averaging many diverse, imperfect decision trees, mirroring how roughly 800 fairgoers' guesses of an ox's weight landed within 1% of the truth.
Crowd Wisdom Drives Random Forest Accuracy
In 1906, Francis Galton observed a weight-judging contest at a country fair where roughly 800 non-experts guessed the weight of an ox. Galton took the middlemost of their entries: 1,207 pounds against the true 1,198 pounds, an error under 1% that beat the vast majority of individual guesses. This 'wisdom of crowds' principle underpins Random Forests: deliberately introduced randomness creates diverse decision trees, each mediocre on its own but collectively robust, because their uncorrelated errors cancel out.
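A quick simulation makes the cancellation concrete. This is an illustrative sketch, not Galton's actual data: the true weight echoes his anecdote, but the crowd size and the spread of individual errors are assumptions chosen for the demo.

```python
# Illustrative simulation: many noisy, unbiased guesses average out.
# true_weight echoes Galton's ox; n_guessers and noise_sd are assumptions.
import numpy as np

rng = np.random.default_rng(0)

true_weight = 1198   # pounds (the ox's actual weight)
n_guessers = 800     # assumed crowd size
noise_sd = 100       # assumed spread of individual errors, in pounds

# Each guess is the truth plus independent, zero-mean noise.
guesses = true_weight + rng.normal(0, noise_sd, size=n_guessers)

individual_err = np.mean(np.abs(guesses - true_weight))
crowd_err = abs(guesses.mean() - true_weight)

print(f"typical individual error: {individual_err:.0f} lb")
print(f"error of the crowd average: {crowd_err:.1f} lb")
# Uncorrelated errors shrink roughly as noise_sd / sqrt(n_guessers):
print(f"theory predicts ~{noise_sd / np.sqrt(n_guessers):.1f} lb")
```

The key assumption is independence: averaging only helps when the guessers' errors are uncorrelated, which is exactly the property the "Random" in Random Forest is engineered to create.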
Randomness as an Engineering Choice
The 'Random' in Random Forest isn't haphazard; it's engineered to replicate the crowd's diversity. A single 'brilliant' tree, grown deep on the full dataset, tends to overfit quirks of its training data. An ensemble of 100+ randomized trees, each trained on a bootstrapped sample of the rows and restricted to a random subset of features at every split, aggregates into reliable predictions (averaging for regression, majority vote for classification). This counterintuitive trade, favoring quantity of imperfect models over the perfection of one, is among machine learning's most practical ideas for regression and classification tasks.
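To make the two randomness knobs concrete, here is a minimal sketch using scikit-learn. The synthetic dataset and hyperparameters are illustrative assumptions, not a benchmark; the point is only where the bootstrapping and feature subsetting show up in code.

```python
# Minimal sketch of the two sources of randomness, assuming scikit-learn.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data; sizes and noise level are arbitrary choices.
X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

# One deep, 'brilliant' tree: fits training quirks, generalizes worse.
single_tree = DecisionTreeRegressor(random_state=0)

# 100 randomized trees: bootstrap=True resamples rows for each tree,
# max_features="sqrt" offers each split only a random feature subset.
forest = RandomForestRegressor(
    n_estimators=100, bootstrap=True, max_features="sqrt", random_state=0
)

for name, model in [("single tree", single_tree), ("random forest", forest)]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.3f}")
```

Both knobs serve the same goal: decorrelating the trees' errors so that, like the fairgoers' guesses, their mistakes cancel when aggregated.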