Boosting Concepts and Ensemble Examples in Machine Learning


Learn about boosting concepts, ensemble examples, and ensemble approaches in machine learning that improve model accuracy, mitigate bias, and manage variance. Explore how ensembles combine multiple models for better predictions through strategies such as bagging, boosting, and random forests.

  • Boosting Concepts
  • Ensemble Examples
  • Machine Learning
  • Bias-Variance
  • Model Accuracy




Presentation Transcript


  1. Ensembles, Part 2: Boosting (Geoff Hulten)

  2. Ensemble Overview. Bias/variance challenges: simple models can't represent complex concepts (bias), while complex models can overfit noise and small deltas in the data (variance). Instead of learning one model, learn several (or many) and combine them. This is an easy (low-risk) way to mitigate some bias/variance problems, and it often results in better accuracy, sometimes significantly better.

  3. Approaches to Ensembles. 1) Learn the concept many different ways and combine: prefers higher-variance, relatively low-bias base models; the models are independent of each other; robust against overfitting. Techniques: bagging, random forests, stacking. 2) Learn different parts of the concept with different models and combine: can work with high-bias base models (weak learners) as well as high-variance ones; each model depends on the previous models; may overfit. Techniques: boosting, gradient boosting machines (GBM).

  4. Another Ensemble Example. High bias, low variance: averaging many linear models fit to a target concept. [Figure: panels show the target concept, individual linear models with accuracies between 91.4% and 92.7%, and the averages of 10, 100, and 500 linear models (roughly 92.1% to 92.5%). 100 samples per model, iterations = 10k, step = 0.05.]
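
To make the averaging experiment concrete, here is a minimal numpy sketch in the spirit of this slide. The 100 samples per model, iterations = 10k, and step = 0.05 come from the slide; the synthetic dataset, the subsampling scheme, and whatever accuracies it prints are assumptions for illustration, not the slide's numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logistic(X, y, iterations=10_000, step=0.05):
    """One linear model: logistic regression by plain gradient descent,
    using the slide's iterations=10k and step=0.05."""
    w = np.zeros(X.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= step * (X.T @ (p - y)) / len(y)
    return w

def ensemble_predict(models, X):
    """Average the probability outputs of all the linear models."""
    probs = np.mean([1.0 / (1.0 + np.exp(-(X @ w))) for w in models], axis=0)
    return (probs > 0.5).astype(int)

# Assumed synthetic target concept with a mildly nonlinear boundary,
# so a single linear model is biased but averaging reduces variance.
X = rng.uniform(-1.0, 1.0, size=(5000, 2))
y = (X[:, 1] > 0.5 * np.sin(3.0 * X[:, 0])).astype(int)
X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column

for k in (10, 100, 500):  # average 10, 100, 500 linear models
    models = []
    for _ in range(k):
        idx = rng.choice(len(X), size=100, replace=False)  # 100 samples per model
        models.append(train_logistic(X[idx], y[idx]))
    acc = np.mean(ensemble_predict(models, X) == y)
    print(f"average of {k:3d} linear models: accuracy {acc:.1%}")
```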

  5. Boosting Concept. The concept we're trying to learn: y = f(x), given training data <x_i, y_i>, i = 1..n. Learn a model f_1(x) on the data. The residual y - f_1(x) is the part of the concept f_1 didn't learn, so build training data <x_i, y_i - f_1(x_i)> for the residual and learn f_2, a model that predicts the residuals. Keep going! At run time, combine f_1 and f_2 for a better estimate of y: yhat = f_1(x) + f_2(x).
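
A minimal numpy sketch of this residual-fitting idea. The synthetic y = f(x) and the polynomial base models are assumptions, not from the slide; the structure (fit f_1 to <x, y>, fit f_2 to <x, y - f_1(x)>, combine f_1 + f_2) follows the slide.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed concept we're trying to learn, y = f(x), with a little noise.
x = np.linspace(0.0, 4.0, 200)
y = np.sin(x) + 0.5 * x + rng.normal(0.0, 0.1, size=x.shape)

# f1: a simple (high-bias) model learned on <x, y> -- here just a line.
f1 = np.poly1d(np.polyfit(x, y, deg=1))

# The residual y - f1(x) is the part of the concept f1 didn't learn.
residuals = y - f1(x)

# f2: a model learned on <x, y - f1(x)>, i.e., trained to predict residuals.
f2 = np.poly1d(np.polyfit(x, residuals, deg=3))

# At run time, combine f1 and f2 for a better estimate of y.
for name, pred in [("f1 alone", f1(x)), ("f1 + f2", f1(x) + f2(x))]:
    print(f"{name}: mean squared error {np.mean((y - pred) ** 2):.4f}")
```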

  6. Boosting Sketch. Each model is trained on the mistakes (residuals) from previous rounds, so models must be trained in sequence (parallel training is not possible). The pipeline: training set -> linear model -> reweighted training set -> linear model -> reweighted training set -> ... -> ensemble prediction. Later models target the hard samples and the noise (a source of overfitting). Conceptual boosting: learn a base model; reweight the training data so mistakes get more focus; learn another base model; stop when starting to overfit (holdout set?); the final answer is a weighted vote of all the models. A skeleton of this loop follows below.
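
A sketch of that conceptual loop as a generic skeleton. The callables learn_base and reweight are hypothetical placeholders (the slide does not name them); the uniform starting weights, sequential training, holdout-based stopping, and weighted final vote follow the slide.

```python
import numpy as np

def ensemble_vote(models, votes, X):
    """Final answer is a weighted vote of all the models (labels in {0, 1})."""
    scores = sum(v * m(X) for m, v in zip(models, votes))
    return (scores > 0.5 * sum(votes)).astype(int)

def boost(X, y, X_hold, y_hold, learn_base, reweight, max_rounds=50):
    """Conceptual boosting loop. learn_base(X, y, w) -> (model, vote_weight)
    and reweight(w, model, X, y) -> new weights are caller-supplied."""
    w = np.full(len(y), 1.0 / len(y))        # samples start with uniform weight
    models, votes = [], []
    best_hold_err = np.inf
    for _ in range(max_rounds):              # must run in sequence, not parallel
        model, vote = learn_base(X, y, w)    # learn a base model on weighted data
        models.append(model)
        votes.append(vote)
        hold_err = np.mean(ensemble_vote(models, votes, X_hold) != y_hold)
        if hold_err > best_hold_err:         # starting to overfit: stop (holdout)
            models.pop()
            votes.pop()
            break
        best_hold_err = hold_err
        w = reweight(w, model, X, y)         # reweight so mistakes get more focus
        w = w / w.sum()                      # keep the weights normalized
    return models, votes
```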

  7. Reweighting Training Data. Each training sample gains a weight: <x_1, y_1, w_1>, <x_2, y_2, w_2>, ..., <x_n, y_n, w_n> (features, labels, weights). Logistic regression: the unweighted loss Loss(dataset) = (1/n) sum_{i=1..n} loss(yhat_i, y_i) becomes the weighted loss Loss(dataset) = (1/n) sum_{i=1..n} w_i * loss(yhat_i, y_i). Decision trees: similar, in the entropy calculation; Gain(S) = Entropy(count of each y in S) becomes Gain(S) = Entropy(sum of weights w_i for each y in S), so each sample contributes its weight instead of a count of 1.
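
The same idea in code: a minimal sketch of a weighted logistic-regression loss and a weighted entropy for decision trees. The helper names and toy numbers are assumptions; the formulas follow the slide (multiply each sample's loss by w_i; replace class counts with sums of weights).

```python
import numpy as np

def weighted_log_loss(y, p, w):
    """Logistic regression: multiply each sample's loss by its weight w_i."""
    per_sample = -(y * np.log(p) + (1 - y) * np.log(1.0 - p))
    return np.mean(w * per_sample)

def weighted_entropy(y, w):
    """Decision trees: replace class counts with sums of sample weights."""
    total = w.sum()
    entropy = 0.0
    for cls in np.unique(y):
        frac = w[y == cls].sum() / total   # weighted fraction of this class
        entropy -= frac * np.log2(frac)
    return entropy

# Toy numbers (assumed): upweighting sample 2 increases its influence.
y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.2, 0.8, 0.6])
print(weighted_log_loss(y, p, np.array([1.0, 1.0, 1.0, 1.0])))  # uniform
print(weighted_log_loss(y, p, np.array([0.1, 3.0, 0.1, 0.1])))  # focus sample 2
print(weighted_entropy(y, np.array([0.25, 0.25, 0.25, 0.25])))
```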

  8. AdaBoost Algorithm (r: round, k: numModels). Samples start with uniform weight. Each round: learn a model using the weightings; let eps_r be the error of model r, i.e., the total weight of the incorrectly classified training samples; set beta_r = eps_r / (1 - eps_r); multiply each sample's weight by 1 if the model got it wrong and by beta_r if it got it correct; then normalize the training set weights to sum to 1. High training error (eps_r >= 0.5) means the residuals are too hard (noisy) to make progress, so stop. In the final ensemble, each model votes based on its accuracy, with vote weight log(1/beta_r). [Plots on the slide: beta_r versus eps_r, and log(1/beta_r) versus eps_r.]
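
Putting the whole algorithm together, a self-contained AdaBoost sketch for labels in {0, 1}. The decision-stump base learner is an assumption (the slide's diagrams use linear models); the weight update, the stopping rule, and the log(1/beta) votes follow the slide.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted base learner (an assumed decision stump): pick the
    (feature, threshold, polarity) with the lowest weighted error."""
    best_err, best = np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (0, 1):
                pred = ((X[:, j] > t).astype(int)) ^ pol
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, t, pol)
    return best

def stump_predict(stump, X):
    j, t, pol = stump
    return ((X[:, j] > t).astype(int)) ^ pol

def adaboost(X, y, num_models):
    """AdaBoost sketch following the slide; labels y are in {0, 1}."""
    w = np.full(len(y), 1.0 / len(y))   # samples start with uniform weight
    stumps, betas = [], []
    for r in range(num_models):
        stump = fit_stump(X, y, w)      # learn model using the weightings
        pred = stump_predict(stump, X)
        eps = w[pred != y].sum()        # eps_r: weight of incorrect samples
        if eps == 0.0 or eps >= 0.5:    # perfect fit, or too noisy: stop
            break
        beta = eps / (1.0 - eps)
        w[pred == y] *= beta            # x beta if correct, x 1 if wrong
        w = w / w.sum()                 # normalize weights to sum to 1
        stumps.append(stump)
        betas.append(beta)
    return stumps, betas

def adaboost_predict(stumps, betas, X):
    """Final ensemble: each model votes with weight log(1/beta_r)."""
    alphas = np.log(1.0 / np.array(betas))
    scores = sum(a * stump_predict(s, X) for s, a in zip(stumps, alphas))
    return (scores > 0.5 * alphas.sum()).astype(int)
```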

  9. Example of Boosting. Four training samples <x_1, y_1> ... <x_4, y_4>, all starting with weight .25. Round 1: model f_1 gets sample 2 wrong and the others right, so eps_1 = .25 and beta_1 = .25 / (1 - .25) = .33; the correct samples' weights are multiplied by beta_1 (.25 -> .08) and all weights are renormalized (correct -> .166, sample 2 -> .50). Round 2: model f_2 again misses sample 2 (the slide lists eps_2 = .25 and beta_2 = .33, with a note asking whether eps_2 should be weighted); the correct samples go .166 -> .026 and renormalize to .045, while sample 2 ends at .86. Final vote: predict 1 if (log(1/beta_1) f_1(x) + log(1/beta_2) f_2(x)) / (log(1/beta_1) + log(1/beta_2)) > .5, else 0.
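
The round-1 arithmetic can be checked with a few lines of numpy. The sample layout is taken from the slide; note that .25 x .33 normalizes to .167, which the slide truncates to .166.

```python
import numpy as np

# Round 1 of the slide's example: four samples with uniform weight .25;
# model f1 gets sample 2 wrong and the other three right.
w = np.full(4, 0.25)
correct = np.array([True, False, True, True])

eps1 = w[~correct].sum()      # .25: total weight of the mistakes
beta1 = eps1 / (1.0 - eps1)   # .25 / .75 = .33
w[correct] *= beta1           # correct samples: .25 -> .08
w = w / w.sum()               # normalize: correct -> .167, wrong -> .50
print(eps1, round(beta1, 2), np.round(w, 3))
```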

  10. Summary. Boosting is an ensemble technique where each base model learns a part of the concept. Boosting can help with bias or variance issues, but it can overfit, so you need to control the search. A form of boosting, Gradient Boosting Machines (GBM), is currently used when chasing high accuracy in practice. Boosting can allow weak (high-bias) learners to learn complicated concepts.
