Bagging with Decision Trees  Python Automation and Machine Learning for ICs   An Online Book  

Python Automation and Machine Learning for ICs http://www.globalsino.com/ICs/  


=================================================================================
A base learning algorithm (e.g., a decision tree or a neural network) is trained independently on each bootstrap sample. Each decision tree is therefore likely to differ slightly because of the variation introduced by bootstrapping. One popular implementation of bagging with decision trees is the Random Forest algorithm. In Random Forest, each tree is trained on a random subset of features as well as a random subset of the data, adding an extra layer of diversity to the ensemble.

Figure 3739 shows bagging in decision trees. In this example, the "Original" dataset contains 100 data points, and each "Bootstrap Sample" draws the same number of data points as the original dataset. Because bootstrap sampling is done with replacement, some data points may be repeated within a particular bootstrap sample. In the current case there are 5 bootstrap samples, each with 100 data points, so the total number of data points used across all bootstrap samples is 5 * 100 = 500. In the figure, each bootstrap sample is used to train a single decision tree; that is, 5 bootstrap samples yield 5 decision trees.

Figure 3739. Bagging in decision trees (Code).
============================================
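The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind Figure 3739: it assumes scikit-learn and NumPy are available, fabricates a 100-point dataset with `make_classification`, draws 5 bootstrap samples with replacement, trains one decision tree per sample, and aggregates predictions by majority vote. The final lines show the Random Forest variant, where `max_features="sqrt"` adds the per-split random feature subsets mentioned in the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# "Original" dataset: 100 data points (as in the Figure 3739 example)
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

n_estimators = 5  # 5 bootstrap samples -> 5 decision trees
trees = []
for _ in range(n_estimators):
    # Sample 100 indices WITH replacement: some points repeat, some are left out
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Bagging aggregation: majority vote across the 5 trees
preds = np.stack([t.predict(X) for t in trees])       # shape (5, 100)
majority = (preds.mean(axis=0) >= 0.5).astype(int)
print("ensemble training accuracy:", (majority == y).mean())

# Random Forest does the same bootstrapping internally, plus
# a random feature subset at every split (max_features)
rf = RandomForestClassifier(n_estimators=5, max_features="sqrt",
                            random_state=0).fit(X, y)
print("random forest training accuracy:", rf.score(X, y))
```

In practice one would use far more than 5 trees; the small count here simply mirrors the figure's example of 5 bootstrap samples of 100 points each (500 total draws).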



