Bagging
This is legacy Apache Ignite documentation. The new documentation is hosted here: https://ignite.apache.org/docs/latest/
Bagging stands for bootstrap aggregation. One way to reduce the variance of an estimate is to average together multiple estimates. For example, we can train M different trees on different subsets of the data (chosen randomly with replacement) and compute the ensemble prediction as the average of the individual predictions:

f(x) = (1/M) · Σ_{m=1}^{M} f_m(x)
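To see why averaging reduces variance, consider the textbook argument (a sketch, assuming the M estimates are uncorrelated and share a common variance σ²):

```latex
% Variance of the average of M uncorrelated estimates f_m(x),
% each with variance \sigma^2:
\operatorname{Var}\left(\frac{1}{M}\sum_{m=1}^{M} f_m(x)\right)
  = \frac{1}{M^2}\sum_{m=1}^{M}\operatorname{Var}\left(f_m(x)\right)
  = \frac{\sigma^2}{M}
```

In practice, bootstrap samples overlap, so the estimates are correlated and the reduction is smaller; sampling random feature subspaces (the feature subspace dimensionality parameter in the example below) helps decorrelate the base learners.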
Bagging uses bootstrap sampling to obtain the data subsets for training the base learners. For aggregating the outputs of base learners, bagging uses voting for classification and averaging for regression.
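As a toy illustration of these two aggregation rules (plain Java; `ToyAggregators` and its methods are made up for this sketch and are not part of Ignite's aggregator API):

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

/** Toy sketch of the two bagging aggregation rules. */
public class ToyAggregators {
    /** Classification: return the label predicted by the most base learners. */
    static double majorityVote(double[] votes) {
        return Arrays.stream(votes).boxed()
            // Count how many base learners voted for each label.
            .collect(Collectors.groupingBy(v -> v, Collectors.counting()))
            .entrySet().stream()
            // Pick the label with the highest vote count (input must be non-empty).
            .max(Map.Entry.comparingByValue())
            .get().getKey();
    }

    /** Regression: return the mean of the base learners' predictions. */
    static double average(double[] predictions) {
        return Arrays.stream(predictions).average().orElse(Double.NaN);
    }
}
```

In Ignite ML, bagging on top of an arbitrary trainer is set up via TrainerTransformers.makeBagged, as in the following example: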
// Imports used by this snippet (package layout as of Ignite 2.8):
import org.apache.ignite.ml.composition.bagging.BaggedModel;
import org.apache.ignite.ml.composition.bagging.BaggedTrainer;
import org.apache.ignite.ml.composition.predictionsaggregator.OnMajorityPredictionsAggregator;
import org.apache.ignite.ml.environment.LearningEnvironmentBuilder;
import org.apache.ignite.ml.trainers.TrainerTransformers;
import org.apache.ignite.ml.tree.DecisionTreeClassificationTrainer;

// Define the weak classifier: a decision tree with max depth 5
// and zero minimal impurity decrease.
DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(5, 0);

// Set up the bagging process.
BaggedTrainer<Double> baggedTrainer = TrainerTransformers.makeBagged(
    trainer, // Underlying trainer to bag
    10, // Ensemble size (number of base learners)
    0.6, // Subsample ratio relative to the whole dataset
    4, // Feature vector dimensionality
    3, // Feature subspace dimensionality
    new OnMajorityPredictionsAggregator()) // Majority voting for classification
    .withEnvironmentBuilder(LearningEnvironmentBuilder
        .defaultBuilder()
        .withRNGSeed(1) // Fix the seed for reproducibility
    );

// Train the bagged model.
BaggedModel mdl = baggedTrainer.fit(
    ignite, // Ignite instance
    dataCache, // Cache with the training data
    vectorizer // Extracts features and label from cache entries
);
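Once trained, the model can be used for inference like any other Ignite model. A minimal sketch (the feature values below are purely illustrative):

```java
import org.apache.ignite.ml.math.primitives.vector.Vector;
import org.apache.ignite.ml.math.primitives.vector.VectorUtils;

// Predict the class label for a single (made-up) observation
// with the 4 features the trainer above was configured for.
Vector observation = VectorUtils.of(1.0, 0.0, 42.0, 7.25);
double prediction = mdl.predict(observation);
```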
The most popular case is a forest of randomized trees, a commonly used class of ensemble algorithms in which the base learners are decision trees.
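Ignite ML ships a dedicated trainer for this case. Below is a minimal sketch, assuming the RandomForestClassifierTrainer API from org.apache.ignite.ml.tree.randomforest as of Ignite 2.8 (the FeatureMeta constructor arguments and builder method names reflect that version; the parameter values are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.apache.ignite.ml.dataset.feature.FeatureMeta;
import org.apache.ignite.ml.tree.randomforest.RandomForestClassifierTrainer;

// Describe the 4 numerical features of the dataset
// (name, feature index, isCategorical).
List<FeatureMeta> meta = IntStream.range(0, 4)
    .mapToObj(i -> new FeatureMeta("f" + i, i, false))
    .collect(Collectors.toList());

// A forest of 10 randomized trees of depth at most 5.
RandomForestClassifierTrainer rfTrainer = new RandomForestClassifierTrainer(meta)
    .withAmountOfTrees(10)
    .withMaxDepth(5)
    .withSeed(1);
```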
Example
The full example can be found as part of the Titanic tutorial here.