feat: add Multiple-Instance Learning (conjunctive) pooling option to MACE #1251
Enable Optional MIL Pooling in MACE (Multiple Instance Learning Extension)
This PR introduces Multiple Instance Learning (MIL) pooling as an optional graph-level readout module for MACE, using the Conjunctive Pooling operator from:
🚀 Motivation
MACE predicts atomic/total energies by summing atomic contributions.
For crystalline or periodic systems, this works well.
But for finite / defect-rich / amorphous / multi-region systems, some chemical effects:
MIL pooling complements MACE by learning:
Thus, MIL pooling acts as a residual graph-level regressor supporting MACE’s atomic baseline.
🧩 Method
MIL is added after the final interaction block.
Residual formulation:
Where:
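A minimal sketch of how a conjunctive MIL readout could sit on top of the atomic-sum baseline. It rests on assumptions not confirmed by this PR: that "conjunctive" pooling computes a per-atom attention gate and a per-atom prediction independently and combines them as an attention-weighted mean, and that `mil_gamma_cap` bounds the magnitude of the residual scale. The function names, the plain linear projections, and the `tanh` bounding are hypothetical and are not taken from `mace/modules/mil_pooling.py`.

```python
import math


def conjunctive_pool(features, w_attn, w_pred):
    """Attention-weighted mean of per-atom predictions (hypothetical sketch)."""
    # Attention gate in (0, 1), computed per atom from its feature vector.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(w_attn, f))))
        for f in features
    ]
    # Per-atom scalar prediction, computed independently of the gate.
    preds = [sum(w * x for w, x in zip(w_pred, f)) for f in features]
    # Conjunctive step: gate each prediction, then average over atoms.
    pooled = sum(g * p for g, p in zip(gates, preds)) / len(features)
    return pooled, gates


def total_energy(atomic_energies, features, w_attn, w_pred,
                 gamma_raw, gamma_cap=0.2):
    """Atomic-sum baseline plus a bounded graph-level residual (assumed form)."""
    e_base = sum(atomic_energies)             # MACE-style sum of atomic terms
    gamma = gamma_cap * math.tanh(gamma_raw)  # assumed: |gamma| <= gamma_cap
    e_mil, _ = conjunctive_pool(features, w_attn, w_pred)
    return e_base + gamma * e_mil
```

With `gamma_raw = 0` the residual vanishes and the sketch reduces to the plain atomic sum, which is the behaviour one would expect from a head that can be disabled without changing the baseline.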
✅ Performance Summary (Ag Dataset)
→ Large gain on global energy
→ Better force modeling via improved latent structure
🛠️ Usage
# Enable MIL pooling (recommended hyperparameters)
--use_mil_pooling True \
--mil_d_attn 8 \
--mil_dropout 0.1 \
--mil_gamma_cap 0.2
Disable at any time for perfect compatibility:
mil_d_attn
mil_dropout
mil_gamma_cap
When to Use MACE + MIL Pooling
The MIL branch is most beneficial when target properties:
Recommended tasks include:
Nanoclusters & finite systems
Non-periodic boundaries and shape-dependent stability
Surfaces & interfaces
Chemical activity governed by exposed sites and coordination gradients
Amorphous / glassy / heterogeneous phases
Global disorder and medium-range correlations affect energy
Point & extended defects
Vacancies, interstitials, dislocations, grain boundaries
Catalysis & multi-component systems
Active site dominance and composition-dependent reactivity
In summary, MACE+MIL excels when energy differences arise from structural heterogeneity,
where purely local additive models struggle.
Limitations
Non-extensive correction path
MIL introduces a graph-level residual that partially relaxes strict extensivity.
If the target property is fully local and additive, MIL may overfit to noise.
Hyperparameter sensitivity
Performance depends on proper tuning of the MIL head width (mil_d_attn), residual scaling (mil_gamma_cap), and normalization strategy.
Marginal gains in perfectly periodic crystals
When structural environments are highly homogeneous (ideal bulk), the MIL branch
provides little benefit and may slightly degrade extrapolation.
Compute and memory overhead
Although relatively small in our setting (+10% training time), overhead grows
with large graphs or long-range neighbor lists.
Interpretability should be used cautiously
Atom-level MIL attention highlights correlation, not causal influence;
attention maps may shift under distribution drift.
Additional testing needed for scale and robustness
Stability must be validated across different seeds, defect densities,
and dataset complexities before enabling by default.
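The non-extensive correction path listed among the limitations can be illustrated with a toy calculation. The sketch assumes the graph-level residual is a mean over atoms (an assumption, not confirmed by this PR) and uses made-up numbers: replicating a cell doubles the atomic-sum baseline, but the mean-pooled residual stays the same, so the total energy is no longer exactly twice the single-cell value.

```python
def total_energy(atomic_energies, gates, preds, gamma=0.2):
    """Atomic-sum baseline plus a mean-pooled residual (hypothetical form)."""
    base = sum(atomic_energies)  # extensive: grows with atom count
    # Mean over atoms: unchanged when the cell is replicated.
    residual = sum(g * p for g, p in zip(gates, preds)) / len(preds)
    return base + gamma * residual


# Toy two-atom cell with illustrative values only.
cell = dict(atomic_energies=[-1.0, -2.0], gates=[0.5, 0.5], preds=[2.0, 4.0])
# Replicate every per-atom list once to mimic a doubled supercell.
supercell = {key: values * 2 for key, values in cell.items()}

e_cell = total_energy(**cell)
e_supercell = total_energy(**supercell)
# Deviation from strict extensivity; equals -gamma * residual in this toy case.
extensivity_gap = e_supercell - 2 * e_cell
```

Under these assumptions the gap is bounded by the residual scale, which is one reason a small cap on that scale limits how far the model can drift from extensive behaviour.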
📦 Code Status
Affected modules:
mace/modules/models.py
mace/modules/mil_pooling.py (new)
mace/tools/model_script_utils.py
📌 Conclusion
This PR introduces a practical and interpretable global pooling capability to MACE.
It expands applicability toward structural heterogeneity while preserving the existing MACE design.
We recommend enabling MIL pooling when global geometry drives energy variability.
📚 Citation
Contributors (co-development)
Thanks everyone for the joint effort — feel free to add comments or approve!