Orion-BiX is an advanced tabular foundation model that combines Bi-Axial Attention with Meta-Learning capabilities for few-shot tabular classification. The model extends the TabICL architecture with alternating attention patterns and episode-based training, achieving state-of-the-art performance on domain-specific benchmarks such as Healthcare and Finance.
Orion-BiX introduces the following key innovations:
- Bi-Axial Attention: Alternating attention patterns (Standard → Grouped → Hierarchical → Relational) that capture multi-scale feature interactions
- Meta-Learning: Episode-based training with k-NN support selection for few-shot learning (see the sketch after this list)
- Configurable Architecture: Flexible design supporting various attention mechanisms and training modes
- Production Ready: Memory optimization, distributed training support, and scikit-learn interface
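To make the episode-based training idea concrete, here is a minimal sketch of one way to build an episode with k-NN support selection. The helper name and episode format are illustrative assumptions, not the actual Orion-BiX training code:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_episode(X_pool: np.ndarray, y_pool: np.ndarray,
                  x_query: np.ndarray, k: int = 32):
    """Hypothetical helper: select the k nearest labelled rows as the
    in-context support set for a single query row."""
    index = NearestNeighbors(n_neighbors=k).fit(X_pool)
    _, neighbor_idx = index.kneighbors(x_query.reshape(1, -1))
    support_X = X_pool[neighbor_idx[0]]  # (k, n_features) support examples
    support_y = y_pool[neighbor_idx[0]]  # (k,) support labels
    return support_X, support_y          # fed to the model as context
```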
Orion-BiX follows a three-component architecture:
Input → Column Embedder (Set Transformer) → Bi-Axial Attention → ICL Predictor → Output
- Column Embedder: Set Transformer (inherited from TabICL) that learns the statistical distribution of each feature
- Bi-Axial Attention: Replaces standard RowInteraction with alternating attention patterns:
  - Standard Cross-Feature Attention: Direct attention between features
  - Grouped Feature Attention: Attention within feature groups
  - Hierarchical Feature Attention: Hierarchical feature patterns
  - Relational Feature Attention: Full feature-to-feature attention
  - CLS Token Aggregation: Multiple CLS tokens (default: 4) for feature summarization
- ICL Predictor (tf_icl): In-context learning module for few-shot prediction
Each BiAxialAttentionBlock applies the four attention patterns in sequence, followed by CLS aggregation:
Standard → Grouped → Hierarchical → Relational → CLS Aggregation
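A minimal PyTorch sketch of such a block is shown below. The class name mirrors the description above, but the internals are simplifying assumptions rather than the actual Orion-BiX implementation: the standard, hierarchical, and relational passes are all shown as full self-attention, and the grouped pass restricts attention to contiguous feature groups by folding groups into the batch dimension.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiAxialAttentionBlock(nn.Module):
    """Illustrative sketch of four alternating attention passes."""

    def __init__(self, d_model: int = 128, n_heads: int = 4, n_groups: int = 4):
        super().__init__()
        # One self-attention module (and layer norm) per pattern:
        # standard, grouped, hierarchical, relational.
        self.attns = nn.ModuleList(
            nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            for _ in range(4)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(4))
        self.n_groups = n_groups

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, d_model), one token per feature
        for i, (attn, norm) in enumerate(zip(self.attns, self.norms)):
            if i == 1:
                # Grouped pass: pad features to a multiple of n_groups,
                # then fold groups into the batch dimension so attention
                # only sees tokens within the same group.
                b, f, d = x.shape
                pad = (-f) % self.n_groups
                xg = F.pad(x, (0, 0, 0, pad)).reshape(b * self.n_groups, -1, d)
                out, _ = attn(xg, xg, xg)
                out = out.reshape(b, f + pad, d)[:, :f]
            else:
                # Standard / hierarchical / relational passes, shown here
                # as full self-attention for brevity.
                out, _ = attn(x, x, x)
            x = norm(x + out)  # residual connection + layer norm
        return x

# Quick shape check
x = torch.randn(8, 12, 128)              # 8 rows, 12 features, d_model=128
print(BiAxialAttentionBlock()(x).shape)  # torch.Size([8, 12, 128])
```

Folding groups into the batch dimension is a common way to restrict attention to within-group tokens without building custom attention masks.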
- Python 3.9-3.12
- PyTorch 2.2+ (with CUDA support recommended)
- CUDA-capable GPU (recommended for training)
To install from source:

```bash
git clone https://github.com/Lexsi-Labs/Orion-BiX.git
cd orion-bix
pip install -e .
```

Or install directly from GitHub:

```bash
pip install git+https://github.com/Lexsi-Labs/Orion-BiX.git
```

Orion-BiX provides a scikit-learn compatible interface for easy integration:
```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from orion_bix.sklearn import OrionBixClassifier

# Example data: any numeric/categorical tabular dataset works
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Initialize and fit the classifier (prepares data transformations)
clf = OrionBixClassifier()
clf.fit(X_train, y_train)

# Make predictions
predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
```

Orion-BiX includes automatic preprocessing that handles:
- Categorical Encoding: Automatically encodes categorical features using ordinal encoding
- Missing Value Imputation: Handles missing values using median imputation for numerical features
- Feature Normalization: Supports multiple normalization methods:
"none": No normalization"power": Yeo-Johnson power transform"quantile": Quantile transformation to normal distribution"quantile_rtdl": RTDL-style quantile transform"robust": Robust scaling using median and quantiles
- Outlier Handling: Clips outliers beyond a specified Z-score threshold (default: 4.0)
- Feature Permutation: Applies systematic feature shuffling for ensemble diversity:
"none": Original feature order"shift": Circular shifting"random": Random permutation"latin": Latin square patterns (recommended)
The preprocessing is automatically applied during fit() and predict(), so no manual preprocessing is required.
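As a hypothetical illustration of how these options might be passed to the classifier (the keyword-argument names below are assumptions; check the OrionBixClassifier documentation for the actual parameters):

```python
from orion_bix.sklearn import OrionBixClassifier

# NOTE: parameter names are illustrative assumptions, not the
# confirmed OrionBixClassifier signature.
clf = OrionBixClassifier(
    norm_method="power",      # "none" | "power" | "quantile" | "quantile_rtdl" | "robust"
    outlier_threshold=4.0,    # clip values beyond this Z-score (default: 4.0)
    feature_shuffle="latin",  # "none" | "shift" | "random" | "latin" (recommended)
)
```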
If you use Orion-BiX in your research, please cite our paper:
```bibtex
@article{bouadi2025orionbix,
  title={Orion-BiX: Bi-Axial Attention for Tabular In-Context Learning},
  author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
  year={2025},
  eprint={2512.00181},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.00181},
}
```

This project is released under the MIT License. See LICENSE for details.
For questions, issues, or contributions, please open an issue or submit a pull request on the GitHub repository.
Orion-BiX is built on top of TabICL, a tabular foundation model for in-context learning. We gratefully acknowledge the TabICL authors for their foundational work and for making their codebase publicly available.