ORLax is an extensible, research-friendly offline reinforcement learning framework built with JAX, Flax, and Optax. It provides clean, typed APIs optimized for editor autocompletion, modular algorithm implementations, and production-ready features like WandB logging and GPU acceleration.
- 🔥 Modern JAX Stack: Built on JAX, Flax, and Optax for high-performance GPU/TPU training
- 📦 Modular Design: Clean separation of concerns with pluggable algorithms, models, and datasets
- 🎯 Type-Safe: Comprehensive type hints with dataclasses instead of dict-heavy patterns
- 📊 Built-in Logging: WandB integration with terminal progress bars (tqdm)
- 🚀 Production-Ready: Checkpointing, multi-device training, and reproducible experiments
- 🧪 Research-Friendly: Clear interfaces and easy extensibility
- BC (Behavioral Cloning) - Supervised learning from expert demonstrations
- CQL (Conservative Q-Learning) - Conservative offline RL with Q-value penalties
- IQL (Implicit Q-Learning) - Expectile regression-based offline RL
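
The core loss terms behind the three algorithms above can be sketched in a few lines of JAX. These are textbook forms written for illustration, not ORLax's implementations: the function names, the default `tau`, and the discrete-action form of the CQL penalty are assumptions made for this example.

```python
import jax.numpy as jnp
from jax.scipy.special import logsumexp


def bc_loss(pred_actions, expert_actions):
    """Behavioral cloning as regression: squared error to the dataset actions."""
    return jnp.mean(jnp.sum((pred_actions - expert_actions) ** 2, axis=-1))


def expectile_loss(diff, tau=0.7):
    """IQL-style expectile regression on diff = Q(s, a) - V(s).

    Positive errors are weighted by tau and negative errors by 1 - tau,
    so tau > 0.5 pushes V toward the upper range of the Q-values.
    """
    weight = jnp.where(diff > 0, tau, 1.0 - tau)
    return jnp.mean(weight * diff ** 2)


def cql_penalty(q_all_actions, q_data_action):
    """CQL-style regularizer, assuming a discrete action space.

    q_all_actions: [batch, num_actions] Q-values for every action.
    q_data_action: [batch] Q-value of the action stored in the dataset.
    Penalizes Q-values that are high on actions the dataset did not take.
    """
    return jnp.mean(logsumexp(q_all_actions, axis=-1) - q_data_action)
```

In a full training step, each term would be combined with the usual temporal-difference or policy losses and optimized with Optax; that wiring is omitted here.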
Install with uv:

    # Clone the repository
    git clone https://github.com/sql-hkr/orlax.git
    cd orlax
    # Install with uv
    uv sync

Or install with pip:

    # Clone the repository
    git clone https://github.com/sql-hkr/orlax.git
    cd orlax
    # Install in editable mode
    pip install -e .

For CUDA support, install JAX with CUDA:

    # For CUDA 12
    pip install --upgrade "jax[cuda12]"

To start a training run:

    # Train IQL on Hopper-Medium
    uv run orlax-train --config configs/iql_hopper.toml

If you use ORLax in your research, please cite:

    @software{orlax2025,
      title = {ORLax: Offline Reinforcement Learning with JAX},
      author = {sql-hkr},
      year = {2025},
      url = {https://github.com/sql-hkr/orlax}
    }

Contributions are welcome! Please feel free to submit a Pull Request.

- Fork the repository
- Create your feature branch (`git checkout -b feat/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feat/amazing-feature`)
- Open a Pull Request

- Author: sql-hkr
- Email: [email protected]
- GitHub: @sql-hkr
- Issues: GitHub Issues
Note: This software is under active development. API stability is not guaranteed until version 1.0.0.