Starred repositories
🦛 CHONK docs with Chonkie ✨ — The no-nonsense RAG library
Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
🧩 Modular, composable API views for scalable Django Ninja projects, with built-in CRUD.
Open-source framework for uncertainty and deep learning models in PyTorch 🌱
Advanced choice modeling with multidimensional utility representations.
Specify, execute and monitor the performance of active learning pipelines.
Scoring Lists – a probabilistic & incremental extension to Scoring Systems
Pairwise Difference Learning (PDL) is a meta-learning framework that leverages pairwise differences to transform multiclass problems into binary tasks. This repository includes the original PDL Cla…
A fast implementation of fair allocation algorithms for indivisible items.
An open-source library of fair division algorithms in Python
Shapley Interactions and Shapley Values for Machine Learning
Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools, with 10x faster training through evolutionary hyperparameter optimization.
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
Semantic search engine indexing 110 million academic publications
A Python library that helps data scientists infer causation rather than merely observe correlation.
Fast and incremental explanations for online machine learning models. Works best with the river framework.
System for Medical Concept Extraction and Linking
SQL databases in Python, designed for simplicity, compatibility, and robustness.
A playbook for systematically maximizing the performance of deep learning models.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
You like pytorch? You like micrograd? You love tinygrad! ❤️
Code for the paper "Towards Reliable Simulation-Based Inference with Balanced Neural Ratio Estimation".
The PyExperimenter is a tool for the automatic execution of experiments, e.g. for machine learning (ML), capturing corresponding results in a unified manner in a database.





