Stars
Start building LLM-empowered multi-agent applications more easily.
Awesome-LLM-RAG: a curated list of advanced retrieval-augmented generation (RAG) techniques for Large Language Models
R1-onevision, a visual language model capable of deep CoT reasoning.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
《开源大模型食用指南》 (Self-LLM, an open-source LLM cookbook): a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source LLMs and multimodal large models (MLLMs), both domestic and international, in a Linux environment
Official PyTorch implementation of SegFormer
Production-ready platform for agentic workflow development.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
An elegant PyTorch deep reinforcement learning library.
Geek Time (极客时间) RAG bootcamp: a complete breakdown of the 10 core RAG components, with 4 hands-on projects covering the full RAG pipeline. As instructor Liu Huanyong puts it: putting RAG into production means building RAG around the business, not bending the business around RAG. That is why different scenarios and problems call for targeted adjustment, optimization, and customization. The devil is in the details, and we dig into them.
A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄), readable online at https://datawhalechina.github.io/easy-rl/
Understanding R1-Zero-Like Training: A Critical Perspective
800,000 step-level correctness labels on LLM solutions to MATH problems
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
A high-throughput and memory-efficient inference and serving engine for LLMs
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A fork to add multimodal model training to open-r1
Witness the aha moment of VLM with less than $3.
Fully open reproduction of DeepSeek-R1
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Explore the Multimodal "Aha Moment" on a 2B Model
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning