Stars
Start building LLM-empowered multi-agent applications more easily.
Awesome-LLM-RAG: a curated list of advanced retrieval-augmented generation (RAG) techniques for Large Language Models
R1-onevision, a visual language model capable of deep CoT reasoning.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
《开源大模型食用指南》 (Self-LLM, an open-source LLM cookbook): a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source LLMs and multimodal large models (MLLMs), both domestic and international, in a Linux environment
Official PyTorch implementation of SegFormer
Production-ready platform for agentic workflow development.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
An elegant PyTorch deep reinforcement learning library.
Geek Time (极客时间) RAG bootcamp: a complete breakdown of the 10 core RAG components, with 4 hands-on projects covering the full RAG pipeline. As instructor Liu Huanyong puts it: putting RAG into production means building RAG around the business, not bending the business around RAG. That is why different scenarios and problems call for targeted adjustment, optimization, and customization. The devil is in the details, and we dig into them.
A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄), readable online at https://datawhalechina.github.io/easy-rl/
Understanding R1-Zero-Like Training: A Critical Perspective
800,000 step-level correctness labels on LLM solutions to MATH problems
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
A high-throughput and memory-efficient inference and serving engine for LLMs
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A fork to add multimodal model training to open-r1
Witness the aha moment of VLM with less than $3.
Fully open reproduction of DeepSeek-R1
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Explore the Multimodal "Aha Moment" on a 2B Model
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning