Jiaqi Gu gujiaqivadin

ZJU | Alibaba Cloud | 3D LLM | AI4Science

Starred repositories

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 8,341 718 Updated Jun 28, 2025

zhaochen0110 / Awesome_Think_With_Images

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

404 19 Updated Jun 23, 2025

dvlab-research / VisionReasoner

The official implementation of "VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning"

Python 190 12 Updated May 30, 2025

williamjsdavis / geo-lm

Geologic models from Llama 4 language model + GemPy!

Jupyter Notebook 54 18 Updated May 25, 2025

Ola-Omni / Ola

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 345 15 Updated Jun 13, 2025

yihedeng9 / OpenVLThinker

OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement

Python 91 5 Updated May 19, 2025

2U1 / Qwen2-VL-Finetune

An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.

Python 879 121 Updated Jun 25, 2025

manycore-research / SpatialLM

SpatialLM: Training Large Language Models for Structured Indoor Modeling

Python 3,427 256 Updated Jun 24, 2025

0russwest0 / Agent-R1

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 593 38 Updated May 27, 2025

LengSicong / MMR1

MMR1: Advancing the Frontiers of Multimodal Reasoning

161 5 Updated Mar 17, 2025

Video-R1 / Awesome-Multimodal-Reasoning

Collections of Papers and Projects for Multimodal Reasoning.

105 9 Updated Apr 25, 2025

Wild-Cooperation-Hub / Awesome-MLLM-Reasoning-Benchmarks

A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.

64 7 Updated Mar 18, 2025

turningpoint-ai / VisualThinker-R1-Zero

Explore the Multimodal “Aha Moment” on 2B Model

Python 595 20 Updated Mar 18, 2025

Liuziyu77 / Visual-RFT

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,025 84 Updated Jun 26, 2025

Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based Reasoning MLLMs here!

937 43 Updated Jun 18, 2025

sheryc / arxiv-markdown-parser-plugin

Chrome / Edge extension to turn arXiv papers into Markdown codes in one click.

JavaScript 78 9 Updated Mar 20, 2025

yuyq96 / R1-Vision

R1-Vision: Let's first take a look at the image

Python 47 1 Updated Feb 16, 2025

FanqingM / MM-Eureka-V0

MM-Eureka V0, also called R1-Multimodal-Journey. The latest version is in MM-Eureka.

Python 310 9 Updated Jun 21, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,912 2,313 Updated Jun 26, 2025

MoonshotAI / Kimi-k1.5

3,381 220 Updated Mar 7, 2025

Leezekun / MMSci

MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension

Python 45 Updated Dec 3, 2024

google-gemini / starter-applets

Google AI Studio Starter Apps

TypeScript 1,153 427 Updated Feb 4, 2025

Yeyuqqwx / HybridGS

This repository contains the code for the paper [HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting](https://gujiaqivadin.github.io/hybridgs/).

Python 49 1 Updated Dec 13, 2024