Stars
[NeurIPS 2025] Reasoning MLLM, Share-GRPO, advantage vanishing, sparse reward
AgentFlow: In-the-Flow Agentic System Optimization
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.
An open-source AI agent that brings the power of Gemini directly into your terminal.
A powerful tool for creating fine-tuning datasets for LLMs
MM-Eureka V0 (also called R1-Multimodal-Journey); the latest version is in MM-Eureka
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
[NeurIPS 2024 Best Paper Award] [GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
Large World Model -- Modeling Text and Video with Millions of Tokens of Context
VideoSys: An easy and efficient system for video generation
Open-Sora: Democratizing Efficient Video Production for All
[TMM 2025🔥] Mixture-of-Experts for Large Vision-Language Models
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
A Next-Generation Training Engine Built for Ultra-Large MoE Models
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding