Starred repositories

Python 347 104 Updated May 22, 2025

🍒 Cherry Studio is a desktop client that supports multiple LLM providers.

TypeScript 29,331 2,560 Updated Jul 2, 2025

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.

Go 145,304 12,269 Updated Jul 1, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 41,415 3,301 Updated Jul 2, 2025

Efficient Triton Kernels for LLM Training

Python 5,290 361 Updated Jul 2, 2025

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 5,609 611 Updated Jul 1, 2025

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

Python 2,580 209 Updated Jun 30, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,241 319 Updated Jun 26, 2025

A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM …

Jupyter Notebook 7,886 1,250 Updated Jun 9, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 53,341 6,540 Updated Jul 2, 2025

[TPAMI reviewing] Towards Visual Grounding: A Survey

Shell 186 20 Updated Jun 10, 2025

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Python 1,466 63 Updated Mar 13, 2025

A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space

Python 87 5 Updated Jan 16, 2025

An open source implementation of CLIP.

Python 12,073 1,122 Updated Jun 10, 2025

[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

444 10 Updated Jan 17, 2025

🤯 Lobe Chat - an open-source, modern design AI chat framework. Supports multiple AI providers (OpenAI / Claude 4 / Gemini / DeepSeek / Ollama / Qwen), Knowledge Base (file upload / knowledge manage…

TypeScript 63,038 13,109 Updated Jul 2, 2025

Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding

Python 189 6 Updated Jan 24, 2025

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 570 44 Updated Jun 7, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,698 674 Updated Feb 10, 2025

Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

Python 333 22 Updated Mar 12, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 8,414 725 Updated Jul 2, 2025

This is a Phi Family of SLMs book for getting started with Phi models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…

Jupyter Notebook 3,389 431 Updated Jun 27, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 146,320 29,504 Updated Jul 2, 2025

Efficient Multimodal Large Language Models: A Survey

360 20 Updated Apr 29, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 6,615 567 Updated Jul 2, 2025

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

Python 889 163 Updated Oct 28, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 5,645 533 Updated Feb 26, 2025

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.

Python 2,177 227 Updated Aug 15, 2024

Docker Extension Pack for Visual Studio Code

1,262 539 Updated Jun 4, 2025

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 16,011 1,890 Updated Dec 25, 2024