manhph2211

🎯

Focusing

Max manhph2211

81 followers · 19 following

Singapore
in/manhph2211

Stars

google-gemini / gemini-cli

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 50,427 4,276 Updated Jul 3, 2025

LVLab-SMU / HPS

Python 1 Updated Jun 3, 2025

NVIDIA / physicsnemo

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

Python 1,652 379 Updated Jul 3, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,870 218 Updated Jun 30, 2025

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Model

24,080 2,031 Updated May 9, 2025

helblazer811 / ConceptAttention

ConceptAttention: A method for interpreting multi-modal diffusion transformers.

Jupyter Notebook 284 12 Updated Apr 14, 2025

facebookresearch / metamorph

Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning

Python 192 7 Updated Apr 19, 2025

FoundationVision / VAR

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,292 516 Updated May 18, 2025

deepseek-ai / DeepSeek-V3

Python 97,966 15,940 Updated Jun 27, 2025

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 21,466 2,642 Updated Jun 3, 2025

VisualComputingInstitute / diffusion-e2e-ft

[WACV'25 Oral] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Python 445 18 Updated Dec 16, 2024

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,728 331 Updated Jan 4, 2024

lucidrains / musiclm-pytorch

Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch

Python 3,263 265 Updated Sep 6, 2023

bytedance / SALMONN

SALMONN family: A suite of advanced multi-modal LLMs

1,273 101 Updated Jun 20, 2025

bytedance / Make-An-Audio-2

a text-conditional diffusion probabilistic model capable of generating high fidelity audio.

Python 166 20 Updated May 29, 2024

IFICL / images-that-sound

Official repo for Images that sound: a special spectrogram that can be seen as images and played as sound generated by diffusions

Python 242 13 Updated Feb 4, 2025

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 3,020 243 Updated Jul 2, 2025