🤡
A holistic joke

Highlights

  • Pro

Organizations

@DAMO-NLP-SG

[NeurIPS'25] One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution

Python 298 16 Updated Nov 3, 2025

Efficient Triton Kernels for LLM Training

Python 5,810 427 Updated Nov 8, 2025

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 550 33 Updated Nov 8, 2025

📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥

1,810 76 Updated Nov 6, 2025

Unified automatic quality assessment for speech, music, and sound.

Python 628 42 Updated Jun 5, 2025

VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs

Python 50 1 Updated Mar 9, 2025

Common Bilibili API calls. Supports video, bangumi (anime series), user, channel, audio, and other features. Original repository: https://github.com/MoyuScript/bilibili-api

Python 3,109 293 Updated Nov 9, 2025

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,240 4,779 Updated Jun 2, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,131 1,916 Updated Nov 1, 2025

Scrape videos from YouTube

Python 93 22 Updated Mar 24, 2020

RePO: Replay-Enhanced Policy Optimization

Python 22 1 Updated Jun 12, 2025

video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, which is developed by the Department of Electronic Engineering at Tsin…

Python 108 8 Updated Oct 21, 2025

[ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding

Python 57 2 Updated Dec 13, 2024

A paper list of recent work on token compression for ViT and VLM

729 30 Updated Nov 5, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

8,719 582 Updated Nov 7, 2025
Python 693 12 Updated Nov 1, 2025

[CVPR 2024 Highlight] Official GraCo: Granularity-Controllable Interactive Segmentation.

Python 60 2 Updated Mar 11, 2025

Narrative movie understanding benchmark

Python 76 Updated Jun 11, 2025
Python 155 3 Updated Jan 16, 2025

Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*

Jupyter Notebook 130 11 Updated Oct 27, 2023

[MathCoder, MathCoder-VL] Family of LLMs/LMMs for mathematical reasoning.

Python 329 26 Updated Oct 18, 2025

Official repository for "AM-RADIO: Reduce All Domains Into One"

Python 1,384 49 Updated Oct 17, 2025

DELTA: Dense Efficient Long-range 3D Tracking for Any video (ICLR 2025)

Python 131 3 Updated Apr 6, 2025

[CVPR 2023] Official implementation of the paper: Fine-grained Audible Video Description

Python 74 5 Updated Dec 4, 2023

Official repository of "Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields", CVPR 2023 paper (highlight)

Python 79 4 Updated Apr 6, 2025

[ICLR 2025, Oral] EmbodiedSAM: Online Segment Any 3D Thing in Real Time

Python 580 27 Updated May 7, 2025
Python 545 53 Updated Sep 23, 2025

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python 919 47 Updated Mar 19, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 3,873 302 Updated Nov 8, 2025