i-MaTh
🎯 Focusing
  • East China Normal University
  • Shanghai

Highlights

  • Pro

Starred repositories

Precision Alignment, Infinite Possibilities

Python 93 6 Updated Nov 10, 2025

Official implementation of "Continuous Autoregressive Language Models"

Python 487 60 Updated Nov 10, 2025

The official Implementation of PeriodWave and PeriodWave-Turbo

Python 210 16 Updated Apr 14, 2025

Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with synthetic captions.

Python 85 5 Updated Oct 15, 2025

VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)

Python 694 275 Updated Oct 27, 2025

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 1,934 217 Updated Nov 6, 2025

Training, inference, and testing of the SAC speech codec model.

Python 84 6 Updated Nov 1, 2025

OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.

Python 506 43 Updated Oct 29, 2025

Code for the blog "Neural audio codecs: how to get audio into LLMs"

Python 131 3 Updated Oct 20, 2025

[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching

Python 115 10 Updated Mar 27, 2025

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

813 64 Updated Oct 28, 2025

LongCat Audio Tokenizer and Detokenizer

Python 208 15 Updated Nov 11, 2025

Data Pipeline, Models, and Benchmark for Omni-Captioner.

Python 85 Updated Oct 17, 2025

PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)

C 610 103 Updated Sep 5, 2024

FLM-Audio is an audio-language subversion of RoboEgo/FLM-Ego -- an omnimodal model with native full duplexity.

Python 46 6 Updated Sep 30, 2025

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Python 4,232 484 Updated Apr 15, 2025

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

Python 902 106 Updated Sep 13, 2025

Official implementation of DNSMOS Pro (accepted at INTERSPEECH 2024).

Python 65 7 Updated Jun 8, 2025

Language modelling on RVQ tokens with minimal codes

Python 10 Updated Oct 10, 2025

Intelligent automation and multi-agent orchestration for Claude Code

Python 20,493 2,287 Updated Nov 8, 2025

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

TypeScript 42,057 2,777 Updated Nov 12, 2025

A comprehensive toolkit for podcast evaluation. https://arxiv.org/abs/2510.00485

JavaScript 15 Updated Nov 2, 2025

Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"

Python 91 4 Updated Oct 26, 2025

Evaluating Bias in Spoken Dialogue LLMs for Real-World Decisions and Recommendations

Python 10 Updated Nov 4, 2025

ACE-Step: A Step Towards Music Generation Foundation Model

Python 3,259 380 Updated Jun 27, 2025
Python 3 Updated Oct 6, 2025

On-device TTS model by Neuphonic

Python 3,938 389 Updated Nov 4, 2025

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

Python 334 26 Updated Oct 28, 2025

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Python 124 6 Updated Dec 9, 2024