Stars
MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…
Dahuiji (打灰机) guard daemon: uses an open-source AI vision model (smolVLM2) and the MediaPipe library to keep watch for you while you "dahuiji"
Chinese text normalization for speech processing
Added vLLM support to IndexTTS for faster inference.
80boys / index-tts
Forked from index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Implements harmful/harmless refusal removal using pure HF Transformers
By invoking local large language models, this tool processes spreadsheets similar to multi-dimensional tables. It can batch-generate content for Excel/CSV data using AI. The tool supports simultane…
Godot tutorial collection of my YouTube videos
AingDesk is a simple and easy-to-use AI assistant that supports knowledge bases, model APIs, sharing, internet search, and intelligent agents.…
🚀 The fast, Pythonic way to build MCP servers and clients
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.
Chinese translation of the LLMs-from-scratch project
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
SoftWhisper simplifies audio and video transcription using the powerful Whisper model. Easily select custom models, languages, and tasks, fine-tune transcription with beam size adjustment, and spec…
Algorithm for 3D printer with new kinematics
SpatialLM: Training Large Language Models for Structured Indoor Modeling
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks