View JaeDukSeo's full-sized avatar
Python 8 2 Updated Mar 5, 2025

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 4,362 469 Updated Nov 12, 2025

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high …

Python 707 222 Updated Nov 19, 2025

A static blog built with NextJS and the Notion API that supports multiple deployment options. No server required and no barrier to setting up a site; designed for Notion and all creators.

JavaScript 10,704 13,898 Updated Oct 15, 2025

Official Repo For "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos"

Python 1,446 102 Updated Nov 28, 2025

Declarative framework for building Agentic AI Services. Build powerful AI apps allowing Agents to navigate, interact and transact with your service.

Python 52 5 Updated Nov 25, 2025

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

Swift 1,019 117 Updated Nov 29, 2025

Fine-tune Stable Audio Open with DiT ControlNet.

Python 249 9 Updated May 16, 2025

Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Python 510 86 Updated Nov 26, 2025

TikTok Content Scraper -- No API-Key needed, minimal dependencies, citable | Download videos (MP4), slides (JPEG) and metadata of author, music, file, hashtags, content, interactions etc.

Python 60 14 Updated Sep 21, 2025

AI-powered Auto-Clipper: automatically detects highlight segments from YouTube channels, Twitch/Kick live streams (via audience-retention data, chat spikes and timestamped comments), clips them wit…

Python 5 Updated May 7, 2025

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Python 4,456 401 Updated Nov 13, 2025

Truly universal encoding detector in pure Python.

Python 719 62 Updated Nov 9, 2025

Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.

TypeScript 11,437 1,151 Updated Nov 24, 2025

Noise suppression using deep filtering

Python 3,558 356 Updated Oct 17, 2024

Easy-to-use stem (e.g. instrumental/vocals) separation from the CLI or as a Python package, using a variety of pre-trained models (primarily from UVR)

Python 933 152 Updated Nov 30, 2025

A novel media player that allows you to navigate by speaker

Svelte 67 4 Updated Nov 11, 2025

Very fast, accurate speaker diarization

Python 176 16 Updated Nov 12, 2025

⚡ Accelerate speaker diarization with Senko, processing 1 hour of audio in just 5 seconds on powerful hardware—boost your audio analysis efficiency.

Python 1 1 Updated Nov 30, 2025

LLM story writer with a focus on high-quality long output based on a user provided prompt.

Python 199 55 Updated Nov 24, 2025

TTS + Voice Cloning

Python 172 29 Updated Aug 16, 2025

A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice Conversion. Supports: RVC, IndexTTS-2, Chatterbox (classic and multilingual 23-lang), F5-TTS, Higgs Audio …

Python 422 29 Updated Nov 20, 2025

Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale…

Python 618 180 Updated Jul 14, 2025

Modified version of Chatterbox that accepts text files as input and no character restrictions. I use it to make audiobooks, especially for my kids.

Python 479 84 Updated Aug 23, 2025

VLLM Port of the Chatterbox TTS model

Python 340 44 Updated Oct 18, 2025

SoTA open-source TTS

Python 14,805 2,043 Updated Sep 25, 2025

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal is…

Python 5,176 491 Updated Oct 5, 2025

Video translation and dubbing tool powered by LLMs. The video translator offers 100 language translations and one-click full-process deployment. The video translation output is optimized for platfo…

Go 8,984 749 Updated Nov 5, 2025

Intelligent multilingual AI video dubbing and translation tool - Linly-Dubbing: "Empowered by AI, language without borders"

Jupyter Notebook 2,808 308 Updated Mar 5, 2025

A web app built with Flask that uses OpenAI's Whisper model to transcribe audio and video files. Supports various file formats and generates timestamped .srt files.

HTML 3 1 Updated Jun 19, 2025