A modern Next.js application that automatically generates videos with subtitles from audio files using AI-powered transcription. Perfect for content creators, educators, and anyone looking to enhance video accessibility with professional-quality subtitles.
- 🎵 Audio Processing: Support for MP3, WAV, and M4A audio files with automatic transcription
- 🖼️ Custom Backgrounds: Add optional background images to enhance visual appeal
- 🤖 AI-Powered Transcription: Automatic speech-to-text using OpenAI's Whisper API
- ✂️ Subtitle Editor: Advanced subtitle editing with timing adjustments and text modifications
- 🎬 Video Generation: Create professional videos with perfectly synchronized subtitles
- 📱 Responsive Design: Modern, mobile-first interface built with shadcn/ui and Tailwind CSS
- ⚡ Real-time Processing: Live progress tracking with detailed processing steps
- 📥 Easy Export: Download generated videos in multiple formats
- 🐳 Docker Support: Complete containerization with ffmpeg pre-installed
- 🔧 Environment Validation: Built-in API key and dependency checks
- Clone the repository:
  ```bash
  git clone https://github.com/yourusername/auto-subtitle.git
  cd auto-subtitle
  ```
- Install Bun (if not already installed):
  ```bash
  curl -fsSL https://bun.sh/install | bash
  ```
- Install dependencies:
  ```bash
  bun install
  ```
- Set up environment variables: create a `.env.local` file:
  ```bash
  OPENAI_API_KEY=your_openai_api_key_here
  ```
- Start the development server:
  ```bash
  bun dev
  ```
- Open your browser: Navigate to http://localhost:3000
For production deployment or an isolated development environment:
```bash
# Using Docker Compose (recommended)
docker-compose -f docker/docker-compose.yml up --build

# Or build manually
docker build -f docker/Dockerfile -t auto-subtitle .
docker run -p 3000:3000 -e OPENAI_API_KEY=your_key auto-subtitle
```
📋 See the Docker documentation for detailed setup instructions.
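For reference, a minimal Compose service for this setup might look like the following. This is a sketch only; the repository's actual `docker/docker-compose.yml` is authoritative, and the volume path shown is a hypothetical example of persistent file storage:

```yaml
services:
  auto-subtitle:
    build:
      context: ..
      dockerfile: docker/Dockerfile
    ports:
      - "3000:3000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      # Hypothetical mount so generated files survive container restarts
      - ../uploads:/app/uploads
```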
- 🎵 Upload Audio: Select your audio file (MP3, WAV, or M4A)
- 🖼️ Add Background (Optional): Upload a background image for visual enhancement
- ⚙️ Configure Settings: Adjust subtitle appearance and timing preferences
- 🚀 Generate: Click "Generate Video" to start the AI transcription process
- ✏️ Edit Subtitles: Use the built-in editor to fine-tune text and timing
- 🎬 Preview & Download: Review your video and download the final result
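The upload step above accepts MP3, WAV, and M4A only; a client-side gate for that constraint could be as simple as this illustrative helper (not part of the repo's API):

```typescript
// Supported audio container extensions, per the upload step above.
const SUPPORTED_AUDIO = new Set(["mp3", "wav", "m4a"]);

/** Returns true when the filename ends in a supported audio extension. */
function isSupportedAudio(filename: string): boolean {
  const ext = filename.split(".").pop()?.toLowerCase() ?? "";
  return SUPPORTED_AUDIO.has(ext);
}
```

Checking the extension before upload gives users immediate feedback instead of a failed server-side transcription.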
- Next.js 15 - React framework with App Router
- React 19 - Latest React with concurrent features
- TypeScript - Type-safe development
- Tailwind CSS 4 - Utility-first CSS framework
- shadcn/ui - High-quality UI components
- React Hook Form - Performant form handling
- Zod - Runtime type validation
- Bun - Fast JavaScript runtime and package manager
- OpenAI Whisper API - AI-powered transcription
- FFmpeg - Video processing and subtitle rendering
- Next.js API Routes - Serverless backend functions
- Edge Runtime - Optimized for performance
- Docker - Containerization with multi-stage builds
- Docker Compose - Development environment orchestration
- Alpine Linux - Lightweight base images
- Health Checks - Container monitoring and reliability
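FFmpeg renders subtitles from timed cues (e.g. SRT), where each timestamp takes the form `HH:MM:SS,mmm`. A stdlib-only sketch of that formatting — illustrative, the repo's actual helper may differ:

```typescript
/** Formats a time in seconds as an SRT timestamp, e.g. 83.5 -> "00:01:23,500". */
function toSrtTimestamp(seconds: number): string {
  const ms = Math.round(seconds * 1000); // work in whole milliseconds
  const h = Math.floor(ms / 3_600_000);
  const m = Math.floor((ms % 3_600_000) / 60_000);
  const s = Math.floor((ms % 60_000) / 1000);
  const rem = ms % 1000;
  const pad = (n: number, w = 2) => String(n).padStart(w, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(rem, 3)}`;
}
```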
auto-subtitle/
├── app/ # Next.js App Router
│ ├── api/ # Backend API routes
│ │ ├── generate/ # Video generation endpoint
│ │ ├── subtitles/ # Subtitle processing
│ │ └── env-check/ # Environment validation
│ ├── components/ # App-specific components
│ │ ├── SubtitleEditor.tsx # Advanced subtitle editing
│ │ ├── ProcessingSteps.tsx # Progress tracking
│ │ └── EnvironmentCheck.tsx # API validation
│ ├── lib/ # App utilities
│ ├── utils/ # Helper functions
│ ├── layout.tsx # Root layout
│ ├── page.tsx # Main application
│ └── globals.css # Global styles
├── components/ # Reusable UI components
│ └── ui/ # shadcn/ui components
├── docker/ # Docker configuration
│ ├── Dockerfile # Production Docker image
│ ├── Dockerfile.alternative # Alternative build strategy
│ ├── docker-compose.yml # Container orchestration
│ ├── build-docker.sh # Build helper script
│ └── README.md # Docker documentation
├── lib/ # Shared utilities and libraries
├── public/ # Static assets
├── types/ # TypeScript definitions
├── .dockerignore # Docker build exclusions
├── bun.lockb # Bun dependency lock
├── next.config.ts # Next.js configuration
└── tailwind.config.ts # Tailwind CSS configuration
Required environment variables:
- `OPENAI_API_KEY` - Your OpenAI API key for Whisper transcription
- `NODE_ENV` - Environment mode (`development` | `production`)
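The built-in environment validation can be sketched as a pure function over an env map. This is a hypothetical shape for illustration; the app's real `env-check` route may differ:

```typescript
/** Returns the names of required variables missing (or blank) in the given env map. */
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[] = ["OPENAI_API_KEY"],
): string[] {
  return required.filter((name) => !env[name] || env[name]!.trim() === "");
}
```

Keeping the check pure (taking `env` as an argument rather than reading `process.env` directly) makes it trivial to unit-test.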
The app is configured with:
- Standalone output for optimal Docker deployment
- External packages configuration for ffmpeg compatibility
- App Router for modern React patterns
This project includes comprehensive Docker support:
- Multi-stage builds for optimized production images
- Bun runtime for faster performance
- FFmpeg pre-installed for video processing
- Security best practices with non-root user
- Health checks for container monitoring
- Volume mounts for persistent file storage
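As an illustration of the health-check point above, a Dockerfile instruction along these lines would poll the app periodically. This is a sketch; the repo's `docker/Dockerfile` is authoritative, and the `/api/env-check` path is assumed from the project structure:

```dockerfile
# Mark the container unhealthy if the app stops answering on port 3000.
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -fsS http://localhost:3000/api/env-check || exit 1
```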
We welcome contributions! Here's how to get started:
- Fork the repository
- Create a feature branch:
  ```bash
  git checkout -b feature/amazing-feature
  ```
- Make your changes and add tests if applicable
- Commit your changes:
  ```bash
  git commit -m 'Add amazing feature'
  ```
- Push to your branch:
  ```bash
  git push origin feature/amazing-feature
  ```
- Open a Pull Request
- Use TypeScript for type safety
- Follow the existing code style and patterns
- Add appropriate error handling
- Update documentation for new features
- Test Docker builds before submitting
This project is licensed under the MIT License.
- OpenAI - For the powerful Whisper API
- FFmpeg - For robust video processing
- Bun - For lightning-fast JavaScript runtime
- shadcn/ui - For beautiful UI components
- Vercel - For Next.js framework and deployment platform
🚀 Get Started • 📚 Documentation • 🐛 Report Bug • 💡 Request Feature
Made with ❤️ for content creators worldwide