Automatically process meeting videos to extract key moments, topics, and takeaways. The tool generates a navigable summary webpage with video chapters.
- Extracts audio from video files
- Transcribes speech to text using insanely-fast-whisper
  - Uses Whisper large-v3 model by default for best accuracy
  - Optimized for both NVIDIA GPUs and Apple Silicon
  - Flash Attention 2 support for NVIDIA GPUs
  - Word-level timestamp support
- Analyzes content to identify:
  - Major topics with timestamps
  - Key moments within each topic
  - Actionable takeaways
- Adds chapter markers to the video
- Generates an interactive HTML summary
  - Clickable timestamps for video navigation
  - Organized by topics
  - Highlights key moments and takeaways
- Python 3.11+
- ffmpeg
- NVIDIA GPU or Apple Silicon Mac
- Conda (for environment management)
- Install ffmpeg:

  ```bash
  # macOS
  brew install ffmpeg

  # Ubuntu/Debian
  sudo apt-get install ffmpeg

  # Windows
  # Download from https://ffmpeg.org/download.html
  ```

- Set up the conda environment:

  ```bash
  # Create and activate the conda environment
  conda env create -f environment.yml
  conda activate video-auto-index
  ```

  This will automatically install all dependencies, including insanely-fast-whisper.

- Set up your Anthropic API key:

  ```bash
  export ANTHROPIC_API_KEY='your-api-key'
  ```
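If you want to confirm the key is actually visible to Python before kicking off a long run, a quick check like the following works (a minimal sketch; only the environment variable name comes from the step above):

```python
import os

# Fail early with a clear message if the Anthropic key is not set.
if not os.environ.get("ANTHROPIC_API_KEY"):
    raise SystemExit("ANTHROPIC_API_KEY is not set; export it before running the pipeline.")
```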
Process a video file:
```bash
python -m src.main [video path] [--output-dir output] [--device-id DEVICE]
```
Device options:
- Default: Automatically uses MPS on Apple Silicon, CPU/CUDA on other systems
- `--device-id 0`: Force CPU/CUDA device
- `--device-id mps`: Force MPS device on Apple Silicon
For example:
```bash
# Use default device (auto-detected)
python -m src.main /path/to/video.mp4 --output-dir output

# Force CPU/CUDA device
python -m src.main /path/to/video.mp4 --output-dir output --device-id 0

# Force MPS device on Apple Silicon
python -m src.main /path/to/video.mp4 --output-dir output --device-id mps
```
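The exact auto-detection logic isn't spelled out here, but it can be pictured as a thin wrapper over PyTorch's device checks (a sketch only; the function name `pick_device_id` is illustrative, not part of the project):

```python
import torch

def pick_device_id() -> str:
    """Return a --device-id value: 'mps' on Apple Silicon, otherwise '0'."""
    # torch.backends.mps.is_available() reports Metal (MPS) support on Apple Silicon.
    if torch.backends.mps.is_available():
        return "mps"
    # A numeric id selects the CUDA device (falling back to CPU on systems without one),
    # matching the "--device-id 0: Force CPU/CUDA device" behaviour described above.
    return "0"

print(pick_device_id())
```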
This will:
- Extract audio from the video
- Transcribe the audio using insanely-fast-whisper
- Analyze the meeting content for topics and key moments
- Generate an HTML summary page
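In code, those four stages could be chained roughly as follows (a sketch only; the function names are hypothetical stand-ins for whatever the modules actually expose):

```python
from pathlib import Path

# Hypothetical imports -- the real entry points live in the modules listed
# under the project structure below and may be named differently.
from src.video_processor import extract_audio
from src.transcriber import transcribe
from src.key_moments import analyze_transcript
from src.web_generator import generate_summary_page

video = Path("/path/to/video.mp4")
output_dir = Path("output")

audio_path = extract_audio(video, output_dir)          # 1. pull audio out of the video
transcript = transcribe(audio_path, device_id="mps")   # 2. speech-to-text with timestamps
analysis = analyze_transcript(transcript)              # 3. topics, key moments, takeaways
generate_summary_page(video, analysis)                 # 4. interactive HTML summary
```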
The output directory will contain:
- `audio.wav`: Extracted audio file
- `audio_transcript.json`: Transcribed speech with timestamps
- `audio_subtitles.srt`: Generated subtitles
- `meeting_analysis.json`: Extracted topics, moments, and takeaways
Final web output is stored in:
```
<video_base_path>/<video_filename>_summary.html
```
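That path follows directly from the video path, so other scripts can reconstruct it with pathlib (a small sketch; nothing project-specific is assumed beyond the naming scheme above):

```python
from pathlib import Path

video = Path("/path/to/video.mp4")
# <video_base_path>/<video_filename>_summary.html
summary_path = video.with_name(f"{video.stem}_summary.html")
print(summary_path)  # /path/to/video_summary.html
```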
The analysis JSON follows this structure:
```json
[
  {
    "topic": "Topic description",
    "timestamp": "HH:MM:SS,mmm",
    "key_moments": [
      {
        "description": "Key moment description",
        "timestamp": "HH:MM:SS,mmm"
      }
    ],
    "takeaways": [
      "Actionable takeaway 1",
      "Actionable takeaway 2"
    ]
  }
]
```
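Because every timestamp uses the SRT-style `HH:MM:SS,mmm` format, downstream tooling can load the analysis and convert timestamps to seconds for seeking. A minimal sketch (only the file name and JSON shape above are taken from this document):

```python
import json

def to_seconds(ts: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    hms, millis = ts.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000

with open("output/meeting_analysis.json") as f:
    topics = json.load(f)

for topic in topics:
    print(f"{to_seconds(topic['timestamp']):8.1f}s  {topic['topic']}")
    for moment in topic["key_moments"]:
        print(f"  - {moment['description']} @ {moment['timestamp']}")
```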
The project is organized into modular components:
- `video_processor.py`: Handles video/audio operations
- `transcriber.py`: Speech-to-text conversion using insanely-fast-whisper
- `key_moments.py`: AI content analysis
- `web_generator.py`: HTML summary generation
- `main.py`: Pipeline orchestration
Each component can be run independently, allowing for flexible processing pipelines.
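For example, once the transcript and analysis exist on disk you could regenerate only the HTML page without re-running transcription (the function name is hypothetical; the real entry point lives in `web_generator.py`):

```python
import json
from pathlib import Path

# Hypothetical call -- the actual function in src.web_generator may be named
# or parameterized differently.
from src.web_generator import generate_summary_page

analysis = json.loads(Path("output/meeting_analysis.json").read_text())
generate_summary_page(Path("/path/to/video.mp4"), analysis)
```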
The project includes a comprehensive test suite covering all components:
Run all tests:

```bash
pytest
```

Run with coverage report:

```bash
pytest --cov=src tests/
```

Run specific test categories:

```bash
# Unit tests only
pytest -v -m "not integration"

# Integration tests only
pytest -v -m "integration"
```
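The unit/integration split relies on pytest markers, so a new test meant for the integration suite would be tagged roughly like this (a sketch; only the `integration` marker name comes from the commands above):

```python
import pytest

@pytest.mark.integration
def test_full_pipeline_smoke(tmp_path):
    """Selected only by `pytest -m "integration"`; excluded by `-m "not integration"`."""
    # The real integration tests in test_main.py exercise the whole pipeline;
    # this placeholder just shows how the marker is applied.
    assert tmp_path.exists()
```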
- `test_video_processor.py`: Tests for video and audio processing
- `test_transcriber.py`: Tests for speech-to-text conversion
- `test_key_moments.py`: Tests for content analysis with API mocking
- `test_web_generator.py`: Tests for HTML generation
- `test_main.py`: Integration tests for the full pipeline
The test suite includes:
- Unit tests for each component
- Integration tests for the full pipeline
- API mocking for external services
- Fixture-based test data
- Error handling verification
- Edge case validation
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request