VoiceGuard File Processor

A Python tool for batch processing audio files through the VoiceGuard backend system to detect AI-generated or manipulated voice content.

Set up the environment using: conda env create -f voiceguard-processor.yaml

Usage

python __main__.py /path/to/audio/directory \
    --extensions .wav .mp3 .m4a \
    --api-key YOUR_API_KEY \
    --backend-url https://your-backend-url/query \
    --output both

Command Line Arguments

| Argument | Description | Default |
| --- | --- | --- |
| `directory` | Directory containing files to process | Required |
| `--extensions` | Allowed file extensions | `.wav` |
| `--api-key` | API key for authentication | None (reads from `API_KEY` env var) |
| `--backend-url` | Backend GraphQL endpoint | Production VoiceGuard URL |
| `--output` | Output format: `csv`, `json`, or `both` | `csv` |
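
For orientation, the arguments above could be declared with `argparse` roughly as follows. This is a minimal sketch based only on the table, not the actual contents of `__main__.py`; the `build_parser` name and the help strings are assumptions, and the production default for `--backend-url` is left as `None` because this README does not state it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(
        description="Batch process audio files through the VoiceGuard backend."
    )
    parser.add_argument("directory", help="Directory containing files to process")
    parser.add_argument(
        "--extensions",
        nargs="+",
        default=[".wav"],
        help="Allowed file extensions (one or more)",
    )
    parser.add_argument(
        "--api-key",
        default=None,
        help="API key for authentication (falls back to the API_KEY env var)",
    )
    parser.add_argument(
        "--backend-url",
        default=None,  # assumption: the real tool substitutes its production URL
        help="Backend GraphQL endpoint",
    )
    parser.add_argument(
        "--output",
        choices=["csv", "json", "both"],
        default="csv",
        help="Output format",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Declaring `--extensions` with `nargs="+"` is what lets several extensions follow a single flag, as in the usage example.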

Environment Variables

You can set these environment variables instead of using command line arguments:

export API_KEY="your-api-key-here"
export BACKEND_URL="https://your-backend-url/query"

Output Formats

CSV Output (`results_YYYYMMDD_HHMMSS.csv`)

Contains basic analysis results:

| Column | Description |
| --- | --- |
| `original_filename` | Path to the original file |
| `file_id` | Backend system file ID |
| `stream_id` | Backend system stream ID |
| `status` | Processing status |
| `conclusion` | Analysis result (HUMAN, AI, INCONCLUSIVE) |
| `probability` | Confidence score (0.0-1.0) |
| `reason` | Additional details for inconclusive results |
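
A results file with these columns can be post-processed with the standard `csv` module. The sketch below assumes the CSV has a header row using exactly the column names listed above and that `probability` may be empty for some rows; `load_results` and `with_conclusion` are hypothetical helper names, not part of the tool.

```python
import csv


def load_results(handle):
    """Read result rows from an open CSV file and convert probability to float."""
    rows = []
    for row in csv.DictReader(handle):
        try:
            row["probability"] = float(row.get("probability"))
        except (TypeError, ValueError):
            # Assumption: an empty or missing probability means "not available".
            row["probability"] = None
        rows.append(row)
    return rows


def with_conclusion(rows, conclusion):
    """Keep only rows whose conclusion matches, e.g. "AI" or "INCONCLUSIVE"."""
    return [row for row in rows if row.get("conclusion") == conclusion]
```

Usage might look like `with open("results_YYYYMMDD_HHMMSS.csv", newline="") as fh: rows = load_results(fh)`, with the real timestamped filename substituted.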

JSON Output (`results_YYYYMMDD_HHMMSS.json`)

Contains detailed analysis data including:

- Complete stream metadata
- Segment-by-segment analysis
- Model results and preprocessing information
- Timestamps and processing details

Examples

Process WAV files with CSV output

python __main__.py ./audio_samples

Process multiple formats with detailed JSON output

python __main__.py ./recordings \
    --extensions .wav .mp3 .m4a .flac \
    --output json \
    --api-key abc123

Process with both output formats

python __main__.py ./test_audio \
    --output both \
    --extensions .wav .mp3

Configuration

Authentication

For production environments, you must provide an API key in one of two ways:

- Command line: `--api-key YOUR_KEY`
- Environment variable: `export API_KEY="YOUR_KEY"`

Localhost URLs (containing `localhost` or `127.0.0.1`) don't require authentication.
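
The rules above can be summarized in a few lines of Python. This is a sketch of the described behavior, not the tool's code: `resolve_api_key` is a hypothetical name, and giving the command-line value precedence over the environment variable is an assumption. The substring check follows the wording above ("containing `localhost` or `127.0.0.1`").

```python
import os


def is_localhost(url):
    """True when the URL contains localhost or 127.0.0.1, as described above."""
    return "localhost" in url or "127.0.0.1" in url


def resolve_api_key(cli_key, backend_url, environ=None):
    """Return the API key to use, or None when a localhost backend needs none."""
    env = os.environ if environ is None else environ
    # Assumption: the --api-key value wins over the API_KEY environment variable.
    key = cli_key or env.get("API_KEY")
    if key:
        return key
    if is_localhost(backend_url):
        return None
    raise ValueError("API key must be provided")
```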

Backend URL

The default backend URL points to the production VoiceGuard API. For development or custom deployments, pass `--backend-url`:

python __main__.py ./audio \
    --backend-url http://localhost:8080/query

Error Handling

The tool handles the following classes of errors:

- Network errors: Automatic retries with exponential backoff
- File processing errors: Logged and recorded in output files
- Timeout handling: Based on audio file duration + buffer time
- API errors: Detailed GraphQL error reporting
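
To illustrate the first and third items, here is a generic sketch of retrying with exponential backoff and of a duration-based timeout. The attempt count, delays, buffer value, and the exception types treated as retryable are all assumptions; this README does not state the values the tool actually uses.

```python
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, factor=2.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with growing delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay grows as base_delay, base_delay*factor, base_delay*factor^2, ...
            sleep(base_delay * factor ** attempt)


def processing_timeout(duration_seconds, buffer_seconds=60.0):
    """Timeout as audio duration plus a fixed buffer (buffer value is assumed)."""
    return duration_seconds + buffer_seconds
```

The `sleep` parameter exists only so the delay schedule can be inspected or skipped in tests.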

Logging

The tool provides informational logging by default:

- Processing progress and status updates
- Error messages and warnings
- Completion summaries

Debug logging is available internally for troubleshooting.

File Processing Flow

1. File Discovery: Recursively scan the directory for files with allowed extensions
2. Upload Process:
   - Calculate file metadata (SHA256, MIME type, size)
   - Create file blob in backend
   - Upload file to storage
   - Create processing job
3. Monitoring: Poll the backend for completion status
4. Results: Fetch and save analysis results
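
The local parts of this flow (file discovery and metadata calculation) can be sketched with the standard library alone. The function names are hypothetical, case-insensitive extension matching is an assumption, and the fallback MIME type is a placeholder; the backend calls (blob creation, upload, job creation, polling) are omitted because their API is not described here.

```python
import hashlib
import mimetypes
from pathlib import Path


def discover_files(directory, extensions=(".wav",)):
    """Recursively collect files whose suffix is in the allowed set."""
    allowed = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(directory).rglob("*")
        if path.is_file() and path.suffix.lower() in allowed
    )


def file_metadata(path):
    """Compute SHA256, MIME type, and size for one file."""
    path = Path(path)
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large audio files are not loaded into memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    mime_type, _ = mimetypes.guess_type(path.name)
    return {
        "sha256": digest.hexdigest(),
        "mime_type": mime_type or "application/octet-stream",
        "size": path.stat().st_size,
    }
```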

Supported Audio Formats

The tool can process any audio format supported by the backend system. Common formats include:

- WAV (`.wav`)
- MP3 (`.mp3`)
- M4A (`.m4a`)
- FLAC (`.flac`)

Specify formats using the `--extensions` argument.

Troubleshooting

Common Issues

"No files found with allowed extensions"

- Check file extensions in your directory
- Verify the `--extensions` argument matches your files
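
One quick way to check which extensions a directory actually contains is to count them. This helper is a standalone sketch and not part of the tool; `extension_counts` is a hypothetical name, and suffixes are lowercased for the summary.

```python
from collections import Counter
from pathlib import Path


def extension_counts(directory):
    """Count files per (lowercased) extension, scanning subdirectories too."""
    return Counter(
        path.suffix.lower()
        for path in Path(directory).rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    import sys

    for suffix, count in extension_counts(sys.argv[1]).most_common():
        print(f"{suffix or '(no extension)'}: {count}")
```

If the extensions you expect do not appear in the output, adjust `--extensions` to match what is listed.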

"API key must be provided"

- Set the API key via `--api-key` or the `API_KEY` environment variable
- Localhost URLs don't require API keys

"Processing timed out"

- Large files may need more time
- Check network connectivity
- Verify the backend system is responding

Audio duration detection fails

- Install ffmpeg for broader format support
- Check file integrity
