Real-time speech-to-text with Voice Activity Detection (VAD), powered by JigsawStack's STT API.
- Voice Activity Detection: Automatically detects when you start/stop speaking
- Audio Accumulation: Sends the growing audio clip every second for evolving transcriptions
- Real-time Waveform: Visual feedback showing speech activity with a gradient animation (sketched after this list)
- Segment Management: Finalizes segments after 3 seconds of silence
- Modern UI: Clean interface with final (black) and interim (gray italic) text display
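
A minimal sketch of how such a waveform could be drawn, assuming an `AnalyserNode` wired to the microphone stream; the function name, canvas wiring, and gradient colors are illustrative, not taken from `index-vad.html`:

```ts
// Draws the analyser's time-domain signal on a canvas each animation
// frame. Ids, buffer handling, and colors are illustrative only.
function drawWaveform(analyser: AnalyserNode, canvas: HTMLCanvasElement): void {
  const ctx = canvas.getContext("2d")!;
  const samples = new Uint8Array(analyser.fftSize);

  const render = () => {
    analyser.getByteTimeDomainData(samples); // values 0..255, 128 = silence
    ctx.clearRect(0, 0, canvas.width, canvas.height);

    // Gradient stroke for the "speech activity" look
    const gradient = ctx.createLinearGradient(0, 0, canvas.width, 0);
    gradient.addColorStop(0, "#4f46e5");
    gradient.addColorStop(1, "#06b6d4");
    ctx.strokeStyle = gradient;
    ctx.lineWidth = 2;

    ctx.beginPath();
    const step = canvas.width / samples.length;
    samples.forEach((value, i) => {
      const y = (value / 255) * canvas.height;
      i === 0 ? ctx.moveTo(0, y) : ctx.lineTo(i * step, y);
    });
    ctx.stroke();

    requestAnimationFrame(render);
  };
  render();
}
```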
- Install dependencies:

  ```bash
  npm install   # or: yarn install
  ```

- Set your JigsawStack API key in `.env`:

  ```bash
  JIGSAWSTACK_API_KEY=your_api_key_here
  ```

- Run both servers in separate terminals:

  Terminal 1: WebSocket Server (Port 8080)

  ```bash
  npm start   # or: yarn start
  ```

  Terminal 2: HTTP Server (Port 3000)

  ```bash
  npm run start:http   # or: yarn start:http
  ```

- Open http://localhost:3000/index.html and allow microphone access.
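
Once both servers are up, you can sanity-check the WebSocket endpoint before opening the browser. A minimal sketch, assuming the `ws` package is available and the server accepts connections on the root path:

```ts
import WebSocket from "ws";

// Quick connectivity check against the STT WebSocket server.
const socket = new WebSocket("ws://localhost:8080");

socket.on("open", () => {
  console.log("WebSocket server is reachable");
  socket.close();
});

socket.on("error", (err) => {
  console.error("Could not reach the server:", err.message);
});
```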
- VAD monitors audio energy levels to detect speech
- When speech starts, audio is accumulated in a buffer
- Every 1 second, accumulated audio is sent for transcription
- Transcriptions evolve and improve as more audio context is added
- After 3 seconds of silence, the segment is finalized
- The process restarts fresh for the next speech segment (see the sketch below)
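
A minimal sketch of this loop, assuming `Float32Array` PCM chunks arrive from the Web Audio API; the energy threshold, helper names, and the `send` callback are illustrative stand-ins for the real client code in `index-vad.html`:

```ts
// Energy-based VAD with audio accumulation. Thresholds and names
// are illustrative; the actual client lives in index-vad.html.
const ENERGY_THRESHOLD = 0.01;   // RMS level treated as "speech"
const SEND_INTERVAL_MS = 1000;   // send the growing clip every second
const SILENCE_TIMEOUT_MS = 3000; // finalize after 3s of silence

let buffer: Float32Array[] = [];
let speaking = false;
let lastVoiceAt = 0;
let sendTimer: ReturnType<typeof setInterval> | null = null;

function rms(chunk: Float32Array): number {
  let sum = 0;
  for (const s of chunk) sum += s * s;
  return Math.sqrt(sum / chunk.length);
}

function onAudioChunk(
  chunk: Float32Array,
  send: (clip: Float32Array[], final: boolean) => void
): void {
  const voiced = rms(chunk) > ENERGY_THRESHOLD;
  if (voiced) lastVoiceAt = Date.now();

  if (voiced && !speaking) {
    // Speech started: begin accumulating and sending interim clips.
    speaking = true;
    buffer = [];
    sendTimer = setInterval(() => send(buffer, false), SEND_INTERVAL_MS);
  }

  if (speaking) {
    buffer.push(chunk);
    if (Date.now() - lastVoiceAt > SILENCE_TIMEOUT_MS) {
      // 3s of silence: finalize the segment and reset for the next one.
      if (sendTimer) clearInterval(sendTimer);
      send(buffer, true);
      speaking = false;
      buffer = [];
    }
  }
}
```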
Edit the config in index.html (line 382):
```js
{
  language: "en",        // Language code
  encoding: "wav",       // Audio format (wav, webm, pcm16)
  interimResults: true,  // Show evolving transcriptions
  format: "text"         // Output format (text or json)
}
```

- `server.ts` - WebSocket server handling STT requests
- `http-server.ts` - Serves the web interface
- `index-vad.html` - VAD-based client with waveform visualization
- `index.html` - Simple interval-based client (no VAD)
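
For orientation, a rough sketch of how a client might hand this config to the server after connecting; the message shapes (`type: "config"`, `isFinal`, `text`) are assumptions for illustration, not the documented protocol of `server.ts`:

```ts
// Hypothetical client flow: open the socket, send the config, then
// log interim and final transcriptions. The exact message schema used
// by server.ts may differ; treat this as a shape sketch only.
const socket = new WebSocket("ws://localhost:8080");

socket.addEventListener("open", () => {
  socket.send(JSON.stringify({
    type: "config",          // assumed envelope field
    language: "en",
    encoding: "wav",
    interimResults: true,
    format: "text",
  }));
});

socket.addEventListener("message", (event) => {
  const result = JSON.parse(event.data as string);
  // Interim results render gray/italic in the UI; final ones black.
  console.log(result.isFinal ? "FINAL:" : "interim:", result.text);
});
```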
- JigsawStack STT API
- Web Audio API for VAD and PCM capture
- WebSockets for real-time communication
- TypeScript + Node.js
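
As an example of the Web Audio API piece, a minimal PCM capture sketch using a `ScriptProcessorNode` (deprecated but broadly supported; the actual clients may use an `AudioWorklet` instead). The buffer size and `onChunk` callback are illustrative:

```ts
// Capture mono PCM from the microphone via the Web Audio API.
async function capturePcm(onChunk: (chunk: Float32Array) => void) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(stream);

  // 4096-sample buffers at the context sample rate (often 48 kHz).
  const processor = ctx.createScriptProcessor(4096, 1, 1);
  processor.onaudioprocess = (e) => {
    // Copy, because the underlying buffer is reused between callbacks.
    onChunk(new Float32Array(e.inputBuffer.getChannelData(0)));
  };

  source.connect(processor);
  processor.connect(ctx.destination); // required for processing to run
}
```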