A Node.js application that transcribes audio recordings into text and automatically saves them to Google Docs. Perfect for maintaining voice notes, meeting minutes, or any spoken content in a written format.
Create a .env file with the following variables:
OPENAI_SPEECH_API_KEY=
DOC_ID=The ID of the Google document where the text will be saved
TIMEZONE=your_timezone (example: Asia/Jerusalem)
PERSONAL_AUTH_TOKEN=A personal key of your choice, sent in the Authorization header when calling the API
# Notion
NOTION_API_KEY=Your Notion integration secret
NOTION_DATABASE_ID=The Notion database ID or full database URL
# Google Service Account Credentials
TYPE=
PROJECT_ID=
PRIVATE_KEY_ID=
PRIVATE_KEY=
CLIENT_EMAIL=
CLIENT_ID=
AUTH_URI=
TOKEN_URI=
AUTH_PROVIDER_X509_CERT_URL=
CLIENT_X509_CERT_URL=
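
The Google variables above correspond to the fields of a service-account JSON key file. As a minimal sketch (not necessarily how this app does it), the credentials object could be rebuilt from the environment as shown below. It assumes the variables are already loaded into process.env, for example with the dotenv package, and `buildServiceAccountCredentials` is a hypothetical helper name.

```javascript
// Hypothetical helper: maps the .env variable names listed above onto the
// field names used in a Google service-account JSON key file.
const FIELD_MAP = {
  TYPE: "type",
  PROJECT_ID: "project_id",
  PRIVATE_KEY_ID: "private_key_id",
  PRIVATE_KEY: "private_key",
  CLIENT_EMAIL: "client_email",
  CLIENT_ID: "client_id",
  AUTH_URI: "auth_uri",
  TOKEN_URI: "token_uri",
  AUTH_PROVIDER_X509_CERT_URL: "auth_provider_x509_cert_url",
  CLIENT_X509_CERT_URL: "client_x509_cert_url",
};

function buildServiceAccountCredentials(env = process.env) {
  // Fail early with a readable message if anything is unset.
  const missing = Object.keys(FIELD_MAP).filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const credentials = {};
  for (const [envName, field] of Object.entries(FIELD_MAP)) {
    credentials[field] = env[envName];
  }

  // .env files often keep the key on one line with literal "\n" sequences;
  // turn them back into real newlines so the PEM block is valid.
  credentials.private_key = credentials.private_key.replace(/\\n/g, "\n");
  return credentials;
}
```

Validating all variables up front makes a missing value show up at startup rather than on the first request.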
- Audio file transcription using OpenAI's Whisper model
- Automatic saving of transcriptions to Google Docs
- Timestamp recording for each transcription
- Support for M4A audio format
- RESTful API interface
- Node.js with Express.js
- OpenAI API (Whisper model for transcription)
- Google Docs API
- Multer for file upload handling
- Node.js installed
- OpenAI API key
- Google Cloud project with the Google Docs API enabled
- Google Service Account credentials
- Notion integration (optional): Notion API key and database
curl -F "audio=@[path to file].m4a;type=audio/m4a" \
-H "Authorization:[your personal auth token]" \
-X POST https://[your api url]/transcribe

The app can create a Notion page for each transcription when NOTION_API_KEY and NOTION_DATABASE_ID are configured. Notion updates run in parallel with Google Docs for speed.
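
The parallel save described above could look roughly like the following sketch. It is an illustration under assumptions, not the app's actual code: `saveToGoogleDoc` and `createNotionPage` are hypothetical async functions supplied by the caller, and using Promise.allSettled is a design choice of this example.

```javascript
// saveToGoogleDoc and createNotionPage are hypothetical async functions
// passed in by the caller; they stand in for the app's real integrations.
async function saveTranscription(
  text,
  { saveToGoogleDoc, createNotionPage },
  env = process.env
) {
  const tasks = { googleDocs: saveToGoogleDoc(text) };

  // Notion is optional: only start it when both variables are configured.
  if (env.NOTION_API_KEY && env.NOTION_DATABASE_ID) {
    tasks.notion = createNotionPage(text);
  }

  const names = Object.keys(tasks);
  // Both requests are already in flight here. allSettled waits for all of
  // them, so a failure in one target does not hide the result of the other.
  const results = await Promise.allSettled(Object.values(tasks));

  // Report "fulfilled" or "rejected" per target.
  return Object.fromEntries(names.map((name, i) => [name, results[i].status]));
}
```

With this shape, a Notion outage would still leave the Google Docs write reported as fulfilled.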
- Go to Notion Developers → Create a new internal integration.
- Copy the integration secret into NOTION_API_KEY.
You can use an existing database or create a new one. Share the database with your integration (Share → Invite → select your integration) so it has access.
Required/Supported properties (column names and types):
- Title property (type: Title)
  - Name can be anything. The app auto-detects the Title property.

- Content property (type: Rich text)
  - Recommended names: content, text, body, description, or notes. The app picks the first matching Rich text property.

- Tag properties (type: Multi-select). Recommended names:
  - tags: General tags. The app prioritizes existing options but may add new options here if needed.
    - The property must be named exactly tags (case-insensitive). This prevents conflicts with similarly named columns like project-tags.

  - category-tags: High-level categories matched conservatively via AI from your transcription.
    - The app will ONLY use options that already exist in this property and will NOT create new options.
    - Create the options you want to be eligible, for example: dev, health.

  - project-tags: Project-specific tags matched conservatively via AI.
    - The app will ONLY use options that already exist in this property and will NOT create new options.
    - Create the options you want to be eligible, for example: smart-journal, p1v3, p1v4.
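
To make the property rules above concrete, here is a minimal sketch of how the columns might be detected from a database schema. It assumes an object shaped like the response of Notion's "retrieve a database" endpoint (a `properties` map keyed by column name, each entry carrying a `type`). `detectProperties` is a hypothetical name, and both the fallback to the first Rich text column and the case-insensitive matching of category-tags and project-tags are assumptions of this example.

```javascript
// Assumption: these name lists mirror the recommendations in this README.
const CONTENT_NAMES = ["content", "text", "body", "description", "notes"];
const TAG_NAMES = ["tags", "category-tags", "project-tags"];

// `database` is assumed to look like { properties: { [name]: { type } } }.
function detectProperties(database) {
  const entries = Object.entries(database.properties);
  const richText = entries.filter(([, prop]) => prop.type === "rich_text");

  // Title: whichever property has type "title", regardless of its name.
  const title = entries.find(([, prop]) => prop.type === "title");

  // Content: prefer a recommended name, otherwise fall back to the first
  // Rich text property (the fallback is an assumption of this sketch).
  const content =
    richText.find(([name]) => CONTENT_NAMES.includes(name.toLowerCase())) ||
    richText[0];

  const detected = {
    title: title ? title[0] : null,
    content: content ? content[0] : null,
  };

  // Tag properties: Multi-select columns whose name matches exactly,
  // ignoring case. Missing ones stay null so the caller can skip them.
  for (const tagName of TAG_NAMES) {
    const match = entries.find(
      ([name, prop]) =>
        prop.type === "multi_select" && name.toLowerCase() === tagName
    );
    detected[tagName] = match ? match[0] : null;
  }
  return detected;
}
```

Because tag columns are matched on the full name, a column such as old-tags would be ignored rather than mistaken for tags.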
 
Notes:
- NOTION_DATABASE_ID may be the raw 32-character ID or the full database URL. The app extracts the ID automatically (hyphens and query params are handled).
- If the category-tags or project-tags properties are missing, they are simply skipped.
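
As an illustration of the ID handling mentioned in the notes, the sketch below accepts either a raw ID, a hyphenated UUID, or a full database URL. `extractNotionDatabaseId` is a hypothetical helper, and the parsing rules are assumptions based on the usual shape of Notion URLs rather than the app's actual implementation.

```javascript
// Hypothetical helper: returns the 32-character hex database ID.
function extractNotionDatabaseId(input) {
  // Drop the query string and fragment first: "?v=<view id>" contains a
  // second 32-character ID that must not be mistaken for the database ID.
  const withoutQuery = input.trim().split(/[?#]/)[0];

  // Keep only the last path segment (for a raw ID this is the whole string).
  const segment = withoutQuery.split("/").filter(Boolean).pop() || "";

  // Remove hyphens so UUID-style IDs and "Title-<id>" slugs both end in
  // 32 contiguous hex characters.
  const compact = segment.replace(/-/g, "");
  const match = compact.match(/[0-9a-f]{32}$/i);
  if (!match) {
    throw new Error(`No Notion database ID found in: ${input}`);
  }
  return match[0].toLowerCase();
}
```

A call such as `extractNotionDatabaseId(process.env.NOTION_DATABASE_ID)` at startup would normalize whichever form was configured.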
- Title: generated by AI to summarize the transcription.
- Content: the full transcription text (stored in the first suitable Rich text property).
- tags: generated by AI; reuses existing options when possible and may create new options if needed.
- category-tags: matched strictly against existing options; never creates new options.
- project-tags: matched strictly against existing options; never creates new options.
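
The difference between tags (may create options) and category-tags / project-tags (existing options only) can be sketched as a single filter with a flag. `resolveTags` and its `allowNew` option are hypothetical names, and the case-insensitive reuse of existing options is an assumption of this example. The return value uses the list of `{ name }` objects that Notion expects for a Multi-select value.

```javascript
// candidates: tag names proposed for a transcription (e.g. by the AI step).
// existingOptions: option names already defined on the Notion property.
function resolveTags(candidates, existingOptions, { allowNew = false } = {}) {
  // Index existing options case-insensitively so "Dev" reuses "dev".
  const byLowerName = new Map(
    existingOptions.map((name) => [name.toLowerCase(), name])
  );

  const resolved = [];
  for (const candidate of candidates) {
    const trimmed = candidate.trim();
    const key = trimmed.toLowerCase();
    if (!key) continue;

    let name = byLowerName.get(key);
    if (!name && allowNew) {
      // Only the general tags property is allowed to gain new options.
      name = trimmed;
      byLowerName.set(key, name);
    }
    if (name && !resolved.includes(name)) resolved.push(name);
  }

  // Multi-select values are written as a list of { name } objects.
  return resolved.map((name) => ({ name }));
}
```

For example, `resolveTags(found, options)` would suit category-tags and project-tags, while `resolveTags(found, options, { allowNew: true })` would suit tags.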
The app generates the title and extracts tags, category-tags, and project-tags in parallel to reduce latency.
