feat: Update OMI Glass firmware files - Add battery setup and charging guide documentation, update firmware.ino, add build scripts and UF2 binary, update source configuration files #2565


Open: wants to merge 36 commits into base: development

cyhuman (Contributor) commented Jun 19, 2025

cyhuman and others added 30 commits June 10, 2025 11:11
…ation integration" (BasedHardware#2526)

This reverts commit 24ca2d5, reversing
changes made to d2b723d.
…egration (BasedHardware#2528)

## Overview
This PR implements OpenGlass device integration for image capture,
processing, and conversation linking within the Omi ecosystem. The
system processes images captured during conversations, analyzes them
using GPT-4o Vision API, and integrates visual context into conversation
transcripts and summaries.

## Testing Status
**Platform**: iOS only - Android compatibility not tested
**Device**: OpenGlass hardware integration verified on iOS devices

## Core Architecture

### Image Processing Pipeline
- **AI Analysis**: GPT-4o Vision API integration with custom prompts for
contextual image understanding
- **Cloud Storage**: Google Cloud Storage integration with automatic
thumbnail generation using PIL
- **Duplicate Detection**: Jaccard similarity algorithm (0.8 threshold)
comparing image descriptions to prevent redundant processing
- **Concurrent Processing**: Asyncio-based concurrent image analysis
with configurable 30-second timeouts
- **Error Handling**: Exception handling for API failures, network
issues, and processing timeouts
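
The duplicate check described above can be sketched as follows. This is a minimal illustration, not the PR's code: the word-level tokenization, the `>=` comparison, and the names `jaccard_similarity` and `is_duplicate` are assumptions; only the Jaccard measure and the 0.8 threshold come from the description.

```python
import re
from typing import List


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the lowercase word sets of two descriptions."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    if not tokens_a and not tokens_b:
        return 1.0  # assumption: two empty descriptions count as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_duplicate(description: str, seen: List[str], threshold: float = 0.8) -> bool:
    """True if the description is at least `threshold` similar to any seen one."""
    return any(jaccard_similarity(description, s) >= threshold for s in seen)
```

Comparing descriptions rather than pixels means two visually different frames of the same scene can still be treated as duplicates, which appears to be the intent of the bullet above.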

### Conversation Integration
- **Temporal Linking**: Associates images with conversations using
10-minute time windows before and after image capture
- **Multi-device Support**: Links images across different connected
devices (Frame, Omi) for the same user account
- **Retroactive Processing**: `_execute_conversation_reprocessing`
function combines existing transcripts with newly added images
- **Structured Analysis**:
`get_combined_transcript_and_photos_structure` function creates
conversation summaries incorporating both audio and visual elements
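
One way to read the temporal linking rule is sketched below. The tuple layout, the function names, and the "closest conversation wins" tie-break are assumptions for illustration; the description only states the 10-minute window before and after capture.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

LINK_WINDOW = timedelta(minutes=10)

# Hypothetical shape: (conversation_id, started_at, finished_at)
Conversation = Tuple[str, datetime, datetime]


def gap_to_conversation(captured_at: datetime, started_at: datetime,
                        finished_at: datetime) -> timedelta:
    """Time between the capture and the conversation; zero if it falls inside."""
    if captured_at < started_at:
        return started_at - captured_at
    if captured_at > finished_at:
        return captured_at - finished_at
    return timedelta(0)


def find_conversation_for_image(captured_at: datetime,
                                conversations: List[Conversation]) -> Optional[str]:
    """Return the id of the closest conversation within the window, else None."""
    best_id: Optional[str] = None
    best_gap: Optional[timedelta] = None
    for conv_id, started_at, finished_at in conversations:
        gap = gap_to_conversation(captured_at, started_at, finished_at)
        if gap > LINK_WINDOW:
            continue  # outside the 10-minute window on either side
        if best_gap is None or gap < best_gap:
            best_id, best_gap = conv_id, gap
    return best_id
```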

### Database Implementation
- **ConversationPhoto Model**: Extended Pydantic model supporting both
legacy base64 encoding and cloud storage URLs
- **Firestore Integration**: Datetime handling for both timestamp
integers and ISO string formats
- **Photo Storage**: `/v1/conversations/{id}/photos` endpoint for
retrieving conversation images
- **Metadata Tracking**: Image capture time, processing status, and
storage location tracking
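
The mixed timestamp handling mentioned above could look roughly like this. It is a sketch under assumptions: integer timestamps are taken to be Unix seconds, naive values are taken to be UTC, and `to_utc_datetime` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from typing import Union


def to_utc_datetime(value: Union[int, float, str, datetime]) -> datetime:
    """Normalize an epoch number, ISO 8601 string, or datetime to aware UTC."""
    if isinstance(value, datetime):
        dt = value
    elif isinstance(value, (int, float)):
        # Assumption: numeric timestamps are seconds since the Unix epoch.
        return datetime.fromtimestamp(value, tz=timezone.utc)
    elif isinstance(value, str):
        # Older Python versions reject a trailing "Z", so rewrite it first.
        text = value[:-1] + "+00:00" if value.endswith("Z") else value
        dt = datetime.fromisoformat(text)
    else:
        raise TypeError(f"unsupported timestamp type: {type(value).__name__}")
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return dt.astimezone(timezone.utc)
```

Normalizing at the read boundary keeps the rest of the code working with a single aware `datetime` type, whichever format a stored document uses.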

## Frontend Implementation

### Capture Interface
- **OpenGlass Mixin** (`openglass_mixin.dart`): Core capture logic with
photo display and upload functionality
- **Capture UI** (`conversation_capturing/page.dart`): 774-line
implementation with integrated photo timeline and status indicators
- **Photo Widgets** (`capture/widgets/widgets.dart`): UI components for
image display during conversations
- **Processing Indicators**: Status updates for image upload, analysis,
and integration progress

### Device Management
- **Device Connection**: Extended `frame_connection.dart` and
`omi_connection.dart` with OpenGlass communication protocols
- **Connection Monitoring**: Battery and connection status reporting for
Frame devices
- **Device UI** (`home/device.dart`): 314-line redesign of device
management interface

### Real-time Communication
- **WebSocket Integration**: Enhanced `transcription_connection.dart`
with image handling for real-time photo streaming
- **Pusher Integration**: WebSocket communication for live image updates
during conversations
- **Live Photo Timeline**: Real-time photo display in conversation
interfaces

## Backend Implementation

### API Endpoints
- **Enhanced `/v2/files` Endpoint**: File upload handler with OpenGlass
image detection and processing triggers
- **Photo Retrieval API**: `/v1/conversations/{id}/photos` endpoint with
authentication and error handling
- **WebSocket Routes**: Enhanced `pusher.py` router with image streaming
capabilities
- **Conversation Reprocessing**: API endpoints for retroactive
conversation enhancement with image context

### Processing Logic
- **Image Detection**: Automatic OpenGlass image identification in file
upload pipeline
- **Conversation Linking**: `_link_images_to_recent_conversation`
function with temporal matching
- **Concurrent Analysis**: Parallel processing of multiple images with
resource management
- **Fallback Mechanisms**: Graceful degradation when AI analysis fails
or times out

### Storage Management
- **Cloud Storage**: Google Cloud Storage with automatic thumbnail
generation
- **CDN Delivery**: Image delivery through cloud CDN infrastructure
- **Legacy Support**: Support for base64 image storage during transition
period
- **Cleanup**: Cleanup of temporary files and failed uploads

## Technical Implementation

### Error Handling
- **Firestore Timestamps**: Unified handling of integer timestamps and
ISO string formats
- **Pydantic Validation**: Datetime object to string conversion in
`get_conversation_photos`
- **API Timeouts**: 30-second timeouts for GPT-4o Vision API calls with
exception handling
- **Network Resilience**: Retry logic for transient network failures and
API rate limits
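
A minimal sketch of per-image timeouts with graceful fallback, using `asyncio.wait_for` and `asyncio.gather`. The `describe` callable stands in for the vision API call and is hypothetical, as are the function names; returning `None` on failure is an assumed fallback, and retry logic is left out.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

Describe = Callable[[bytes], Awaitable[str]]


async def describe_with_timeout(describe: Describe, image: bytes,
                                timeout: float = 30.0) -> Optional[str]:
    """Run one analysis call; return None if it times out or raises."""
    try:
        return await asyncio.wait_for(describe(image), timeout=timeout)
    except asyncio.TimeoutError:
        return None  # fallback: treat a slow call as "no description"
    except Exception:
        return None  # deliberately broad in this sketch; log in real code


async def describe_all(describe: Describe, images: List[bytes],
                       timeout: float = 30.0) -> List[Optional[str]]:
    """Analyze all images concurrently, preserving input order."""
    tasks = (describe_with_timeout(describe, img, timeout) for img in images)
    return await asyncio.gather(*tasks)
```

Because each call carries its own timeout, one slow image does not delay or fail the others in the batch.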

### Performance
- **Concurrent Processing**: Asyncio-based parallel image analysis
- **Thumbnail Generation**: Automatic thumbnail creation for faster
loading
- **Database Queries**: Efficient Firestore queries with indexing
considerations
- **Memory Management**: Cleanup of image processing operations and
temporary files

### Data Consistency
- **DateTime Handling**: Consistent UTC datetime handling across image
and conversation operations
- **Atomic Updates**: Conversation reprocessing with atomic database
updates
- **Validation**: Multiple validation layers for image data,
conversation linking, and storage operations

## Code Quality
- **Debug Cleanup**: Removed 65+ debug statements from production code
- **Bug Fixes**: Fixed NameError in `process_conversation.py` affecting
OpenGlass processing
- **Error Handling**: Production-ready error handling and logging
- **Code Structure**: Clean separation of concerns across components
…egration (BasedHardware#2527)

…t protection for conversationCreating flag (30s safety timer) - Implement smart image throttling (allow up to 5 images during creation) - Add automatic recovery mechanisms for stuck conversation states - Reorganize live conversation UI: images gallery at top, transcript below - Use consistent grey styling for both image gallery and transcript boxes - Fix OpenGlass conversations to use same smart tags as audio conversations - Add proper error handling and failsafe timers for robust operation
## Critical Fixes

### Deadlock Prevention
- Add 30-second timeout protection for `conversationCreating` flag
- Implement smart image throttling (allow up to 5 images during
creation)
- Add automatic recovery mechanisms and failsafe timers
- Fix error handling to prevent permanent stuck states
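
The timeout and throttling behaviour listed above can be sketched language-neutrally. The actual change lives in the Flutter app; the Python below only illustrates the logic, and the class `CreationGuard`, its method names, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, Optional


class CreationGuard:
    """Tracks a 'conversation creating' flag that cannot stay set forever."""

    def __init__(self, timeout_s: float = 30.0, max_images: int = 5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._max_images = max_images
        self._clock = clock
        self._started_at: Optional[float] = None
        self._images = 0

    def begin(self) -> None:
        """Mark the start of conversation creation."""
        self._started_at = self._clock()
        self._images = 0

    def finish(self) -> None:
        """Clear the flag on success, failure, or timeout."""
        self._started_at = None
        self._images = 0

    @property
    def creating(self) -> bool:
        if self._started_at is None:
            return False
        if self._clock() - self._started_at >= self._timeout_s:
            self.finish()  # failsafe: recover from a stuck flag
            return False
        return True

    def accept_image(self) -> bool:
        """Allow images freely when idle, and up to max_images while creating."""
        if not self.creating:
            return True
        if self._images < self._max_images:
            self._images += 1
            return True
        return False
```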

### UI Improvements  
- Reorganize live conversation layout: images gallery at top, transcript
below
- Use consistent grey styling (`Colors.grey.shade900`) for all boxes
- Give priority to the "Capturing" state even when glasses are connected

### OpenGlass Enhancement
- Fix OpenGlass conversations to use same smart tags as audio
conversations
- Remove simplified processing, use full transcript structure analysis
…I consistency - unified content logic, proper categorization, and modular capture status
…I consistency (BasedHardware#2531)

## Problem
- OpenGlass photo-only conversations failed auto-summarization with
"transcript is empty" error
- Generic "Openglass" tags shown instead of actual conversation
categories
- Inconsistent device status display when multiple devices are active

## Solution
- Unified content detection logic for both auto-summarizer and apps
- Treat photo descriptions as valid content source equivalent to
transcripts
- Remove device-specific tags to show actual conversation categories
- Prioritize device status (glasses > phone mic) for consistent
"Capturing" display

## Changes
**Backend:**
- Add unified `has_photos` and `has_transcript` validation logic
- Fix auto-summarizer fallback to handle photo-only conversations
- Ensure apps and auto-summarizer use identical content processing
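
A rough sketch of the unified content check follows. The names `has_photos` and `has_transcript` come from the list above; the `Photo` and `Conversation` dataclasses, the `summarizable_content` helper, and the way the two sources are joined are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Photo:
    description: str = ""


@dataclass
class Conversation:
    transcript: str = ""
    photos: List[Photo] = field(default_factory=list)


def has_transcript(conversation: Conversation) -> bool:
    return bool(conversation.transcript.strip())


def has_photos(conversation: Conversation) -> bool:
    # Only photos with a non-empty description count as content.
    return any(p.description.strip() for p in conversation.photos)


def summarizable_content(conversation: Conversation) -> Optional[str]:
    """Text for the summarizer and apps alike, or None if there is none."""
    parts: List[str] = []
    if has_transcript(conversation):
        parts.append(conversation.transcript.strip())
    if has_photos(conversation):
        parts.extend(p.description.strip() for p in conversation.photos
                     if p.description.strip())
    return "\n".join(parts) or None
```

Routing both the auto-summarizer and apps through one helper like this is what keeps a photo-only conversation from being rejected as empty in one path and accepted in the other.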

**Frontend:**
- Remove generic "Openglass" tag logic from conversation schema
- Update device status priority in processing capture widget
- Modular capture status logic for current and future device types
…processing - Created dedicated OpenGlass router, extracted photo processing utilities, fixed image-audio transition edge cases, reduced code duplication by 200+ lines
… Created backend/utils/conversations/websocket_utils.py with Redis handling, conversation transitions, and datetime parsing - Eliminated inline imports and cleaned up 60+ lines of embedded logic - Fixed missing websockets.exceptions import - Maintains 100% backward compatibility with all endpoints and function signatures - Zero breaking changes, frontend unaffected
… misaligned print and continue statements in exception handler - All syntax errors resolved, backend can now start properly
… router from 900+ to 585 lines, removed circuit breakers and health monitoring, kept essential functionality
## **Files Changed**
- `backend/routers/openglass.py` (simplified 900+ → 585 lines)
- `backend/database/redis_db.py`
- `backend/utils/conversations/process_conversation.py` (simplified
session management)
- `backend/main.py` (OmiGlass router registration)
- `backend/utils/other/storage.py` (dedicated OmiGlass storage)
- `backend/routers/chat.py` (separated concerns)
- `app/lib/providers/device_provider.dart` (correct endpoint usage)
…r app to use 'omiglass_' prefix instead of 'openglass_' to match backend validation - Fix indentation error in chat.py exception handling block
…ve utility functions from routers to utils, add Redis helpers, remove direct Redis access, include conversations.py updates
mdmohsin7 and others added 6 commits June 11, 2025 16:10
…eConversation - Prevents AttributeError when processing external integration conversations - Applies getattr() to 5 locations in _get_structured and _get_conversation_obj - Minimal fix ensuring ExternalIntegrationCreateConversation processes without crashes
…g guide documentation, update firmware.ino, add build scripts and UF2 binary, update source configuration files

vercel bot commented Jun 19, 2025

@cyhuman is attempting to deploy a commit to the kodjima33's projects Team on Vercel.

A member of the Team first needs to authorize it.

beastoin (Collaborator) commented

2427

beastoin (Collaborator) commented Jun 23, 2025

since i can’t run the firmware’s functionality with the bare xiao esp32s3 board i have, please:

  • either wait until i have the full omiglass in hand (weeks)
  • or give me clear instructions on how to build the current setup - including any additional materials, wiring, or schematic (if it’s complex) (days)

btw, the omi app now supports the omiglass (tested with my board) - check the latest TestFlight / internal test builds.

PR: #2580

@cyhuman
