feat: Update OMI Glass firmware files - Add battery setup and charging guide documentation, update firmware.ino, add build scripts and UF2 binary, update source configuration files #2565
Open: cyhuman wants to merge 36 commits into BasedHardware:development from cyhuman:firmware-updates
…ation integration" (BasedHardware#2526) This reverts commit 24ca2d5, reversing changes made to d2b723d.
…egration (BasedHardware#2528)

## Overview

This PR implements OpenGlass device integration for image capture, processing, and conversation linking within the Omi ecosystem. The system processes images captured during conversations, analyzes them using GPT-4o Vision API, and integrates visual context into conversation transcripts and summaries.

## Testing Status

**Platform**: iOS only - Android compatibility not tested
**Device**: OpenGlass hardware integration verified on iOS devices

## Core Architecture

### Image Processing Pipeline

- **AI Analysis**: GPT-4o Vision API integration with custom prompts for contextual image understanding
- **Cloud Storage**: Google Cloud Storage integration with automatic thumbnail generation using PIL
- **Duplicate Detection**: Jaccard similarity algorithm (0.8 threshold) comparing image descriptions to prevent redundant processing
- **Concurrent Processing**: Asyncio-based concurrent image analysis with configurable 30-second timeouts
- **Error Handling**: Exception handling for API failures, network issues, and processing timeouts

### Conversation Integration

- **Temporal Linking**: Associates images with conversations using 10-minute time windows before and after image capture
- **Multi-device Support**: Links images across different connected devices (Frame, Omi) for the same user account
- **Retroactive Processing**: `_execute_conversation_reprocessing` function combines existing transcripts with newly added images
- **Structured Analysis**: `get_combined_transcript_and_photos_structure` function creates conversation summaries incorporating both audio and visual elements

### Database Implementation

- **ConversationPhoto Model**: Extended Pydantic model supporting both legacy base64 encoding and cloud storage URLs
- **Firestore Integration**: Datetime handling for both timestamp integers and ISO string formats
- **Photo Storage**: `/v1/conversations/{id}/photos` endpoint for retrieving conversation images
- **Metadata Tracking**: Image capture time, processing status, and storage location tracking

## Frontend Implementation

### Capture Interface

- **OpenGlass Mixin** (`openglass_mixin.dart`): Core capture logic with photo display and upload functionality
- **Capture UI** (`conversation_capturing/page.dart`): 774-line implementation with integrated photo timeline and status indicators
- **Photo Widgets** (`capture/widgets/widgets.dart`): UI components for image display during conversations
- **Processing Indicators**: Status updates for image upload, analysis, and integration progress

### Device Management

- **Device Connection**: Extended `frame_connection.dart` and `omi_connection.dart` with OpenGlass communication protocols
- **Connection Monitoring**: Battery and connection status reporting for Frame devices
- **Device UI** (`home/device.dart`): 314-line redesign of device management interface

### Real-time Communication

- **WebSocket Integration**: Enhanced `transcription_connection.dart` with image handling for real-time photo streaming
- **Pusher Integration**: WebSocket communication for live image updates during conversations
- **Live Photo Timeline**: Real-time photo display in conversation interfaces

## Backend Implementation

### API Endpoints

- **Enhanced `/v2/files` Endpoint**: File upload handler with OpenGlass image detection and processing triggers
- **Photo Retrieval API**: `/v1/conversations/{id}/photos` endpoint with authentication and error handling
- **WebSocket Routes**: Enhanced `pusher.py` router with image streaming capabilities
- **Conversation Reprocessing**: API endpoints for retroactive conversation enhancement with image context

### Processing Logic

- **Image Detection**: Automatic OpenGlass image identification in file upload pipeline
- **Conversation Linking**: `_link_images_to_recent_conversation` function with temporal matching
- **Concurrent Analysis**: Parallel processing of multiple images with resource management
- **Fallback Mechanisms**: Graceful degradation when AI analysis fails or times out

### Storage Management

- **Cloud Storage**: Google Cloud Storage with automatic thumbnail generation
- **CDN Delivery**: Image delivery through cloud CDN infrastructure
- **Legacy Support**: Support for base64 image storage during transition period
- **Cleanup**: Cleanup of temporary files and failed uploads

## Technical Implementation

### Error Handling

- **Firestore Timestamps**: Unified handling of integer timestamps and ISO string formats
- **Pydantic Validation**: Datetime object to string conversion in `get_conversation_photos`
- **API Timeouts**: 30-second timeouts for GPT-4o Vision API calls with exception handling
- **Network Resilience**: Retry logic for transient network failures and API rate limits

### Performance

- **Concurrent Processing**: Asyncio-based parallel image analysis
- **Thumbnail Generation**: Automatic thumbnail creation for faster loading
- **Database Queries**: Efficient Firestore queries with indexing considerations
- **Memory Management**: Cleanup of image processing operations and temporary files

### Data Consistency

- **DateTime Handling**: Consistent UTC datetime handling across image and conversation operations
- **Atomic Updates**: Conversation reprocessing with atomic database updates
- **Validation**: Multiple validation layers for image data, conversation linking, and storage operations

## Code Quality

- **Debug Cleanup**: Removed 65+ debug statements from production code
- **Bug Fixes**: Fixed NameError in `process_conversation.py` affecting OpenGlass processing
- **Error Handling**: Production-ready error handling and logging
- **Code Structure**: Clean separation of concerns across components
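
The duplicate-detection bullet above (Jaccard similarity over image descriptions with a 0.8 threshold) can be illustrated with a minimal sketch. The function names, the word-level tokenization, and the use of `>=` at the threshold are assumptions for illustration; the PR does not show how the backend actually tokenizes descriptions or compares against the threshold.

```python
import re

SIMILARITY_THRESHOLD = 0.8  # value quoted in the commit message


def tokenize(description: str) -> set[str]:
    """Lowercase the description and split it into a set of word tokens (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Return |A ∩ B| / |A ∪ B| over the word sets of two descriptions."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty descriptions are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_duplicate(new_description: str, seen_descriptions: list[str]) -> bool:
    """True if the new description is close enough to any earlier one."""
    return any(
        jaccard_similarity(new_description, old) >= SIMILARITY_THRESHOLD
        for old in seen_descriptions
    )
```

With this sketch, an image whose description differs from an earlier one by a single word would typically be skipped, while a description of a different scene would be processed.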
…egration (BasedHardware#2527)
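
The concurrent-processing and fallback bullets in the commit messages above (asyncio-based image analysis, a 30-second timeout, graceful degradation on failure) could look roughly like the following sketch. The `analyze` callable stands in for the vision API call and is hypothetical, as are the function names; returning `None` for a failed image is one possible fallback, not necessarily the one the backend uses.

```python
import asyncio
from typing import Awaitable, Callable, Optional

ANALYSIS_TIMEOUT_SECONDS = 30.0  # timeout quoted in the commit message

# Hypothetical signature: takes an image reference, returns a description.
Analyzer = Callable[[str], Awaitable[str]]


async def analyze_with_timeout(
    analyze: Analyzer, image: str, timeout: float = ANALYSIS_TIMEOUT_SECONDS
) -> Optional[str]:
    """Run one analysis; return None instead of raising on timeout or failure."""
    try:
        return await asyncio.wait_for(analyze(image), timeout=timeout)
    except asyncio.TimeoutError:
        return None  # analysis took too long
    except Exception:
        return None  # API or network failure; the caller decides how to degrade


async def analyze_all(
    analyze: Analyzer, images: list[str], timeout: float = ANALYSIS_TIMEOUT_SECONDS
) -> list[Optional[str]]:
    """Analyze all images concurrently, preserving input order in the result."""
    tasks = (analyze_with_timeout(analyze, image, timeout) for image in images)
    return await asyncio.gather(*tasks)
```

Because each task handles its own errors, one slow or failing image does not cancel the others.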
…t protection for conversationCreating flag (30s safety timer)
- Implement smart image throttling (allow up to 5 images during creation)
- Add automatic recovery mechanisms for stuck conversation states
- Reorganize live conversation UI: images gallery at top, transcript below
- Use consistent grey styling for both image gallery and transcript boxes
- Fix OpenGlass conversations to use same smart tags as audio conversations
- Add proper error handling and failsafe timers for robust operation
…fix transcript color consistency
…h phone mic active
## Critical Fixes

### Deadlock Prevention

- Add 30-second timeout protection for `conversationCreating` flag
- Implement smart image throttling (allow up to 5 images during creation)
- Add automatic recovery mechanisms and failsafe timers
- Fix error handling to prevent permanent stuck states

### UI Improvements

- Reorganize live conversation layout: images gallery at top, transcript below
- Use consistent grey styling (`Colors.grey.shade900`) for all boxes
- Priority to "Capturing" state even when Glasses are connected

### OpenGlass Enhancement

- Fix OpenGlass conversations to use same smart tags as audio conversations
- Remove simplified processing, use full transcript structure analysis
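
The deadlock-prevention items above describe app-side (Flutter) behavior. As a language-neutral illustration written in Python, the sketch below shows one way a "creating" flag can expire on its own after 30 seconds while admitting at most 5 images in the meantime. The class name `CreationGuard`, its methods, and the reading of "allow up to 5 images during creation" as a per-creation cap are all assumptions, not the app's actual code.

```python
import time
from typing import Callable, Optional


class CreationGuard:
    """Hypothetical guard: a creating flag with a failsafe timeout and an image cap."""

    def __init__(
        self,
        timeout_seconds: float = 30.0,  # safety timer from the commit message
        max_images: int = 5,            # throttle limit from the commit message
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout = timeout_seconds
        self._max_images = max_images
        self._clock = clock
        self._started_at: Optional[float] = None
        self._images_accepted = 0

    def begin(self) -> None:
        """Mark conversation creation as started."""
        self._started_at = self._clock()
        self._images_accepted = 0

    def finish(self) -> None:
        """Clear the flag, on success, on error, or when the failsafe fires."""
        self._started_at = None
        self._images_accepted = 0

    @property
    def creating(self) -> bool:
        """True while creation is in progress and the timer has not expired."""
        if self._started_at is None:
            return False
        if self._clock() - self._started_at >= self._timeout:
            self.finish()  # automatic recovery from a stuck state
            return False
        return True

    def accept_image(self) -> bool:
        """Always accept outside creation; accept at most max_images during it."""
        if not self.creating:
            return True
        if self._images_accepted < self._max_images:
            self._images_accepted += 1
            return True
        return False
```

Injecting the clock keeps the timeout testable without sleeping.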
… for proper categorization
…I consistency - unified content logic, proper categorization, and modular capture status
…I consistency (BasedHardware#2531)

## Problem

- OpenGlass photo-only conversations failed auto-summarization with "transcript is empty" error
- Generic "Openglass" tags shown instead of actual conversation categories
- Inconsistent device status display when multiple devices active

## Solution

- Unified content detection logic for both auto-summarizer and apps
- Treat photo descriptions as valid content source equivalent to transcripts
- Remove device-specific tags to show actual conversation categories
- Prioritize device status (glasses > phone mic) for consistent "Capturing" display

## Changes

**Backend:**

- Add unified `has_photos` and `has_transcript` validation logic
- Fix auto-summarizer fallback to handle photo-only conversations
- Ensure apps and auto-summarizer use identical content processing

**Frontend:**

- Remove generic "Openglass" tag logic from conversation schema
- Update device status priority in processing capture widget
- Modular capture status logic for current and future device types
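
The unified content check described above can be sketched as follows. Only the names `has_photos` and `has_transcript` come from the commit message; the `Conversation` and `Photo` dataclasses, `has_content`, and `content_for_summary` are hypothetical stand-ins for the real backend models and helpers.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    description: str = ""


@dataclass
class Conversation:
    # Hypothetical minimal shape; the real model has many more fields.
    transcript_segments: list[str] = field(default_factory=list)
    photos: list[Photo] = field(default_factory=list)


def has_transcript(conversation: Conversation) -> bool:
    """True if at least one transcript segment contains non-whitespace text."""
    return any(segment.strip() for segment in conversation.transcript_segments)


def has_photos(conversation: Conversation) -> bool:
    """True if at least one photo carries a non-empty description."""
    return any(photo.description.strip() for photo in conversation.photos)


def has_content(conversation: Conversation) -> bool:
    """Photo descriptions count as content on equal footing with transcripts."""
    return has_transcript(conversation) or has_photos(conversation)


def content_for_summary(conversation: Conversation) -> str:
    """Build the text handed to the summarizer from both content sources."""
    parts = [s.strip() for s in conversation.transcript_segments if s.strip()]
    parts += [
        f"[photo] {p.description.strip()}"
        for p in conversation.photos
        if p.description.strip()
    ]
    return "\n".join(parts)
```

Routing both the auto-summarizer and apps through a single check like `has_content` is what keeps a photo-only conversation from being rejected as empty by one path and accepted by the other.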
…processing - Created dedicated OpenGlass router, extracted photo processing utilities, fixed image-audio transition edge cases, reduced code duplication by 200+ lines
… Created backend/utils/conversations/websocket_utils.py with Redis handling, conversation transitions, and datetime parsing
- Eliminated inline imports and cleaned up 60+ lines of embedded logic
- Fixed missing websockets.exceptions import
- Maintains 100% backward compatibility with all endpoints and function signatures
- Zero breaking changes, frontend unaffected
… misaligned print and continue statements in exception handler
- All syntax errors resolved, backend can now start properly
Code fully refactored
… router from 900+ to 585 lines, removed circuit breakers and health monitoring, kept essential functionality
…gs to align with simplified architecture
## **Files Changed**

- `backend/routers/openglass.py` (simplified 900+ → 585 lines)
- `backend/database/redis_db.py`
- `backend/utils/conversations/process_conversation.py` (simplified session management)
- `backend/main.py` (OmiGlass router registration)
- `backend/utils/other/storage.py` (dedicated OmiGlass storage)
- `backend/routers/chat.py` (separated concerns)
- `app/lib/providers/device_provider.dart` (correct endpoint usage)
…r app to use 'omiglass_' prefix instead of 'openglass_' to match backend validation
- Fix indentation error in chat.py exception handling block
…ve utility functions from routers to utils, add Redis helpers, remove direct Redis access, include conversations.py updates
…eConversation
- Prevents AttributeError when processing external integration conversations
- Applies getattr() to 5 locations in _get_structured and _get_conversation_obj
- Minimal fix ensuring ExternalIntegrationCreateConversation processes without crashes
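
The `getattr()` fix above follows a common defensive pattern: read an optional attribute with a default instead of assuming every conversation type defines it. In the sketch below, the attribute name `photos` and both dataclass shapes are assumptions for illustration; the commit message does not say which attribute was missing on `ExternalIntegrationCreateConversation`.

```python
from dataclasses import dataclass


@dataclass
class CreateConversation:
    text: str
    photos: list  # present on device-originated conversations (assumed)


@dataclass
class ExternalIntegrationCreateConversation:
    text: str  # assumed to have no `photos` attribute at all


def photo_count(conversation: object) -> int:
    """Count photos without raising AttributeError on types that lack the field."""
    # getattr with a default covers a missing attribute; `or []` covers None.
    photos = getattr(conversation, "photos", None) or []
    return len(photos)
```

Accessing `conversation.photos` directly would raise `AttributeError` for the external-integration type, which is the kind of crash the commit describes.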
…eConversation processes without crashes (BasedHardware#2540)
…g guide documentation, update firmware.ino, add build scripts and UF2 binary, update source configuration files
@cyhuman is attempting to deploy a commit to the kodjima33's projects Team on Vercel. A member of the Team first needs to authorize it.
since i can’t run the firmware’s functionality with the bare xiao esp32s3 board i have, please:
btw, the omi app now supports the omiglass (tested with my board) - check the latest TestFlight / internal test builds. PR: #2580