🏆 1st Place at Qualcomm x Microsoft x Northeastern University On-Device AI Builders Hackathon
Introducing GuideSense — sensing obstacles, speaking solutions — your personal navigation companion.
GuideSense is a wheelchair navigation assistant that uses computer vision and voice control to provide real-time guidance and feedback. Our system achieves exceptional performance:
- YOLOv8n object detection: <40ms inference on Snapdragon X Elite CPU
- OpenAI Whisper voice interface: <10ms inference on NPU via Qualcomm AI Engine Direct SDK with ONNX Runtime QNN
- Real-Time Object Detection: Utilizes YOLO to detect objects in the environment and assess potential obstacles
- Audio Feedback: Provides concise audio feedback about surroundings, including object type and distance
- Voice Activation: Allows users to activate the system with voice commands ("Go" to start)
- Responsive Feedback: Interrupts ongoing audio to provide immediate updates about critical obstacles
- On-Device Processing: End-to-end processing with zero cloud dependency for privacy and minimal latency
- Real-Time Depth Estimation: Estimates object distances from YOLO detection results (see the sketch below)
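To illustrate the detection side, here is a minimal sketch (not GuideSense's actual code) of running YOLOv8n with the `ultralytics` package and deriving a rough distance estimate from bounding-box height; the focal length and assumed object heights are placeholder values.

```python
# Minimal sketch: YOLOv8n detection plus a naive pinhole-camera distance estimate.
# The focal length and assumed object heights are illustrative placeholders,
# not values taken from GuideSense.
from ultralytics import YOLO
import cv2

FOCAL_LENGTH_PX = 700.0                            # placeholder: calibrate for the real camera
ASSUMED_HEIGHT_M = {"person": 1.7, "chair": 0.9}   # rough real-world heights

model = YOLO("yolov8n.pt")               # nano model, small enough for CPU inference
frame = cv2.imread("example_frame.jpg")  # placeholder input image

for result in model(frame):
    for box in result.boxes:
        label = model.names[int(box.cls)]
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        pixel_height = y2 - y1
        # distance ≈ (real-world height * focal length) / pixel height
        real_h = ASSUMED_HEIGHT_M.get(label, 1.0)
        distance_m = real_h * FOCAL_LENGTH_PX / max(pixel_height, 1.0)
        print(f"{label}: ~{distance_m:.1f} m away")
```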
- Python 3.11
- Working webcam and microphone
- Pyenv for managing Python versions (optional but recommended)
- Install Qualcomm AI Engine Direct SDK
- Download Whisper-Base-En ONNX model
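After installing the SDK and downloading the model, a quick probe like the following can confirm that ONNX Runtime sees the QNN execution provider. This is a sketch, assuming an `onnxruntime` build with QNN support; the model path and `QnnHtp.dll` backend name are placeholders for your local setup.

```python
# Minimal sketch: verify the QNN execution provider is available and create a
# session that prefers the NPU, falling back to CPU.
import onnxruntime as ort

print(ort.get_available_providers())     # expect "QNNExecutionProvider" in the list

session = ort.InferenceSession(
    "whisper_base_en_encoder.onnx",       # placeholder: path to the downloaded model
    providers=["QNNExecutionProvider", "CPUExecutionProvider"],
    provider_options=[
        {"backend_path": "QnnHtp.dll"},   # HTP (Hexagon NPU) backend from the SDK
        {},                               # no options for the CPU fallback
    ],
)
print("session providers:", session.get_providers())
```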
- Clone the Repository:
git clone https://github.com/Hackathon-Team-404/GuideSense.git
cd GuideSense
- Set Up Python Environment:
pyenv install 3.11.11
pyenv local 3.11.11
- Create a Virtual Environment:
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install Dependencies:
pip install -r requirements.txt
- Set Up API Keys:
Create a `.env` file in the root directory:
OPENAI_API_KEY=your_openai_api_key_here
XAI_API_KEY=your_grok_api_key
Note: For basic functionality without LLM features, you can use an empty key:
OPENAI_API_KEY=""
- Run the Application:
python main.py
- Voice Commands:
  - Say "Go" to activate the system
- Quit the Application:
  - Press 'q' in the video window to quit
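To make this flow concrete, here is a minimal sketch of a "Go"-activated camera loop with a 'q' quit key. The function names (`wait_for_activation`, `process_frame`) are illustrative placeholders, not GuideSense's actual API.

```python
# Minimal sketch of the usage flow: wait for the spoken "Go", then run the
# camera loop until 'q' is pressed in the video window.
import cv2

def wait_for_activation() -> None:
    """Placeholder for the Whisper-based listener that blocks until 'Go' is heard."""
    ...

def process_frame(frame):
    """Placeholder for detection + guidance on a single frame."""
    return frame

wait_for_activation()
cap = cv2.VideoCapture(0)                    # default webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("GuideSense", process_frame(frame))
    if cv2.waitKey(1) & 0xFF == ord("q"):    # 'q' quits, as noted above
        break
cap.release()
cv2.destroyAllWindows()
```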
recognition/
├── main.py # Main script to run the application
├── situation_analyzer.py # Analyzes detected objects and provides guidance
├── audio_feedback.py # Handles audio feedback and text-to-speech
├── voice_control.py # Manages voice commands for activation and stopping
├── object_detector.py # Implements YOLO-based object detection
├── requirements.txt # Lists all Python dependencies
└── .env # Contains API keys (excluded from version control)
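For reference, a minimal sketch of reading those keys at startup, assuming the `python-dotenv` package (whether GuideSense actually loads its keys this way is not confirmed here):

```python
# Minimal sketch: load API keys from .env into the environment at startup.
# Assumes the python-dotenv package; GuideSense may load its keys differently.
import os
from dotenv import load_dotenv

load_dotenv()                                    # reads .env from the working directory
openai_key = os.getenv("OPENAI_API_KEY", "")     # an empty key is fine for basic use
xai_key = os.getenv("XAI_API_KEY", "")

if not openai_key:
    print("No OPENAI_API_KEY set; LLM features will be unavailable.")
```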
| Component | Performance | Hardware |
|---|---|---|
| YOLOv8n Object Detection | < 40ms inference time | Snapdragon X Elite CPU |
| OpenAI Whisper | < 10ms inference | Qualcomm NPU via AI Engine Direct SDK with ONNX Runtime QNN |
| System | End-to-end on-device processing | Zero cloud dependency |
- Integration with distance sensors (ultrasound, IR) for enhanced spatial awareness
- Implementation of SLAM (Simultaneous Localization and Mapping) for improved navigation
| Name |
|---|
| Tianyu Fang |
| Anson He |
| Dingyang Jin |
| Hao Wu |
| Harshil Chudasama |