Speech2Touch 🗣️👆

"Franke, I'll have a double espresso"

Speech2Touch converts voice input to touch output. It was originally designed to bring voice control to the Franke A600 coffee machine.

Built on the STM32WB55, it uses Picovoice to process speech and translates recognized commands into custom USB HID packets that simulate touchscreen input.

The INMP441 MEMS microphone is used for voice input.

Pre-built firmware: Download latest (select latest successful run → Artifacts)

Coverage: This project has been written about on Hackaday and Hackster.io.

☕️ Demo

S2T_V1_DEMO.mp4

🤖 Prototype Hardware

The shape and orientation of the protoboard were dictated by the position of the USB ports of the Franke A600.

🏭 Hardware-In-Loop (HIL) Test

S2T_HIL_Demo.mp4

📦 Getting Started

Prerequisites

STM32WB55 USB Dongle dev board
INMP441 microphone
Franke A600 (or compatible) touchscreen device
Qt (for HIL testing)
VSCode with STM32Cube extension
See Dockerfile for toolchain and package requirements

🚀 Container Build & Flash (Recommended)

Open in VSCode and reopen in dev container (F1 → "Dev Containers: Reopen in Container")
Configure and build:

cmake -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_TOOLCHAIN_FILE=/workspaces/Speech2Touch/cmake/gcc-arm-none-eabi.cmake \
  -S /workspaces/Speech2Touch -B /workspaces/Speech2Touch/build/Release -G Ninja
cmake --build /workspaces/Speech2Touch/build/Release --target all

Flash build/Release/Speech2Touch.bin to your STM32WB55

🛠️ Manual Build & Flash

Clone this repository
Set up the project in VSCode using the STM32Cube extension
Build and flash the firmware from VSCode
Connect the device to the coffee machine via USB

🧪 HIL Testing

The Hardware-In-Loop test suite uses a Qt GUI to emulate the Franke A600 touchscreen layout. It leverages Linux text-to-speech utilities to trigger the device and validates that commands activate the correct touch targets.

Build the Qt test suite:

cmake -DCMAKE_BUILD_TYPE=Test -S Speech2Touch -B Speech2Touch/build/Test -G Ninja
cmake --build Speech2Touch/build/Test --target all

Connect the embedded device with the latest firmware to the host PC USB port
Use dmesg to identify the /dev/input/eventX USB input device path
Run the automated test, replacing /dev/input/event10 with the device path identified in the previous step:

./build/Test/Test/hil/runner/test_full_loop --input /dev/input/event10

🏗️ Architecture Overview

[INMP441 Microphone] → [Picovoice Speech Recognition] → [STM32WB55 MCU] → [Custom USB HID] → [Touchscreen Device]

Input: INMP441 microphone captures audio
Processing: Picovoice library performs speech recognition and command extraction
Translation: Commands are mapped to touchscreen coordinates
Output: Custom USB HID packets simulate touch events
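
The translation and output stages can be illustrated with a minimal sketch. It assumes a single-touch report made of a report ID, a tip-switch byte, and 16-bit little-endian X and Y values in a logical range of 0 to 32767. The layout, the constants, and the function names are hypothetical; the actual report format is whatever the project's USB HID report descriptor defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real ones come from the
 * project's HID report descriptor. */
#define S2T_HID_LOGICAL_MAX 32767u
#define S2T_REPORT_ID       1u
#define S2T_REPORT_SIZE     6u

/* Scale a pixel coordinate into the HID logical range [0, S2T_HID_LOGICAL_MAX].
 * Out-of-range pixels are clamped to the last pixel of the axis. */
uint16_t s2t_scale_axis(uint32_t pixel, uint32_t pixels_total)
{
    if (pixels_total < 2u) {
        return 0u;
    }
    if (pixel >= pixels_total) {
        pixel = pixels_total - 1u;
    }
    /* 64-bit intermediate so the multiplication cannot overflow. */
    return (uint16_t)(((uint64_t)pixel * S2T_HID_LOGICAL_MAX) / (pixels_total - 1u));
}

/* Pack one report as: [report id][tip switch][x lo][x hi][y lo][y hi].
 * Returns the number of bytes written. */
size_t s2t_pack_touch_report(uint8_t out[S2T_REPORT_SIZE],
                             uint16_t x, uint16_t y, int touching)
{
    out[0] = (uint8_t)S2T_REPORT_ID;
    out[1] = touching ? 0x01u : 0x00u;
    out[2] = (uint8_t)(x & 0xFFu);
    out[3] = (uint8_t)(x >> 8);
    out[4] = (uint8_t)(y & 0xFFu);
    out[5] = (uint8_t)(y >> 8);
    return S2T_REPORT_SIZE;
}
```

A tap would typically be sent as two reports: one with the tip switch set, followed by one with it cleared at the same coordinates.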

🧵 Threading

sequenceDiagram
  participant INMP441 as INMP441 Microphone
  participant Audio as Audio Thread
  participant Speech as Speech Thread
  participant Touch as Touch Thread
  participant USB as USB Output

  par Audio Capture
    INMP441->>Audio: DMA
  and 
    Audio->>Audio: Process Audio
    Audio->>Speech: Audio Buffer
  end

  activate Speech
  Speech->>Speech: Speech Recognition
  Speech->>Speech: Convert to Target Coords
  deactivate Speech

  Speech->>Touch: Touch Coordinates
  Touch->>USB: USB HID Report
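
As an illustration of what the "Process Audio" step might involve, the sketch below converts raw microphone words into 16-bit PCM, the sample format Picovoice engines consume. It assumes the DMA buffer holds 32-bit words with the INMP441's 24-bit sample left-justified in the upper bits; the real alignment depends on how the audio peripheral is configured. The function names are hypothetical and are not taken from the firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one 32-bit I2S word to signed 16-bit PCM by keeping the top 16 bits.
 * Assumes the 24-bit sample is left-justified in the word (an assumption, see
 * the note above). The sign is reconstructed arithmetically so the result does
 * not rely on implementation-defined shifts or casts. */
int16_t s2t_i2s_word_to_pcm16(uint32_t word)
{
    uint16_t top = (uint16_t)(word >> 16);
    if (top & 0x8000u) {
        return (int16_t)((int32_t)top - 65536);
    }
    return (int16_t)top;
}

/* Convert a mono block of raw words into a PCM buffer of the same length. */
void s2t_convert_block(const uint32_t *in, int16_t *out, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        out[i] = s2t_i2s_word_to_pcm16(in[i]);
    }
}
```

Dropping the low 8 bits of each 24-bit sample is the simplest option; applying gain before truncation is a common alternative when the microphone signal is quiet.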

🔮 Extending

This project currently supports only the Franke A600. To adapt it to other targets, modify the following files:

Touch target files

The following files define the available touch targets, convert their coordinates into USB HID coordinates, and trigger the USB HID thread.

Core/Src/touch_targets.c
Core/Inc/touch_targets.h
Core/Src/touch_mapper.c
Core/Inc/touch_mapper.h
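
For orientation, a touch target table could look something like the sketch below: a static array that maps a recognized command to screen coordinates, plus a lookup function. The struct, the target names, and the coordinates are all hypothetical and are not copied from touch_targets.c; they only show the kind of data that would change for a different machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical target entry: a command name and the pixel to tap. */
typedef struct {
    const char *name;
    uint16_t x;
    uint16_t y;
} s2t_target_t;

/* Illustrative values only; a real table would hold the button positions
 * of the actual touchscreen layout. */
static const s2t_target_t s2t_targets[] = {
    { "espresso",   200u, 300u },
    { "cappuccino", 400u, 300u },
    { "hot water",  600u, 300u },
};

/* Return the target matching the command name, or NULL if there is none. */
const s2t_target_t *s2t_find_target(const char *name)
{
    if (name == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < sizeof s2t_targets / sizeof s2t_targets[0]; ++i) {
        if (strcmp(s2t_targets[i].name, name) == 0) {
            return &s2t_targets[i];
        }
    }
    return NULL;
}
```

Keeping the table as plain data means porting to another device is mostly a matter of replacing the entries, with the lookup logic left untouched.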

Picovoice configuration files

The precompiled Picovoice binary at Core/Lib/picovoice/libpicovoice.a is pulled directly from the Picovoice/picovoice repository.

The configuration files in Core/Lib/picovoice/include are specifically set up for a Franke A600, including using "Franke" as the wake-word. New configuration files can be generated from the Picovoice Console.

☑️ Roadmap

Replace the dev board and protoboard with a PCB.
Add unit tests.
Extract audio over RTT for tuning.
Decouple Franke A600-specific functionality so Speech2Touch is easier to adapt to other applications.

📜 License

MIT License. See LICENSE.md for details.
