decibri captures audio. These integrations process it. Use the table below to find the right one for your use case.
| Use case | Integration | Latency | Cost | Offline |
|---|---|---|---|---|
| Real-time local STT | Sherpa-ONNX | Low | Free | Yes |
| High-accuracy local STT | Whisper.cpp | Medium | Free | Yes |
| Wake word detection | Sherpa-ONNX KWS | Low | Free | Yes |
| Voice activity detection | Sherpa-ONNX VAD | Low | Free | Yes |
| Real-time cloud STT | Deepgram | Low | Pay-per-use (free tier) | No |
| Real-time cloud STT | AssemblyAI | Low | Pay-per-use | No |
| Real-time cloud STT | OpenAI Realtime | Low | Pay-per-use | No |
Local integrations (Sherpa-ONNX, Whisper.cpp) run entirely on-device. No API key, no network, no usage fees. Audio never leaves the machine. Trade-off: you supply the compute and manage the model files.
Cloud integrations (Deepgram, AssemblyAI, OpenAI) stream audio to an external API. Higher accuracy on some benchmarks, no local GPU required, and managed model updates. Trade-off: you need an API key and network connectivity, and you incur per-use costs.
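The trade-offs above can be sketched as a small selection helper. This is an illustrative snippet, not part of decibri's API: it encodes the comparison table as data and filters integrations by your constraints.

```python
# Hypothetical helper: encode the comparison table and pick matching
# integrations for a given set of constraints.
from dataclasses import dataclass

@dataclass(frozen=True)
class Integration:
    name: str
    use_case: str
    latency: str   # "Low" or "Medium"
    free: bool     # True if there is no per-use cost
    offline: bool  # True if it runs without network access

INTEGRATIONS = [
    Integration("Sherpa-ONNX", "Real-time local STT", "Low", True, True),
    Integration("Whisper.cpp", "High-accuracy local STT", "Medium", True, True),
    Integration("Sherpa-ONNX KWS", "Wake word detection", "Low", True, True),
    Integration("Sherpa-ONNX VAD", "Voice activity detection", "Low", True, True),
    Integration("Deepgram", "Real-time cloud STT", "Low", False, False),
    Integration("AssemblyAI", "Real-time cloud STT", "Low", False, False),
    Integration("OpenAI Realtime", "Real-time cloud STT", "Low", False, False),
]

def pick(offline=None, max_latency=None, free=None):
    """Return names of integrations satisfying every given constraint."""
    order = {"Low": 0, "Medium": 1}
    matches = []
    for item in INTEGRATIONS:
        if offline is not None and item.offline != offline:
            continue
        if free is not None and item.free != free:
            continue
        if max_latency is not None and order[item.latency] > order[max_latency]:
            continue
        matches.append(item.name)
    return matches

# Example: everything that runs offline with low latency.
print(pick(offline=True, max_latency="Low"))
# → ['Sherpa-ONNX', 'Sherpa-ONNX KWS', 'Sherpa-ONNX VAD']
```

For instance, `pick(free=True)` returns all four local integrations, since every on-device option in the table is free.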
- Real-time local transcription
- Detect spoken keywords with Sherpa-ONNX
- Detect speech vs. silence with Silero VAD