Stream live microphone audio to AssemblyAI for real-time cloud transcription using decibri and the official AssemblyAI SDK.
This integration captures live audio from your microphone using decibri and streams it to AssemblyAI's cloud API over a WebSocket. Transcription results return in real-time using a turn-based model, where speech is grouped into natural segments with partial and final results for each turn. There is no model download, no local inference, and no format conversion required.
Choose this when you need turn-based transcription with speech understanding features, keyterm prompting support, or EU data residency (via the streaming.eu.assemblyai.com endpoint). For a free-tier cloud option, see Deepgram. If your use case requires audio to stay entirely on-device, use the local integrations instead: sherpa-onnx (real-time streaming) or whisper.cpp (batch transcription).
Create a .env file in your project root:

```
ASSEMBLYAI_API_KEY=your_key_here
```
The dotenv package loads your API key from the .env file. If you set environment variables another way, you can skip it.
No model download is required. All processing happens in AssemblyAI's cloud.
Import decibri, the AssemblyAI SDK, and dotenv. Create a client with your API key.
```javascript
require('dotenv').config();

const Decibri = require('decibri');
const { AssemblyAI } = require('assemblyai');

const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });
```
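A missing key only surfaces later as a WebSocket authentication failure, so it can help to fail fast at startup. A small optional guard (the requireKey helper below is illustrative, not part of either SDK):

```javascript
// Optional guard: throw immediately if the key is missing instead of
// hitting an opaque auth error during the WebSocket handshake.
const requireKey = (env) => {
  if (!env.ASSEMBLYAI_API_KEY) {
    throw new Error('ASSEMBLYAI_API_KEY is not set; check your .env file');
  }
  return env.ASSEMBLYAI_API_KEY;
};

// const client = new AssemblyAI({ apiKey: requireKey(process.env) });
```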
Create a streaming transcriber with audio parameters that match decibri's configuration. The speechModel option is required and has no default. Omitting it will cause the connection to fail.
```javascript
const transcriber = client.streaming.transcriber({
  speechModel: 'u3-rt-pro',
  sampleRate: 16_000,
});
```
Register event handlers before calling connect(). This ensures no events are missed during the connection handshake.
```javascript
transcriber.on('open', ({ id }) => {
  console.log('Session:', id);
});

transcriber.on('turn', (turn) => {
  if (turn.transcript) {
    console.log(turn.transcript);
  }
});

transcriber.on('error', (err) => {
  console.error('AssemblyAI error:', err);
});

transcriber.on('close', (code, reason) => {
  console.log('Connection closed:', code, reason);
});
```
Connect to AssemblyAI, then start the microphone. Audio must only be sent after connect() resolves.
```javascript
await transcriber.connect();

const mic = new Decibri({ sampleRate: 16000, channels: 1 });
```
Send each audio chunk directly to AssemblyAI. No format conversion is needed. decibri's raw Int16 PCM Buffer is sent as-is via sendAudio().
```javascript
mic.on('data', (chunk) => {
  transcriber.sendAudio(chunk);
});
```
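Because the chunks are raw Int16 PCM, chunk sizes map directly to audio duration. A quick sketch of the arithmetic, assuming the 16 kHz mono configuration used above:

```javascript
// Each Int16 sample is 2 bytes; at 16 kHz mono, duration in ms is:
//   bytes / 2 (bytes per sample) / 16000 (samples per second) * 1000
const chunkDurationMs = (buf) => (buf.length / 2 / 16_000) * 1000;

console.log(chunkDurationMs(Buffer.alloc(3200))); // a 3200-byte chunk is 100 ms
```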
AssemblyAI groups speech into turns, which are natural segments of speech separated by pauses. Each turn emits multiple events as audio is processed:
- turn.end_of_turn === false means the result is partial and still being refined
- turn.end_of_turn === true means the turn is complete with a final transcript
- turn.turn_order increments with each new turn (starting from 0)
- turn.utterance is empty for partial results and contains the final text for completed turns

To show only final results, filter on end_of_turn:
```javascript
transcriber.on('turn', (turn) => {
  if (turn.end_of_turn && turn.transcript) {
    console.log(turn.transcript);
  }
});
```
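Alternatively, you can display partial results as they refine by overwriting the current line and only committing it when the turn ends. A minimal sketch; the turn shape follows the fields listed above, and the injectable write parameter is purely for illustration:

```javascript
// Overwrite the in-progress line for partials (carriage return, no newline);
// commit it with a newline once the turn completes.
const renderTurn = (turn, write = (s) => process.stdout.write(s)) => {
  if (!turn.transcript) return;
  const line = `\r${turn.transcript}`;
  write(turn.end_of_turn ? `${line}\n` : line);
};

// transcriber.on('turn', renderTurn);
```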
Stop the microphone and close the AssemblyAI connection when the user presses Ctrl+C.
```javascript
process.on('SIGINT', async () => {
  console.log('\nStopping...');
  mic.stop();
  await transcriber.close();
  process.exit(0);
});
```
```javascript
'use strict';

require('dotenv').config();

const Decibri = require('decibri');
const { AssemblyAI } = require('assemblyai');

const run = async () => {
  const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

  const transcriber = client.streaming.transcriber({
    speechModel: 'u3-rt-pro',
    sampleRate: 16_000,
  });

  transcriber.on('open', ({ id }) => {
    console.log('AssemblyAI connected. Session:', id);
  });

  transcriber.on('turn', (turn) => {
    if (turn.end_of_turn && turn.transcript) {
      console.log(turn.transcript);
    }
  });

  transcriber.on('error', (err) => {
    console.error('AssemblyAI error:', err);
  });

  transcriber.on('close', (code, reason) => {
    console.log('Connection closed:', code, reason);
  });

  await transcriber.connect();

  const mic = new Decibri({ sampleRate: 16000, channels: 1 });

  mic.on('data', (chunk) => {
    transcriber.sendAudio(chunk);
  });

  mic.on('error', (err) => {
    console.error('Mic error:', err.message);
  });

  process.on('SIGINT', async () => {
    console.log('\nStopping...');
    mic.stop();
    await transcriber.close();
    process.exit(0);
  });

  console.log('Listening... (Ctrl+C to stop)\n');
};

run().catch(console.error);
```
The transcriber options control how AssemblyAI processes your audio. Here are the key ones:
| Option | Value | Description |
|---|---|---|
| speechModel | 'u3-rt-pro' | Required. Universal-3 Pro Streaming model for highest accuracy. |
| sampleRate | 16000 | Must match decibri's sample rate. |
Additional options such as keyterm prompting and speaker diarization are available. See the AssemblyAI streaming documentation for the complete list.
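As an illustration, a keyterm prompt would be passed as an extra transcriber option alongside speechModel and sampleRate. The keytermsPrompt option name below is an assumption, not confirmed by this guide; verify it against the AssemblyAI streaming documentation before relying on it:

```javascript
// Hypothetical option name -- confirm `keytermsPrompt` against the
// AssemblyAI streaming documentation before use.
const options = {
  speechModel: 'u3-rt-pro',
  sampleRate: 16_000,
  keytermsPrompt: ['decibri', 'AssemblyAI', 'diarization'],
};

// const transcriber = client.streaming.transcriber(options);
```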