
RealtimeAudioClient

Interface for realtime voice AI model connections.

Lifecycle:

  1. connect(config) - establish WebSocket, send initial config
  2. sendAudio(frame) - stream user audio to model
  3. on('tool-call', …) - handle tool calls from model
  4. sendToolResponse(…) - return tool results
  5. on('audio', …) - receive model audio output
  6. updateConfig(…) - change prompt/tools mid-session (e.g., on flow transition)
  7. disconnect() - close cleanly
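
The lifecycle above can be sketched end to end against an in-memory stand-in. Everything here beyond the method and event names is an assumption for illustration: `FakeRealtimeClient`, the `ToolCall` shape, the fields of `RealtimeSessionConfig` and `RealtimeToolResponse`, and the handler signatures are hypothetical, since this page does not define them.

```typescript
// Hypothetical, trimmed-down shapes: the real RealtimeSessionConfig,
// RealtimeToolResponse, and event payloads are not shown on this page.
interface ToolCall { id: string; name: string }
interface RealtimeToolResponse { id: string; result: unknown }
interface RealtimeSessionConfig { systemPrompt: string; tools: string[] }
interface RealtimeEventMap {
  "tool-call": (calls: ToolCall[]) => void;
  audio: (frame: Uint8Array) => void;
}

// In-memory stand-in that records calls instead of opening a WebSocket.
class FakeRealtimeClient {
  readonly log: string[] = [];
  private handlers = new Map<string, unknown>();

  async connect(config: RealtimeSessionConfig): Promise<void> {
    this.log.push(`connect:${config.systemPrompt}`);
  }
  on<K extends keyof RealtimeEventMap>(event: K, handler: RealtimeEventMap[K]): void {
    this.handlers.set(event, handler);
  }
  sendAudio(frame: Uint8Array): void {
    this.log.push(`audio-in:${frame.byteLength}`);
    // Simulate the model reacting: one tool call, then 4 bytes of audio.
    this.emit("tool-call", [{ id: "call-1", name: "lookup" }]);
    this.emit("audio", new Uint8Array(4));
  }
  sendToolResponse(responses: RealtimeToolResponse[]): void {
    this.log.push(`tool-response:${responses.map((r) => r.id).join(",")}`);
  }
  async updateConfig(config: Partial<RealtimeSessionConfig>): Promise<void> {
    this.log.push(`update:${config.systemPrompt ?? "(unchanged)"}`);
  }
  async disconnect(): Promise<void> {
    this.log.push("disconnect");
  }
  private emit<K extends keyof RealtimeEventMap>(
    event: K,
    ...args: Parameters<RealtimeEventMap[K]>
  ): void {
    const handler = this.handlers.get(event) as
      | ((...a: Parameters<RealtimeEventMap[K]>) => void)
      | undefined;
    handler?.(...args);
  }
}

// Walks the seven lifecycle steps; returns the number of output bytes received.
async function runSession(client: FakeRealtimeClient): Promise<number> {
  let outputBytes = 0;
  await client.connect({ systemPrompt: "greeter", tools: ["lookup"] }); // step 1
  client.on("tool-call", (calls) => {                                    // step 3
    // step 4: answer every call, echoing its id so results can be correlated
    client.sendToolResponse(calls.map((c) => ({ id: c.id, result: { ok: true } })));
  });
  client.on("audio", (frame) => { outputBytes += frame.byteLength; });   // step 5
  client.sendAudio(new Uint8Array(320));                                 // step 2
  await client.updateConfig({ systemPrompt: "checkout" });               // step 6
  await client.disconnect();                                             // step 7
  return outputBytes;
}
```

Handlers are registered before the first audio frame is sent so that no early event is dropped; that ordering is a choice made for this sketch, not something the interface mandates.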

NOTE: For SDK-based implementations (e.g. @google/genai), session.receive() is an async iterator that completes after each turn and must be re-entered inside a while(true) loop. Raw-wire implementations (fetch()+Upgrade on Cloudflare Workers, direct ws WebSocket usage) do not have this quirk; the underlying WebSocket stays open for the session lifetime. Future authors porting this interface to a new transport should be aware that the SDK wrapper and the raw-wire path diverge at this seam.
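
The re-entry pattern described in the note can be sketched with a stand-in session. `FakeSession`, its `receive()` generator, the `ServerMessage` shape, and the `closed` flag are all hypothetical; they only mimic the described behaviour (the iterator completes after each turn) and are not the @google/genai API.

```typescript
// Hypothetical message shape and session; NOT the @google/genai API.
interface ServerMessage { text: string; turnComplete: boolean }

class FakeSession {
  closed = false;
  private turns: string[][];

  constructor(turns: string[][]) {
    this.turns = turns.slice();
  }

  // Yields the messages of ONE turn and then completes, like the quirk described above.
  async *receive(): AsyncGenerator<ServerMessage> {
    const turn = this.turns.shift();
    if (turn === undefined) {
      this.closed = true; // nothing left: the fake treats this as session end
      return;
    }
    for (const [i, text] of turn.entries()) {
      yield { text, turnComplete: i === turn.length - 1 };
    }
  }
}

// Re-enters receive() after every turn until the session reports closed.
// Returns the number of turns that delivered at least one message.
async function pump(
  session: FakeSession,
  onMessage: (msg: ServerMessage) => void,
): Promise<number> {
  let turns = 0;
  while (true) {
    let sawMessage = false;
    for await (const msg of session.receive()) {
      sawMessage = true;
      onMessage(msg);
    }
    // The exit condition is an assumption; a real SDK signals closure its own way.
    if (session.closed) break;
    if (sawMessage) turns++;
  }
  return turns;
}
```

A raw-wire implementation would not need the outer loop: it would attach one message listener to the WebSocket and keep it for the whole session.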

readonly capabilities: RealtimeCapabilities;

Capability flags. Static per implementation; declared at construction. Implementations MUST NOT throw from this accessor.


readonly connected: boolean;

Whether the client is currently connected to the model.


readonly model: string;

Model identifier configured for this client, e.g. "gpt-realtime".


readonly provider: string;

Short stable identifier for the provider, e.g. "gemini", "openai".

connect(config): Promise<void>;

Connect to the AI service with initial configuration.

Parameter  Type
config     RealtimeSessionConfig

Returns: Promise<void>


disconnect(): Promise<void>;

Disconnect gracefully.

Returns: Promise<void>


off<K>(event, handler): void;

Unsubscribe from events.

Type Parameter
K extends keyof RealtimeEventMap

Parameter  Type
event      K
handler    RealtimeEventMap[K]

Returns: void


on<K>(event, handler): void;

Subscribe to events from the model.

Type Parameter
K extends keyof RealtimeEventMap

Parameter  Type
event      K
handler    RealtimeEventMap[K]

Returns: void
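
For implementers, one way to back the `on`/`off` pair is a small typed emitter. This is a sketch under assumptions: `TypedEmitter` and its `emit` method are hypothetical helpers, and while the event names 'audio' and 'tool-call' come from this page, the handler signatures in this `RealtimeEventMap` are invented for illustration.

```typescript
// Hypothetical event map: 'audio' and 'tool-call' are named on this page,
// but the handler signatures here are assumptions.
interface RealtimeEventMap {
  audio: (frame: Uint8Array) => void;
  "tool-call": (calls: { id: string; name: string }[]) => void;
}

class TypedEmitter {
  private listeners = new Map<keyof RealtimeEventMap, Set<unknown>>();

  on<K extends keyof RealtimeEventMap>(event: K, handler: RealtimeEventMap[K]): void {
    let set = this.listeners.get(event);
    if (set === undefined) {
      set = new Set<unknown>();
      this.listeners.set(event, set);
    }
    // A Set makes subscribing the same function twice a no-op.
    set.add(handler);
  }

  off<K extends keyof RealtimeEventMap>(event: K, handler: RealtimeEventMap[K]): void {
    // Removal is by identity: pass the same function reference that was given to on().
    this.listeners.get(event)?.delete(handler);
  }

  emit<K extends keyof RealtimeEventMap>(
    event: K,
    ...args: Parameters<RealtimeEventMap[K]>
  ): void {
    this.listeners.get(event)?.forEach((h) => {
      // The cast is safe by construction: on() only stores handlers
      // whose type matches this event's entry in the map.
      (h as (...a: Parameters<RealtimeEventMap[K]>) => void)(...args);
    });
  }
}
```

Because `off` matches by reference, an inline arrow function passed to `on` cannot be unsubscribed later; callers that need to unsubscribe should keep the handler in a variable.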


ping(): Promise<boolean>;

Check connection health (WebSocket ping).

Returns: Promise<boolean>
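
Callers may want to bound how long a health check can take. The sketch below wraps `ping()` with a timeout; `Pingable` and `isHealthy` are hypothetical names, and treating a rejected or late ping as unhealthy is an assumption rather than documented behaviour.

```typescript
// Minimal slice of the interface needed for this sketch.
interface Pingable {
  ping(): Promise<boolean>;
}

// Resolves with the ping result, or false if ping() rejects
// or does not settle within timeoutMs.
function isHealthy(client: Pingable, timeoutMs: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    client.ping().then(
      (ok) => {
        clearTimeout(timer); // avoid keeping a stray timer alive
        resolve(ok);
      },
      () => {
        clearTimeout(timer);
        resolve(false);
      },
    );
  });
}
```

Resolving a promise a second time is a no-op, so a ping that settles after the timeout has fired does not change the result.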


optional requestResponse(instruction?): void;

Trigger a model response after a config update. Optional: not all providers support it.

Parameter     Type
instruction?  string

Returns: void


sendAudio(frame): void;

Send a PCM audio frame to the model.

Parameter  Type
frame      Uint8Array

Returns: void
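
Audio captured in a browser or audio pipeline often arrives as floating-point samples, so a conversion step is usually needed before calling `sendAudio`. The sketch below assumes the provider expects 16-bit signed little-endian PCM; this page does not state the sample format, so check the provider's requirements. `floatToPcm16` is a hypothetical helper.

```typescript
// Assumption: the provider expects 16-bit signed little-endian PCM.
// Converts samples in the range [-1, 1] to raw PCM bytes.
function floatToPcm16(samples: Float32Array): Uint8Array {
  const out = new Uint8Array(samples.length * 2);
  const view = new DataView(out.buffer);
  samples.forEach((sample, i) => {
    // Clamp first so out-of-range input cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, sample));
    // Scale asymmetrically: -1 maps to -32768 and +1 maps to 32767.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true); // true = little-endian
  });
  return out;
}
```

A caller would then pass the result straight through, for example `client.sendAudio(floatToPcm16(chunk))`.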


sendToolResponse(responses): void;

Send tool call results back to the model.

Parameter  Type
responses  RealtimeToolResponse[]

Returns: void


updateConfig(config): Promise<void>;

Update session configuration mid-call (system prompt, tools). Used when CapabilityHost signals 'reconfigure' after a flow transition.

An implementation may:

  • Send an in-place update (if the model supports it)
  • Disconnect and reconnect with the new config (fallback)
  • Use a session resumption handle for continuity

Parameter  Type
config     Partial<RealtimeSessionConfig>

Returns: Promise<void>
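
The three options above could be selected as follows. This is a sketch only: the capability flag names (`inPlaceUpdate`, `sessionResumption`), the `UpdateStrategy` labels, the config fields, and both helper functions are hypothetical, since the fields of `RealtimeCapabilities` and `RealtimeSessionConfig` are not listed on this page.

```typescript
// Hypothetical flags and config fields; the real types are not shown on this page.
interface UpdateCapabilities { inPlaceUpdate: boolean; sessionResumption: boolean }
interface SessionConfig { systemPrompt: string; tools: string[] }
type UpdateStrategy = "in-place" | "reconnect-with-resumption" | "reconnect";

// Prefer the least disruptive option the implementation supports.
function chooseUpdateStrategy(
  caps: UpdateCapabilities,
  resumptionHandle?: string,
): UpdateStrategy {
  if (caps.inPlaceUpdate) return "in-place";
  if (caps.sessionResumption && resumptionHandle !== undefined) {
    return "reconnect-with-resumption";
  }
  return "reconnect";
}

// A reconnect needs the full config, so merge the partial update into the
// current one. Keys explicitly set to undefined are skipped so they cannot
// erase an existing value.
function mergeConfig(current: SessionConfig, patch: Partial<SessionConfig>): SessionConfig {
  return {
    systemPrompt: patch.systemPrompt ?? current.systemPrompt,
    tools: patch.tools ?? current.tools,
  };
}
```

Keeping the merged config around after each update means the fallback path can always reconnect with a complete `RealtimeSessionConfig`, even though callers only pass the fields that changed.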