Voice — SDK Methods

The voice modality lives in its own subpath. Import the Voice client and the optional browser helpers from it.

TypeScript

import { Voice } from "@clutchcall/sdk/voice";
// browser softphone helpers live on the moqt subpath:
import { captureMicrophone, OpusPlayer } from "@clutchcall/sdk/moqt";

Python

from clutchcall.voice import Voice

`Voice`

The top-level client. It hands out three scoped helpers — calls, audioBridge, and agents — that carry your org id and the transport.

TypeScript

const v = new Voice({
  baseUrl: "https://engine.clutchcall.dev",   // control-plane origin
  apiKey:  process.env.CLUTCHCALL_CREDENTIALS!,
  orgId:   "org_abc",
});

v.calls;        // control plane
v.audioBridge;  // data plane
v.agents;       // AI-agent attach

Python

import os

v = Voice(
    base_url="https://engine.clutchcall.dev",
    api_key=os.environ["CLUTCHCALL_CREDENTIALS"],
    org_id="org_abc",
)

v.calls         # control plane
v.audio_bridge  # data plane
v.agents        # AI-agent attach

Option	Type	Required	Notes
`baseUrl`	`string`	yes	Control-plane origin (no trailing slash). Calls go over HTTPS with `Authorization: Bearer <apiKey>`.
`apiKey`	`string`	yes	API key with `voice:*` scopes.
`orgId`	`string`	yes	Default org / tenant id applied to every call.
`relayHost`	`string`	no	Relay hostname for the audio bridge. Defaults to `relay.clutchcall.dev`.
`fetch`	`function`	no	Override the fetch implementation (TypeScript only).
`webTransport`	`WebTransportFactory`	no	Inject a custom WebTransport factory for the audio bridge (TypeScript, for Node tests).
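As a rough illustration of the two constraints on baseUrl and apiKey (no trailing slash on the origin, bearer-token auth), the sketch below normalizes an origin and builds the header. Both function names are hypothetical and not part of the SDK, which presumably does the equivalent internally.

```python
def normalize_base_url(base_url: str) -> str:
    """Strip trailing slashes so request paths can be appended with a single '/'."""
    return base_url.rstrip("/")


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header in the documented 'Bearer <apiKey>' form."""
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return {"Authorization": f"Bearer {api_key}"}
```

Normalizing before constructing the client avoids double slashes if the origin comes from configuration you do not control.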

`Calls` — control plane

Reachable as v.calls. Originates calls and fetches their current state.

`calls.originate(args) → Promise<Call>`

Place an outbound call over a SIP trunk. Returns a Call handle once the call is dialing.

TypeScript

const call = await v.calls.originate({
  to:      "+15551234567",
  from:    "+15558675309",
  trunkId: "trunk_main",
  agent:   "healthcare-assistant",  // optional: engine attaches the bridge on answer
  ringTimeoutSec: 30,               // optional: server-clamped 5..120
});
console.log(call.sid, call.status);

Python

call = v.calls.originate(
    to="+15551234567",
    from_="+15558675309",
    trunk_id="trunk_main",
    agent="healthcare-assistant",   # optional
    ring_timeout_sec=30,            # optional, clamped 5..120
)
print(call.sid, call.status)

Argument	Type	Required	Notes
`to`	`string`	yes	E.164 destination.
`from`	`string`	yes	Caller-id, E.164 (`from_` in Python).
`trunkId`	`string`	yes	SIP trunk to route over (`trunk_id` in Python).
`agent`	`string`	no	AI agent id to attach automatically on answer.
`ringTimeoutSec`	`number`	no	Seconds to ring before giving up (`ring_timeout_sec` in Python). Server-clamped 5..120, default 30.
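To make the argument constraints concrete, here is a minimal client-side sketch that checks E.164 formatting and mirrors the documented ring-timeout clamp. The helper names are hypothetical; the server remains the authority, and the SDK may or may not perform similar checks.

```python
import re
from typing import Optional

# E.164: a '+' followed by at most 15 digits, the first of which is non-zero.
E164_RE = re.compile(r"\+[1-9]\d{1,14}")

RING_MIN_S, RING_MAX_S, RING_DEFAULT_S = 5, 120, 30


def is_e164(number: str) -> bool:
    """Return True if the whole string is a plausible E.164 number."""
    return E164_RE.fullmatch(number) is not None


def effective_ring_timeout(requested: Optional[int]) -> int:
    """Mirror the documented server clamp: 5..120 seconds, default 30."""
    if requested is None:
        return RING_DEFAULT_S
    return max(RING_MIN_S, min(RING_MAX_S, requested))
```

Checking numbers before calling originate() surfaces formatting mistakes locally instead of as a failed request.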

`calls.get({ sid }) → Promise<Call>`

Fetch the current state of a call by its sid. Use it to read status after originate().

TypeScript

const call = await v.calls.get({ sid });

Python

call = v.calls.get(sid=sid)

`Call` — one call handle

Returned by originate() and get(). Read-only fields plus two mutating methods.

Field	Type	Notes
`sid`	`string`	The only identifier you need to address audio.
`status`	`CallStatus`	`dialing` … `completed` (see below).
`to`	`string`	E.164 destination.
`from`	`string`	Caller-id (`from_` in Python).
`startedAt`	`string`	ISO-8601 start time.
`trunkId`	`string?`	SIP trunk (`trunk_id` in Python).
`agent`	`string?`	Attached AI agent id, if any.

`call.transfer(args | string) → Promise<void>`

Transfer to a PSTN number or re-attach a different agent. Provide exactly one of to / agent. Passing a bare string is shorthand for { to }.

TypeScript

await call.transfer({ to: "+15557654321" });   // forward to PSTN
await call.transfer("+15557654321");           // shorthand
await call.transfer({ agent: "billing-bot" }); // re-attach an agent

Python

call.transfer(to="+15557654321")     # forward to PSTN
call.transfer(agent="billing-bot")   # re-attach an agent
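The "exactly one of to / agent" rule can be enforced before the request is sent. The helper below is a hypothetical sketch, not an SDK function; it only shows the shape of the check.

```python
from typing import Optional


def transfer_target(to: Optional[str] = None, agent: Optional[str] = None) -> dict:
    """Return the single transfer target, rejecting zero or two targets."""
    # Both None or both set means the caller broke the exactly-one rule.
    if (to is None) == (agent is None):
        raise ValueError("provide exactly one of 'to' or 'agent'")
    return {"to": to} if to is not None else {"agent": agent}
```

You could then forward the result with `call.transfer(**transfer_target(to="+15557654321"))`.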

`call.hangup() → Promise<void>`

End the call and drop both audio tracks.

TypeScript

await call.hangup();

Python

call.hangup()

`AudioBridge` — data plane

Reachable as v.audioBridge (v.audio_bridge in Python). It opens the bidirectional audio bridge for one call and hands back a handle you hold for the call’s lifetime.

`audioBridge.attach(callSid, opts) → Promise<AudioBridge>`

Open the bridge as the server: subscribe the caller’s audio (uplink), publish audio back (downlink). Pass onUplink to receive caller frames.

TypeScript

const bridge = await v.audioBridge.attach(call.sid, {
  codec: "opus",          // default
  sampleRate: 48000,      // default
  channels: 1,            // default
  frameMs: 20,            // default
  onUplink: (frame, tsUs) => myAsr.feed(frame),
});

// push synthesized audio back to the caller:
myTts.onChunk((opus) => bridge.publishDownlink(opus));

Python

def on_uplink(frame: bytes, ts_us: int) -> None:
    asr.feed(frame)

bridge = v.audio_bridge.attach(
    call.sid,
    on_uplink=on_uplink,
    codec="opus",
    sample_rate=48000,
    channels=1,
    frame_ms=20,
)

tts.on_chunk(lambda opus: bridge.publish_downlink(opus))

Option	Type	Required	Notes
`onUplink`	`(frame, timestampUs) => void`	yes	Callback for inbound caller audio. Invoked once per encoded frame.
`codec`	`AudioCodec`	no	Default `opus`.
`sampleRate`	`number`	no	Default 48000.
`channels`	`number`	no	Default 1.
`frameMs`	`number`	no	Frame duration in ms. Default 20.
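For orientation, the defaults above imply a fixed number of samples per frame. The arithmetic below is a sketch with hypothetical helper names. The byte count applies only to uncompressed `pcm16` (2 bytes per sample); Opus frames are compressed and vary in size.

```python
def samples_per_frame(sample_rate: int = 48000, frame_ms: int = 20) -> int:
    """Samples per channel in one frame: rate (Hz) times duration (s)."""
    return sample_rate * frame_ms // 1000


def pcm16_frame_bytes(sample_rate: int = 48000, channels: int = 1,
                      frame_ms: int = 20) -> int:
    """Size of one uncompressed 16-bit PCM frame in bytes."""
    return samples_per_frame(sample_rate, frame_ms) * channels * 2
```

With the defaults, one frame holds 960 samples per channel, or 1920 bytes as mono `pcm16`.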

`audioBridge.attachCaller(callSid, opts) → Promise<AudioBridge>`

The browser-caller mirror of attach(): subscribe downlink (cloud → caller), publish uplink (mic → cloud). Pass onDownlink to receive audio for playback. (TypeScript; this is the softphone path.)

const bridge = await v.audioBridge.attachCaller(call.sid, {
  codec: "opus",
  onDownlink: (frame, tsUs) => player.push(tsUs, frame),
});

`AudioBridge` methods

Method	Direction	Notes
`publishDownlink(frame, timestampUs?)`	server → caller	Push one encoded frame to the caller (e.g. 20 ms Opus).
`publishUplink(frame, timestampUs?)`	caller → cloud	Browser-caller side; push one encoded mic frame.
`onUplink(cb)`	—	Swap the inbound-caller consumer mid-call.
`onDownlink(cb)`	—	Browser-caller side; swap the inbound-cloud consumer.
`close()`	—	Tear down both tracks and the underlying MoQT session.
`callSid`	—	The `sid` this bridge belongs to.

timestampUs is optional; if omitted the SDK stamps a monotonic microsecond clock. In Python the bridge exposes publish_downlink(frame, timestamp_us=…) and close().
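If you prefer to stamp frames yourself, two common options are a monotonic wall clock or a timestamp derived from the frame index. The sketch below shows both; the function names are hypothetical, and how the SDK derives its own default stamp is not specified beyond "monotonic microsecond clock".

```python
import time


def monotonic_us() -> int:
    """Current monotonic clock reading in microseconds."""
    return time.monotonic_ns() // 1_000


def frame_timestamp_us(frame_index: int, frame_ms: int = 20, start_us: int = 0) -> int:
    """Timestamp of the Nth frame, assuming fixed-duration frames with no gaps."""
    return start_us + frame_index * frame_ms * 1_000
```

Index-derived timestamps stay evenly spaced even when the producer delivers frames in bursts, which can suit synthesized audio; clock-derived stamps reflect actual arrival time.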

`Agents` — AI-agent attach

Reachable as v.agents. Bind a server-side speech-to-speech agent to a live call; the engine wires the audio bridge end-to-end, so you do not open an AudioBridge yourself.

`agents.attach(callSid, agent) → Promise<void>`

TypeScript

await v.agents.attach(call.sid, "healthcare-assistant");

Python

v.agents.attach(call.sid, "healthcare-assistant")

You can attach an agent two ways: pass agent to originate() (attached on answer) or call agents.attach() against a live sid (e.g. to hand a call from a human to a bot, or swap one bot for another).

Browser softphone helpers

These ship on the moqt subpath and turn a microphone into encoded Opus and play encoded Opus back, so the softphone never touches a WebRTC transport for media.

`captureMicrophone(publication, opts?) → Promise<MicCapture>`

Capture the mic, run AEC / AGC / noise-suppression, Opus-encode, and write each encoded frame to a MoQT audio publication. The returned stop() tears down the capture graph (it does not close the publication). The simplest softphone loop forwards captured frames into the bridge with publishUplink:

const bridge = await v.audioBridge.attachCaller(call.sid, {
  codec: "opus",
  onDownlink: (frame, tsUs) => player.push(tsUs, frame),
});

// forward each captured mic frame onto the uplink track
const mic = await captureMicrophone(
  { write: (tsUs, frame) => bridge.publishUplink(frame, tsUs) } as any,
  { audioConstraints: { echoCancellation: true, autoGainControl: true, noiseSuppression: true } },
);
// later:
mic.stop();

captureMicrophone writes onto an AudioPublication. The snippet above adapts it to the bridge’s publishUplink; if you work with MoqtClient directly you pass the publication returned by client.publishAudio(...).

`OpusPlayer`

Decode received Opus with WebCodecs and render through an AudioWorklet ring buffer (silence-padded on underrun). Construct with an AudioContext, start() once, then push() each frame you receive.

import { OpusPlayer } from "@clutchcall/sdk/moqt";

const ctx = new AudioContext();
const player = new OpusPlayer(ctx, { sampleRate: 48000, channels: 1 });
await player.start();

// feed it the frames from the downlink subscription:
// onDownlink: (frame, tsUs) => player.push(tsUs, frame)

player.close();
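The "silence-padded on underrun" behaviour can be pictured with a small buffer model. This is a conceptual sketch in Python, not OpusPlayer's implementation: the class name is hypothetical, and dropping the oldest samples on overflow is an assumption made here for simplicity.

```python
from collections import deque


class SilencePaddedBuffer:
    """FIFO of decoded samples; reads past the buffered data return zeros."""

    def __init__(self, capacity: int) -> None:
        # Assumption: on overflow the oldest samples are discarded.
        self._samples = deque(maxlen=capacity)

    def push(self, samples) -> None:
        """Append decoded samples from the network side."""
        self._samples.extend(samples)

    def pull(self, n: int) -> list:
        """Return exactly n samples for the audio callback, padding with silence."""
        available = min(n, len(self._samples))
        out = [self._samples.popleft() for _ in range(available)]
        out.extend([0.0] * (n - available))
        return out
```

Because `pull()` always returns the requested count, the render callback never stalls; a late frame is heard as a brief gap rather than a glitch in timing.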

Types and events

Type	Values
`CallStatus`	`dialing` \| `ringing` \| `in_progress` \| `completed` \| `failed` \| `no_answer`
`AudioCodec`	`opus` \| `pcm16` \| `g711_ulaw` \| `g711_alaw`

The control-plane surface is request/response only; it exposes no socket or stream of call events. To track a call’s lifecycle, poll calls.get({ sid }). The data plane is event-driven: the onUplink / onDownlink callbacks fire once per audio frame, and on link loss the underlying MoQT session reconnects automatically and re-establishes its publications and subscriptions.
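Since lifecycle tracking means re-fetching, a simple polling loop is usually enough. The sketch below assumes that `completed`, `failed`, and `no_answer` are the terminal statuses (the reference lists the values but does not label them), and `wait_until_terminal` is a hypothetical helper, not an SDK method.

```python
import time

# Assumption: these three CallStatus values are final; the rest are transient.
TERMINAL_STATUSES = frozenset({"completed", "failed", "no_answer"})


def wait_until_terminal(calls, sid, poll_interval_s=1.0, timeout_s=300.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll calls.get(sid=...) until the call reaches a terminal status."""
    deadline = clock() + timeout_s
    while True:
        call = calls.get(sid=sid)
        if call.status in TERMINAL_STATUSES:
            return call
        if clock() >= deadline:
            raise TimeoutError(f"call {sid} still {call.status} after {timeout_s}s")
        sleep(poll_interval_s)
```

Typical use would be `final = wait_until_terminal(v.calls, call.sid)`. The `sleep` and `clock` parameters exist only so the loop can be tested without real delays.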

Hold the AudioBridge handle (and, in Python, any subscription it owns) for the whole call. If it is garbage-collected, the engine calls into freed memory and both tracks go silent. Call close() explicitly when the call ends.
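One way to follow this advice in Python is to scope the bridge to a `with` block, which keeps a strong reference alive and guarantees `close()` runs. `held_bridge` is a hypothetical wrapper; it relies only on the `attach(...)` and `close()` calls documented above.

```python
from contextlib import contextmanager


@contextmanager
def held_bridge(audio_bridge, call_sid, on_uplink, **opts):
    """Attach a bridge, hold it for the duration of the block, always close it."""
    bridge = audio_bridge.attach(call_sid, on_uplink=on_uplink, **opts)
    try:
        yield bridge
    finally:
        # Runs on normal exit and on exceptions alike.
        bridge.close()
```

Usage would look like `with held_bridge(v.audio_bridge, call.sid, on_uplink) as bridge: ...`, with the call's media handling inside the block.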

Other languages

The Go, Rust, Java, and C# SDKs expose the same publish/subscribe primitives via MoqtClient on the voice/<sid>/{uplink,downlink} tracks; the typed Calls/AudioBridge/Agents convenience wrappers are TypeScript-first with Python parity. See Realtime Tracks for the raw publishAudio / subscribeAudio surface in every language.
