# Voice — SDK Methods

> The typed voice surface: Voice, Calls, Call, AudioBridge, Agents, plus the browser softphone helpers.

The voice modality lives in its own subpath. Import the `Voice` client and the
optional browser helpers from it.

  <Tab title="TypeScript">
```ts
import { Voice } from "@clutchcall/sdk/voice";
// browser softphone helpers live on the moqt subpath:
import { captureMicrophone, OpusPlayer } from "@clutchcall/sdk/moqt";
```
  </Tab>
  <Tab title="Python">
```python
from clutchcall.voice import Voice
```
  </Tab>

## `Voice`

The top-level client. It hands out three scoped helpers — `calls`,
`audioBridge`, and `agents` — that carry your org id and the transport.

  <Tab title="TypeScript">
```ts
const v = new Voice({
  baseUrl: "https://engine.clutchcall.dev",   // control-plane origin
  apiKey:  process.env.CLUTCHCALL_CREDENTIALS!,
  orgId:   "org_abc",
});

v.calls;        // control plane
v.audioBridge;  // data plane
v.agents;       // AI-agent attach
```
  </Tab>
  <Tab title="Python">
```python
import os

v = Voice(
    base_url="https://engine.clutchcall.dev",
    api_key=os.environ["CLUTCHCALL_CREDENTIALS"],
    org_id="org_abc",
)

v.calls         # control plane
v.audio_bridge  # data plane
v.agents        # AI-agent attach
```
  </Tab>

<ParamField path="baseUrl" type="string" required>
  Control-plane origin (no trailing slash). Calls go over HTTPS with
  `Authorization: Bearer <apiKey>`.
</ParamField>
<ParamField path="apiKey" type="string" required>
  API key with `voice:*` scopes.
</ParamField>
<ParamField path="orgId" type="string" required>
  Default org / tenant id applied to every call.
</ParamField>
<ParamField path="relayHost" type="string">
  Relay hostname for the audio bridge. Defaults to `relay.clutchcall.dev`.
</ParamField>
<ParamField path="fetch" type="function">
  Override the `fetch` implementation (TypeScript only).
</ParamField>
<ParamField path="webTransport" type="WebTransportFactory">
  Inject a custom WebTransport factory for the audio bridge (TypeScript, for
  Node tests).
</ParamField>
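
The `baseUrl` and `apiKey` rules above can be sketched as a small request builder. This is a minimal illustration, not the SDK's internals: `prepareRequest` and the example path are hypothetical names introduced here, and only the bearer header format and the "no trailing slash" rule come from the options above.

```ts
// Hypothetical helper, not part of the SDK: shows how a control-plane
// request could be assembled from the documented constructor options.
interface VoiceOptions {
  baseUrl: string;
  apiKey: string;
  orgId: string;
}

interface PreparedRequest {
  url: string;
  headers: Record<string, string>;
}

function prepareRequest(opts: VoiceOptions, path: string): PreparedRequest {
  // The docs ask for an origin without a trailing slash; strip any defensively.
  const origin = opts.baseUrl.replace(/\/+$/, "");
  const suffix = path.startsWith("/") ? path : `/${path}`;
  return {
    url: `${origin}${suffix}`,
    // Documented header shape: Authorization: Bearer <apiKey>
    headers: { Authorization: `Bearer ${opts.apiKey}` },
  };
}
```

Normalizing the origin up front means a stray trailing slash in configuration does not produce `//` in request URLs.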

## `Calls` — control plane

Reachable as `v.calls`. Originates calls and fetches their current state.

### `calls.originate(args) → Promise<Call>`

Place an outbound call over a SIP trunk. Returns a `Call` handle once the call
is `dialing`.

  <Tab title="TypeScript">
```ts
const call = await v.calls.originate({
  to:      "+15551234567",
  from:    "+15558675309",
  trunkId: "trunk_main",
  agent:   "healthcare-assistant",  // optional: engine attaches the bridge on answer
  ringTimeoutSec: 30,               // optional: server-clamped 5..120
});
console.log(call.sid, call.status);
```
  </Tab>
  <Tab title="Python">
```python
call = v.calls.originate(
    to="+15551234567",
    from_="+15558675309",
    trunk_id="trunk_main",
    agent="healthcare-assistant",   # optional
    ring_timeout_sec=30,            # optional, clamped 5..120
)
print(call.sid, call.status)
```
  </Tab>

<ParamField path="to" type="string" required>E.164 destination.</ParamField>
<ParamField path="from" type="string" required>Caller-id, E.164.</ParamField>
<ParamField path="trunkId" type="string" required>SIP trunk to route over.</ParamField>
<ParamField path="agent" type="string">AI agent id to attach automatically on answer.</ParamField>
<ParamField path="ringTimeoutSec" type="number">Seconds to ring before giving up. Server-clamped 5..120, default 30.</ParamField>
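
The parameter rules above can be checked on the client before a request is sent. The sketch below is an assumption-laden illustration: `checkOriginateArgs` and `clampRingTimeout` are hypothetical helpers, the regular expression encodes the general E.164 shape (a `+` followed by at most 15 digits), and the clamp mirrors the documented server behavior rather than replacing it.

```ts
interface OriginateArgs {
  to: string;
  from: string;
  trunkId: string;
  agent?: string;
  ringTimeoutSec?: number;
}

// E.164 shape: "+", a non-zero leading digit, at most 15 digits in total.
const E164 = /^\+[1-9]\d{1,14}$/;

// Mirrors the documented server clamp (5..120) and default (30) client-side.
function clampRingTimeout(sec?: number): number {
  if (sec === undefined || Number.isNaN(sec)) return 30;
  return Math.min(120, Math.max(5, sec));
}

// Hypothetical pre-flight check; throws on malformed input.
function checkOriginateArgs(args: OriginateArgs): OriginateArgs {
  if (!E164.test(args.to)) throw new Error(`"to" is not E.164: ${args.to}`);
  if (!E164.test(args.from)) throw new Error(`"from" is not E.164: ${args.from}`);
  if (args.trunkId.length === 0) throw new Error(`"trunkId" is required`);
  return { ...args, ringTimeoutSec: clampRingTimeout(args.ringTimeoutSec) };
}
```

Because the server clamps anyway, the client-side clamp is only useful for showing the effective timeout in a UI or log before the call is placed.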

### `calls.get({ sid }) → Promise<Call>`

Fetch the current state of a call by its `sid`. Use it to read status after
`originate()`.

  <Tab title="TypeScript">
```ts
const call = await v.calls.get({ sid });
```
  </Tab>
  <Tab title="Python">
```python
call = v.calls.get(sid=sid)
```
  </Tab>

## `Call` — one call handle

Returned by `originate()` and `get()`. Read-only fields plus two mutating
methods.

| Field       | Type         | Notes                                               |
| ----------- | ------------ | --------------------------------------------------- |
| `sid`       | `string`     | The only identifier you need to address audio.      |
| `status`    | `CallStatus` | `dialing` … `completed` (see below).                |
| `to`        | `string`     | E.164 destination.                                  |
| `from`      | `string`     | Caller-id (`from_` in Python).                      |
| `startedAt` | `string`     | ISO-8601 start time.                                |
| `trunkId`   | `string?`    | SIP trunk (`trunk_id` in Python).                   |
| `agent`     | `string?`    | Attached AI agent id, if any.                       |

### `call.transfer(args | string) → Promise<void>`

Transfer to a PSTN number **or** re-attach a different agent. Provide exactly
one of `to` / `agent`. Passing a bare string is shorthand for `{ to }`.

  <Tab title="TypeScript">
```ts
await call.transfer({ to: "+15557654321" });   // forward to PSTN
await call.transfer("+15557654321");           // shorthand
await call.transfer({ agent: "billing-bot" }); // re-attach an agent
```
  </Tab>
  <Tab title="Python">
```python
call.transfer(to="+15557654321")     # forward to PSTN
call.transfer(agent="billing-bot")   # re-attach an agent
```
  </Tab>
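
The "exactly one of `to` / `agent`" rule and the bare-string shorthand can be expressed as a small normalizer. This is a sketch under assumptions: `normalizeTransfer` and the `TransferTarget` shape are hypothetical and only illustrate the argument rules described above.

```ts
// The union type rejects `{ to, agent }` together at compile time.
type TransferArgs =
  | { to: string; agent?: never }
  | { agent: string; to?: never };

type TransferTarget =
  | { kind: "pstn"; to: string }
  | { kind: "agent"; agent: string };

function normalizeTransfer(input: TransferArgs | string): TransferTarget {
  // A bare string is shorthand for { to }.
  const args: { to?: string; agent?: string } =
    typeof input === "string" ? { to: input } : input;
  const hasTo = args.to !== undefined;
  const hasAgent = args.agent !== undefined;
  // Runtime guard for untyped callers: both set, or neither set, is an error.
  if (hasTo === hasAgent) {
    throw new Error("transfer() needs exactly one of `to` or `agent`");
  }
  return hasTo
    ? { kind: "pstn", to: args.to as string }
    : { kind: "agent", agent: args.agent as string };
}
```

The type-level check catches mistakes in TypeScript callers, while the runtime check covers plain JavaScript and values that arrive as `any`.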

### `call.hangup() → Promise<void>`

End the call and drop both audio tracks.

  <Tab title="TypeScript">
```ts
await call.hangup();
```
  </Tab>
  <Tab title="Python">
```python
call.hangup()
```
  </Tab>

## `AudioBridge` — data plane

Reachable as `v.audioBridge` (`v.audio_bridge` in Python). It opens the
bidirectional audio bridge for one call and hands back a handle you hold for the
call's lifetime.

### `audioBridge.attach(callSid, opts) → Promise<AudioBridge>`

Open the bridge **as the server**: subscribe to the caller's audio (uplink) and
publish audio back (downlink). Pass `onUplink` to receive caller frames.

  <Tab title="TypeScript">
```ts
const bridge = await v.audioBridge.attach(call.sid, {
  codec: "opus",          // default
  sampleRate: 48000,      // default
  channels: 1,            // default
  frameMs: 20,            // default
  onUplink: (frame, tsUs) => myAsr.feed(frame),
});

// push synthesized audio back to the caller:
myTts.onChunk((opus) => bridge.publishDownlink(opus));
```
  </Tab>
  <Tab title="Python">
```python
def on_uplink(frame: bytes, ts_us: int) -> None:
    asr.feed(frame)

bridge = v.audio_bridge.attach(
    call.sid,
    on_uplink=on_uplink,
    codec="opus",
    sample_rate=48000,
    channels=1,
    frame_ms=20,
)

tts.on_chunk(lambda opus: bridge.publish_downlink(opus))
```
  </Tab>

<ParamField path="onUplink" type="(frame, timestampUs) => void" required>
  Callback for inbound caller audio. Invoked once per encoded frame.
</ParamField>
<ParamField path="codec" type="AudioCodec">Default `opus`.</ParamField>
<ParamField path="sampleRate" type="number">Default 48000.</ParamField>
<ParamField path="channels" type="number">Default 1.</ParamField>
<ParamField path="frameMs" type="number">Frame duration in ms. Default 20.</ParamField>
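
To make the defaults concrete, the arithmetic below works out what one frame contains. The helper names are illustrative, and the byte count assumes `pcm16` means 16-bit linear PCM (2 bytes per sample). Opus frames are compressed and vary in size, so only the sample count applies to them.

```ts
interface FrameFormat {
  sampleRate: number; // Hz
  channels: number;
  frameMs: number;    // frame duration in milliseconds
}

// Samples per channel in one frame: sampleRate * frameMs / 1000.
function samplesPerFrame(f: FrameFormat): number {
  return (f.sampleRate * f.frameMs) / 1000;
}

// Bytes in one frame of uncompressed 16-bit PCM (2 bytes per sample).
function pcm16BytesPerFrame(f: FrameFormat): number {
  return samplesPerFrame(f) * f.channels * 2;
}

// How many frames flow per second in each direction.
function framesPerSecond(f: FrameFormat): number {
  return 1000 / f.frameMs;
}

const defaults: FrameFormat = { sampleRate: 48000, channels: 1, frameMs: 20 };
samplesPerFrame(defaults);    // 960
pcm16BytesPerFrame(defaults); // 1920
framesPerSecond(defaults);    // 50
```

At the defaults, `onUplink` therefore fires about 50 times per second, which is worth keeping in mind when the callback does any non-trivial work.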

### `audioBridge.attachCaller(callSid, opts) → Promise<AudioBridge>`

The **browser-caller** mirror of `attach()`: subscribe to the downlink (cloud → caller)
and publish the uplink (mic → cloud). Pass `onDownlink` to receive audio for playback.
(TypeScript; this is the softphone path.)

```ts
const bridge = await v.audioBridge.attachCaller(call.sid, {
  codec: "opus",
  onDownlink: (frame, tsUs) => player.push(tsUs, frame),
});
```

### `AudioBridge` methods

| Method                                          | Direction          | Notes                                                       |
| ----------------------------------------------- | ------------------ | ----------------------------------------------------------- |
| `publishDownlink(frame, timestampUs?)`          | server → caller    | Push one encoded frame to the caller (e.g. 20 ms Opus).     |
| `publishUplink(frame, timestampUs?)`            | caller → cloud     | Browser-caller side; push one encoded mic frame.            |
| `onUplink(cb)`                                  | —                  | Swap the inbound-caller consumer mid-call.                  |
| `onDownlink(cb)`                                | —                  | Browser-caller side; swap the inbound-cloud consumer.       |
| `close()`                                       | —                  | Tear down both tracks and the underlying MoQT session.      |
| `callSid`                                       | —                  | Read-only property: the `sid` this bridge belongs to.       |

`timestampUs` is optional; if omitted, the SDK stamps the frame from a monotonic
microsecond clock. In Python the bridge exposes
`publish_downlink(frame, timestamp_us=…)` and `close()`.
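
One way to picture the default stamping is a small closure over a monotonic clock. This is a sketch, not the SDK's implementation: `makeStamper` is a hypothetical name, the clock is injected so the example runs anywhere, and two details are assumptions made here (stamps are relative to when the stamper was created, and they are forced to be strictly increasing).

```ts
// `nowMs` is any monotonic millisecond clock, for example
// () => performance.now() in a browser or a recent Node release.
function makeStamper(nowMs: () => number): (explicitUs?: number) => number {
  const originMs = nowMs();
  let lastUs = -1;
  return (explicitUs?: number): number => {
    // A caller-supplied timestamp always wins.
    if (explicitUs !== undefined) return explicitUs;
    const us = Math.round((nowMs() - originMs) * 1000);
    // Guard against identical clock readings so stamps never repeat.
    lastUs = us > lastUs ? us : lastUs + 1;
    return lastUs;
  };
}
```

With 20 ms frames, consecutive stamps from a real clock land roughly 20,000 µs apart. Passing explicit timestamps is mainly useful when frames are generated faster than real time, for example from a pre-rendered TTS buffer.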

## `Agents` — AI-agent attach

Reachable as `v.agents`. Bind a server-side speech-to-speech agent to a live
call; the engine wires the audio bridge end-to-end, so you do **not** open an
`AudioBridge` yourself.

### `agents.attach(callSid, agent) → Promise<void>`

  <Tab title="TypeScript">
```ts
await v.agents.attach(call.sid, "healthcare-assistant");
```
  </Tab>
  <Tab title="Python">
```python
v.agents.attach(call.sid, "healthcare-assistant")
```
  </Tab>

> **TIP:**
> You can attach an agent two ways: pass `agent` to `originate()` (attached on
> answer) or call `agents.attach()` against a live `sid` (e.g. to hand a call from
> a human to a bot, or swap one bot for another).

## Browser softphone helpers

These ship on the `moqt` subpath and turn a microphone into encoded Opus and
play encoded Opus back, so the softphone never touches a WebRTC transport for
media.

### `captureMicrophone(publication, opts?) → Promise<MicCapture>`

Capture the mic, run AEC / AGC / noise-suppression, Opus-encode, and write each
**encoded** frame to a MoQT audio publication. The returned `stop()` tears down
the capture graph (it does not close the publication). The simplest softphone
loop forwards captured frames into the bridge with `publishUplink`:

```ts
const bridge = await v.audioBridge.attachCaller(call.sid, {
  codec: "opus",
  onDownlink: (frame, tsUs) => player.push(tsUs, frame),
});

// forward each captured mic frame onto the uplink track;
// the object below is a minimal stand-in for an `AudioPublication` (hence the cast)
const mic = await captureMicrophone(
  { write: (tsUs, frame) => bridge.publishUplink(frame, tsUs) } as any,
  { audioConstraints: { echoCancellation: true, autoGainControl: true, noiseSuppression: true } },
);
// later:
mic.stop();
```

> **NOTE:**
> `captureMicrophone` writes onto an `AudioPublication`. The snippet above adapts
> it to the bridge's `publishUplink`; if you work with `MoqtClient` directly you
> pass the publication returned by `client.publishAudio(...)`.

### `OpusPlayer`

Decode received Opus with WebCodecs and render through an AudioWorklet ring
buffer (silence-padded on underrun). Construct with an `AudioContext`, `start()`
once, then `push()` each frame you receive.

```ts
import { OpusPlayer } from "@clutchcall/sdk/moqt";

const ctx = new AudioContext();
const player = new OpusPlayer(ctx, { sampleRate: 48000, channels: 1 });
await player.start();

// feed it the frames from the downlink subscription:
// onDownlink: (frame, tsUs) => player.push(tsUs, frame)

player.close();
```
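
The "silence-padded on underrun" behavior can be illustrated with a plain ring buffer of decoded samples. This is a sketch of the general technique, not `OpusPlayer`'s actual code: `SampleRing` is a hypothetical class, and overwriting the oldest samples on overflow is an assumption made here.

```ts
// Hypothetical ring buffer of decoded PCM samples.
class SampleRing {
  private readonly buf: Float32Array;
  private readPos = 0;
  private writePos = 0;
  private size = 0;

  constructor(capacity: number) {
    this.buf = new Float32Array(capacity);
  }

  // Append decoded samples; when full, the oldest samples are overwritten.
  write(samples: Float32Array): void {
    for (let i = 0; i < samples.length; i++) {
      this.buf[this.writePos] = samples[i];
      this.writePos = (this.writePos + 1) % this.buf.length;
      if (this.size < this.buf.length) {
        this.size++;
      } else {
        this.readPos = (this.readPos + 1) % this.buf.length;
      }
    }
  }

  // Fill `out` completely; returns how many samples were padded with silence.
  read(out: Float32Array): number {
    const available = Math.min(this.size, out.length);
    for (let i = 0; i < available; i++) {
      out[i] = this.buf[this.readPos];
      this.readPos = (this.readPos + 1) % this.buf.length;
    }
    out.fill(0, available); // underrun: zeros render as silence
    this.size -= available;
    return out.length - available;
  }
}
```

Padding with zeros keeps the audio callback supplied with a full block on every tick, so a late network frame produces a brief gap instead of a stalled or glitching output.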

## Types and events

| Type         | Values                                                                |
| ------------ | --------------------------------------------------------------------- |
| `CallStatus` | `dialing` \| `ringing` \| `in_progress` \| `completed` \| `failed` \| `no_answer` |
| `AudioCodec` | `opus` \| `pcm16` \| `g711_ulaw` \| `g711_alaw`                        |

The control-plane surface is request/response only: there is no event socket for
call state. To track a call's lifecycle, re-fetch it with `calls.get({ sid })`. The
**data plane** is event-driven: the `onUplink` / `onDownlink` callbacks fire per
audio frame, and the underlying MoQT session auto-reconnects and replays the
publish/subscribe on link loss.
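
Since lifecycle tracking means re-fetching, a polling loop is the natural shape. The sketch below is one possible approach, not an SDK feature: `waitForTerminal` and its options are hypothetical, and treating `completed`, `failed`, and `no_answer` as final states is an assumption this page does not confirm.

```ts
type CallStatus =
  | "dialing" | "ringing" | "in_progress"
  | "completed" | "failed" | "no_answer";

interface CallSnapshot {
  sid: string;
  status: CallStatus;
}

// Assumption: these three statuses are final.
const TERMINAL: ReadonlySet<CallStatus> =
  new Set<CallStatus>(["completed", "failed", "no_answer"]);

interface PollOptions {
  intervalMs?: number;
  maxAttempts?: number;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  onChange?: (status: CallStatus) => void;
}

async function waitForTerminal(
  getCall: (sid: string) => Promise<CallSnapshot>,
  sid: string,
  opts: PollOptions = {},
): Promise<CallSnapshot> {
  const intervalMs = opts.intervalMs ?? 1000;
  const maxAttempts = opts.maxAttempts ?? 600;
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  let last: CallStatus | undefined;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const call = await getCall(sid);
    if (call.status !== last) {
      last = call.status;
      opts.onChange?.(call.status); // report each transition once
    }
    if (TERMINAL.has(call.status)) return call;
    await sleep(intervalMs);
  }
  throw new Error(`call ${sid} did not finish after ${maxAttempts} polls`);
}
```

With the SDK this could be wired as `waitForTerminal((sid) => v.calls.get({ sid }), call.sid)`. The attempt cap keeps a stuck call from polling forever; pick an interval that respects whatever rate limits apply to your API key.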

> **WARNING:**
> Hold the `AudioBridge` handle (and, in Python, any subscription it owns) for the
> whole call. If it is garbage-collected, the engine calls into freed memory and
> both tracks go silent. Call `close()` explicitly when the call ends.
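
One way to follow the warning above is to park every live bridge in a long-lived registry keyed by `sid`, and to close it explicitly when the call ends. This is a sketch of that pattern: `BridgeRegistry` is a hypothetical helper, and it relies only on the bridge having a `close()` method.

```ts
interface Closable {
  close(): void | Promise<void>;
}

// Holds a strong reference to each bridge for the lifetime of its call.
class BridgeRegistry<B extends Closable> {
  private readonly live = new Map<string, B>();

  hold(sid: string, bridge: B): B {
    if (this.live.has(sid)) {
      throw new Error(`a bridge is already held for ${sid}`);
    }
    this.live.set(sid, bridge);
    return bridge;
  }

  get(sid: string): B | undefined {
    return this.live.get(sid);
  }

  // Remove the bridge and close it; returns false if none was held.
  async release(sid: string): Promise<boolean> {
    const bridge = this.live.get(sid);
    if (bridge === undefined) return false;
    this.live.delete(sid);
    await bridge.close();
    return true;
  }

  get size(): number {
    return this.live.size;
  }
}
```

A typical flow would be `registry.hold(call.sid, await v.audioBridge.attach(...))` when the call is answered and `await registry.release(call.sid)` once the call reaches a final status.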

## Other languages

The Go, Rust, Java, and C# SDKs expose the same publish/subscribe primitives via
`MoqtClient` on the `voice/<sid>/{uplink,downlink}` tracks; the typed
`Calls`/`AudioBridge`/`Agents` convenience wrappers are TypeScript-first with
Python parity. See [Realtime Tracks](/concepts/realtime-tracks) for the raw
`publishAudio` / `subscribeAudio` surface in every language.
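
For SDKs without the typed wrappers, the track naming and direction rules described on this page can be captured in a few lines. The helpers below are hypothetical, and whether a given `MoqtClient` takes the track as a single string or as a namespace plus a name is an assumption to verify against that SDK.

```ts
type Direction = "uplink" | "downlink";
type Role = "server" | "caller";

// Track name pattern from this page: voice/<sid>/{uplink,downlink}
function voiceTrack(sid: string, direction: Direction): string {
  return `voice/${sid}/${direction}`;
}

// Server side subscribes to uplink and publishes downlink;
// the browser caller does the reverse.
function bridgeTracks(sid: string, role: Role): { subscribe: string; publish: string } {
  return role === "server"
    ? { subscribe: voiceTrack(sid, "uplink"), publish: voiceTrack(sid, "downlink") }
    : { subscribe: voiceTrack(sid, "downlink"), publish: voiceTrack(sid, "uplink") };
}
```

This matches the roles of `attach()` (server) and `attachCaller()` (browser caller) in the typed surface.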
