The shape
The substrate: MoQT over QUIC
Every modality reduces to MoQT tracks. A publisher names a track by (namespace, name), attaches a capability tag, and writes frames. The relay fans the track out to every subscriber that matched the namespace + capability. The publisher doesn’t know who’s subscribed; the subscriber doesn’t know who the publisher is.

| Track kind | Carries | Wire model |
|---|---|---|
| Audio | Continuous voice (Opus / PCM / G.711) | subgroup stream |
| Video | Encoded video; group per keyframe | subgroup stream |
| Frame | Opaque binary with per-frame priority | subgroup OR datagram (datagram=true) |
| Text | Reliable ordered messages | subgroup stream |
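
The fan-out rule above can be illustrated with a toy in-memory model. This is only a sketch: `Relay`, `TrackKey`, and the exact-match lookup on (namespace, capability) are hypothetical names and simplifications, not the relay's actual data structures or matching rules.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TrackKey:
    """A track is named by (namespace, name)."""
    namespace: str
    name: str


@dataclass
class Relay:
    # (namespace, capability) -> list of subscriber inboxes (hypothetical layout)
    subscribers: dict = field(default_factory=lambda: defaultdict(list))

    def subscribe(self, namespace: str, capability: str) -> list:
        """Register interest and return the inbox that will receive frames."""
        inbox = []
        self.subscribers[(namespace, capability)].append(inbox)
        return inbox

    def publish(self, track: TrackKey, capability: str, frame: bytes) -> int:
        """Copy one frame to every matching subscriber; return the fan-out count."""
        matched = self.subscribers.get((track.namespace, capability), [])
        for inbox in matched:
            # The subscriber sees the track name and payload, never the publisher.
            inbox.append((track.name, frame))
        return len(matched)
```

The publisher only learns how many inboxes matched in this toy version; in the model described above it would not learn even that.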
The relay mesh
The relay is the same binary as the engine, with the built-in relay role enabled. It runs shard-per-core: one shard per core, each shard binds the same UDP port with SO_REUSEPORT, and an eBPF program dispatches incoming QUIC packets to the shard that owns the connection’s destination CID. Each shard:
- accepts QUIC handshakes (ECDSA P-256)
- maintains its share of MoQT sessions
- forwards published frames to subscribers without leaving the shard
- gates each publish/subscribe through the namespace auth hook (JWT verify with namespace-scoped claims)
The mesh is reachable at relay.clutchcall.dev and by the per-edge POP code (relay-us.clutchcall.dev, relay-uk.clutchcall.dev, …).
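
One common way to let a packet dispatcher find the owning shard is to embed the shard index in the connection IDs the server issues. The sketch below models that idea in plain Python. It is an assumption for illustration only: the source does not say how the eBPF program maps a destination CID to a shard, and `NUM_SHARDS`, `new_connection_id`, and `shard_for_packet` are hypothetical names.

```python
import os
import zlib

NUM_SHARDS = 8  # hypothetical: one shard per core on an 8-core host


def new_connection_id(shard: int, length: int = 8) -> bytes:
    """Mint a server-chosen CID whose first byte records the owning shard."""
    if not 0 <= shard < NUM_SHARDS:
        raise ValueError("shard out of range")
    return bytes([shard]) + os.urandom(length - 1)


def shard_for_packet(dcid: bytes, is_initial: bool) -> int:
    """Pick the shard for an incoming packet from its destination CID."""
    if is_initial:
        # The client chose this CID, so no shard is embedded yet.
        # Hashing keeps retransmitted Initial packets on the same shard.
        return zlib.crc32(dcid) % NUM_SHARDS
    # Server-issued CID: the first byte is the shard index.
    return dcid[0] % NUM_SHARDS
```

Once the handshake completes on a shard, that shard would issue CIDs with its own index, so later packets land on it without any cross-shard lookup.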
Two ports, two stacks
The relay runs two QUIC stacks on two ports rather than multiplexing on one:

| Port | Stack | Carries |
|---|---|---|
| 443 | QUIC · MoQT | MoQT (audio / video / frame / text tracks) |
| 4433 | QUIC · HTTP/3 | HTTP/3 + WebTransport (REST, MCP, signalling) |
Why one core
Two reasons:
- Identical wire envelopes. The relay only ever sees the same MoQT framing regardless of which SDK published. There is no per-language parser to keep in sync, only the C++ core, which every SDK imports via FFI / WASM.
- Audio without a copy. µ-law / A-law to 16-bit PCM conversion is done in SIMD inside the core. Browser (WASM) and Node / Python / Go / Rust / Java / .NET (native FFI) call into the same code path with the same latency profile.

Connection lifecycle
- Connect. SDK dials MoQT on relay.clutchcall.dev:443 over QUIC (or WebTransport on browser). Tenant token presented as the first MoQT envelope.
- Handshake. The relay’s namespace_auth hook verifies the token and stamps a namespace scope on the session.
- Publish / subscribe. SDK opens MoQT publish and subscribe requests on demand. The relay routes by namespace and capability.
- Auto-reconnect. If the link drops, the SDK reconnects with capped exponential backoff and re-establishes every publication and subscription transparently. Application code doesn’t see it.
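
The auto-reconnect step can be sketched as a capped exponential backoff loop followed by a replay of the session's publications and subscriptions. The delays, attempt limit, and the `connect` / `restore` callables below are assumptions for illustration; the SDK's real parameters and internal API are not stated here.

```python
import time


def backoff_delays(base=0.5, cap=30.0, factor=2.0):
    """Yield base, base*factor, ... and never exceed cap."""
    delay = base
    while True:
        yield delay
        delay = min(cap, delay * factor)


def reconnect(connect, restore, max_attempts=10, sleep=time.sleep):
    """Retry connect() with capped backoff, then replay session state.

    connect: hypothetical callable that returns a session or raises ConnectionError.
    restore: hypothetical callable that re-opens every publication and subscription.
    """
    for _, delay in zip(range(max_attempts), backoff_delays()):
        try:
            session = connect()
        except ConnectionError:
            sleep(delay)  # wait before the next attempt
            continue
        restore(session)
        return session
    raise ConnectionError("relay unreachable after %d attempts" % max_attempts)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. A production version would usually add random jitter to each delay so that many clients do not reconnect in lockstep.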
Where things run
- Browsers. TypeScript SDK speaks MoQT directly over native WebTransport (no FFI, no custom framing). The C++ core is compiled to WebAssembly via Emscripten for audio APM + framing fast paths.
- Native runtimes. SDK loads clutchcall_core_ffi.so/.dylib/.dll via the language’s native loader: JNI (Java), P/Invoke (.NET), CGO (Go), libloading (Rust), ctypes (Python).
- Unity. .NET runtime SDK plus a com.clutchcall.transport UPM package that exposes INetworkInterface over the games modality, a drop-in for Unity Netcode for GameObjects / Entities. See Netcode (Unity).
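
For the Python case, loading the core through ctypes might look like the sketch below. The file names follow the list above, while `core_library_name`, `load_core`, and the search directory are hypothetical; the SDK's actual lookup logic and exported symbols are not documented here.

```python
import ctypes
import sys
from pathlib import Path


def core_library_name(platform: str = sys.platform) -> str:
    """Map the running platform to the shared-library file name."""
    if platform.startswith("win"):
        return "clutchcall_core_ffi.dll"
    if platform == "darwin":
        return "clutchcall_core_ffi.dylib"
    return "clutchcall_core_ffi.so"


def load_core(search_dir: Path) -> ctypes.CDLL:
    """Load the core from search_dir; raises OSError if the file is missing."""
    return ctypes.CDLL(str(search_dir / core_library_name()))
```

After loading, each exported function would still need its `argtypes` and `restype` declared before being called safely.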
Direct-media (voice)
For server-side AI calls (default_app=AI_BIDIRECTIONAL_STREAM), the
voice path uses direct-media between the carrier and the agent
runtime. The gateway negotiates SIP signalling with the carrier, then
publishes the SDP answer pointing at the agent runtime’s RTP socket;
RTP flows straight from the carrier to the runtime. The gateway is
signalling-only on that path.
For SIP/RTP-only calls (no AI bridge), the gateway still terminates RTP
and runs a local VAD. So the same call_sid can take either RTP
path depending on default_app.
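
The routing decision above reduces to choosing which RTP address goes into the SDP answer. A minimal sketch, assuming hypothetical names (`RtpPath`, `rtp_path`, `sdp_media_address`) and treating every non-AI value of default_app as gateway-terminated:

```python
from enum import Enum


class RtpPath(Enum):
    DIRECT_TO_AGENT = "direct-media"   # carrier sends RTP straight to the agent runtime
    GATEWAY_TERMINATED = "gateway"     # gateway terminates RTP and runs local VAD


def rtp_path(default_app: str) -> RtpPath:
    """Choose the RTP path for a call from its default_app setting."""
    if default_app == "AI_BIDIRECTIONAL_STREAM":
        return RtpPath.DIRECT_TO_AGENT
    return RtpPath.GATEWAY_TERMINATED


def sdp_media_address(default_app: str, gateway_addr: str, agent_addr: str) -> str:
    """Return the address the SDP answer should advertise for RTP."""
    if rtp_path(default_app) is RtpPath.DIRECT_TO_AGENT:
        return agent_addr
    return gateway_addr
```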
Legacy RPC (still supported)
The original control plane was a method-id RPC envelope over QUIC. ClutchCallClient (dial, originate_bulk, hangup, barge, push_audio, …) used this surface. It’s kept for backward compatibility; new code should use the Voice modality. See Envelope Format for the full wire detail.
Code generation
Method IDs and DTO definitions in every language come from one IDL (api/clutchcall.json) compiled by apirpc_compiler.py. The
modality clients’ wire formats are similarly generated from the same
IDL, so a new modality method only needs an IDL edit + compiler run.
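
To make the single-IDL idea concrete, here is a toy generator that turns a JSON method list into Python constants. The IDL shape used (`{"methods": [{"name": ..., "id": ...}]}`) and the function `generate_method_ids` are invented for illustration; the real schema of api/clutchcall.json and the compiler's output format are not shown in this document.

```python
import json


def generate_method_ids(idl_text: str) -> str:
    """Emit one NAME = id constant per method in a (hypothetical) JSON IDL."""
    idl = json.loads(idl_text)
    lines = ["# Generated file; do not edit by hand."]
    seen = {}
    for method in idl["methods"]:
        name, method_id = method["name"], method["id"]
        # Two methods sharing an id would collide on the wire, so fail early.
        if method_id in seen:
            raise ValueError(
                f"method id {method_id} used by {seen[method_id]} and {name}"
            )
        seen[method_id] = name
        lines.append(f"{name.upper()} = {method_id}")
    return "\n".join(lines) + "\n"
```

Running the same IDL through one generator per target language is what keeps method IDs identical across SDKs; the duplicate-id check shows the kind of validation that fits naturally at this stage.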
