An agent DAG is a directed acyclic graph describing how the gateway
should drive an AI conversation: which model to call, which tools it
can use, which fallback prompts to play, and which conditions trigger
hand-off to a human. Trunks bind to a DAG by agent_id; calls that
land on the trunk inherit it.
DAG shape
A DAG is a JSON document. The gateway treats the body as opaque on the
wire (it’s a string); validation happens server-side after decode.
The shape is:
{
  "id": "support-bot-v3",
  "version": 7,
  "entry": "greet",
  "nodes": {
    "greet": {
      "type": "tts",
      "args": { "text": "Hello, how can I help?" },
      "next": "listen"
    },
    "listen": {
      "type": "stt",
      "args": { "model": "deepgram-nova-2" },
      "next": "decide"
    },
    "decide": {
      "type": "llm",
      "args": { "model": "gpt-4.1", "tools": ["lookup_order"] },
      "branches": {
        "tool:lookup_order": "do_lookup",
        "intent:human": "transfer",
        "*": "listen"
      }
    },
    "do_lookup": { "type": "tool", "args": { "name": "lookup_order" }, "next": "decide" },
    "transfer": { "type": "transfer", "args": { "trunk_id": "support-queue" } }
  }
}
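Before publishing, it can help to confirm that every edge in the document points at a node that exists. The following is a minimal client-side sketch, not the gateway's validator: the function name `check_dag_references` is hypothetical, and it only assumes the `entry`, `nodes`, `next`, and `branches` keys shown in the example above.

```python
import json


def check_dag_references(dag_json: str) -> list:
    """Return a list of dangling references; an empty list means all edges resolve."""
    dag = json.loads(dag_json)
    nodes = dag.get("nodes", {})
    problems = []

    # The entry point must name an existing node.
    if dag.get("entry") not in nodes:
        problems.append(f"entry {dag.get('entry')!r} is not a node")

    for name, node in nodes.items():
        # Collect the single "next" target and every per-branch target.
        targets = []
        if "next" in node:
            targets.append(node["next"])
        targets.extend(node.get("branches", {}).values())
        for target in targets:
            if target not in nodes:
                problems.append(f"{name}: target {target!r} is not a node")
    return problems


example = json.dumps({
    "id": "support-bot-v3",
    "version": 7,
    "entry": "greet",
    "nodes": {
        "greet": {"type": "tts", "next": "listen"},
        "listen": {"type": "stt", "next": "decide"},
        "decide": {"type": "llm", "branches": {"intent:human": "transfer", "*": "listen"}},
        "transfer": {"type": "transfer"},
    },
})
print(check_dag_references(example))  # prints []
```

The server still performs its own validation after decode; a check like this only catches typos in node names earlier.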
Node node_type values the runtime recognises today:
| Group | Values | Notes |
|---|---|---|
| Conversational | ASR, LLM, TTS, PUSH_AUDIO, REALTIME | Default cascaded pipeline is ASR → LLM → TTS → PUSH_AUDIO. REALTIME collapses ASR+LLM+TTS into one streaming provider session (OpenAI Realtime today). |
| IVR (PR3) | GREETING, GATHER, MENU | Branching nodes — MENU and GATHER set per-branch downstreams instead of a single next_node. |
| Control | TRANSFER, HANGUP | Terminal control nodes; flow leaves the DAG after these. |
Node types are case-sensitive (uppercase). The control-plane API’s
PipelineNodeConfig enum lives in
agent-config.ts.
New types are additive and ignored by older runtimes (forward-compat).
Conversational nodes
| Field | Type | Notes |
|---|---|---|
| provider | string | Provider key. ASR: deepgram/whisper. LLM: openai/anthropic/gemini/ollama. TTS: deepgram/elevenlabs. REALTIME: openai. |
| model | string | Freeform model identifier passed through to the provider (e.g. claude-sonnet-4-6, gpt-4o-realtime-preview). |
| system_prompt | string | LLM nodes only. |
| voice | string | TTS / REALTIME nodes. |
| temperature | number | LLM / REALTIME. |
| language | string | ISO 639-1 hint. REALTIME nodes get a hard directive prepended to the system prompt so the model doesn’t mirror the caller’s language. |
| next_node | string | Where flow goes after this node completes. |
IVR nodes
GREETING plays an opening message; GATHER waits for DTMF input;
MENU routes to one of N per-digit downstream nodes (think
“press 1 for sales”).
An llm node’s args.tools is a list of tool names. Names resolve
through the runtime’s ToolRegistry — the LLM sees the tool’s name,
description, and JSON schema; the runtime executes it and feeds the
result back as a tool_result.
Every session running through ClutchCall’s agent runtime
automatically advertises these to the LLM:
| Name | Args | What it does |
|---|---|---|
| transfer_call | destination (E.164 / SIP URI), caller_id? | RFC 3515 SIP REFER. Subject to per-trunk realm policy (see below); denied transfers return a structured transfer_denied envelope instead of dispatching. |
| hold_call | none | sendrecv → sendonly re-INVITE. |
| unhold_call | none | Lifts a previous hold_call. Idempotent on a non-held call. |
| disconnect_call | reason? | Wire-level BYE, distinct from the model saying “goodbye”. reason lands in the CDR. |
| send_dtmf | digit, duration_ms? (40–2000, default 100), mode? (rfc2833 or inband, default rfc2833) | DTMF for IVR navigation post-transfer. |
| request_supervisor | reason (required), urgency? (low, medium, or high; default medium) | Flags the call for human attention without dispatching audio; operators decide whether to actually join via the portal. Pair with transfer_call to actually route. |
| get_current_time | none | ISO-8601 UTC timestamp. |
To hide a subset per-tenant, write the names to
clutchcall:tenant:<tenant_id>:telephony_disabled_tools as a
JSON array. Disabled tools are filtered out before the toolset reaches
the model, and re-checked defensively at invoke time so a stale provider
session can’t bypass the gate.
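The two-stage gate described above (filter at advertise time, re-check at invoke time) can be sketched as follows. This is an illustration only: the helper names and the `tool_disabled` error envelope are hypothetical, and the raw value is assumed to be the JSON array read from the Redis key (the Redis read itself is left out).

```python
import json
from typing import Optional

# Names of the implicit toolset listed in the table above.
IMPLICIT_TOOLS = [
    "transfer_call", "hold_call", "unhold_call", "disconnect_call",
    "send_dtmf", "request_supervisor", "get_current_time",
]


def disabled_tools_key(tenant_id: str) -> str:
    """Build the Redis key that holds the tenant's disabled-tool list."""
    return f"clutchcall:tenant:{tenant_id}:telephony_disabled_tools"


def parse_disabled(raw: Optional[str]) -> set:
    """Decode the JSON array stored at the key; a missing key disables nothing."""
    return set(json.loads(raw)) if raw else set()


def advertised_tools(disabled: set) -> list:
    """Stage 1: drop disabled names before the toolset reaches the model."""
    return [name for name in IMPLICIT_TOOLS if name not in disabled]


def invoke(name: str, args: dict, disabled: set, handlers: dict):
    """Stage 2: re-check at invoke time in case a stale session still calls the tool."""
    if name in disabled:
        # Hypothetical error shape; the source does not specify one.
        return {"error": "tool_disabled", "name": name}
    return handlers[name](**args)


disabled = parse_disabled('["transfer_call", "send_dtmf"]')
print(disabled_tools_key("acme"))
print(advertised_tools(disabled))
```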
Realm policy on transfer_call. Each trunk carries a list of
transfer rules cached in
clutchcall:trunk:<trunk_id>:transfer_rules. The runtime walks
them on every transfer attempt; the first match wins.
When no rule matches:
- Same-realm transfers are allowed (reason="same_realm_default").
- Cross-realm transfers are soft-allowed with a warning
  (reason="cross_realm_default_warn"): the transfer is logged and recorded
  in the CDR, but it still proceeds. To enforce strict default-deny, write an
  explicit catch-all deny rule. The soft default exists so operators
  who haven’t authored rules yet aren’t locked out of every cross-realm
  transfer.
A rule that matches with action: "deny" returns a structured
envelope to the LLM:
{
  "error": "transfer_denied",
  "reason": "cross-realm transfer requires explicit allow rule",
  "destination": "+15551234567",
  "source_realm": "internal",
  "dest_realm": "external",
  "matched_priority": -1
}
The envelope lets the model fall back, pick a different destination, or explain
the limitation to the caller. The downstream gateway never sees the
denied transfer.
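The first-match-wins walk and the two no-match defaults can be sketched as below. The rule shape is not specified in this document, so everything about `Rule` is an assumption made for illustration: the field names, the `"*"` wildcard, the ascending-priority ordering, and the `rule_allow` / `rule_deny` reason strings. Only the two default reason strings come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    # Hypothetical rule shape; the real cached format may differ.
    priority: int
    source_realm: str  # "*" matches any realm (assumed wildcard)
    dest_realm: str    # "*" matches any realm (assumed wildcard)
    action: str        # "allow" or "deny"


def evaluate_transfer(rules: List[Rule], source_realm: str, dest_realm: str) -> dict:
    """Walk the rules in priority order; the first matching rule decides."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.source_realm in ("*", source_realm) and rule.dest_realm in ("*", dest_realm):
            return {
                "allowed": rule.action == "allow",
                "reason": f"rule_{rule.action}",  # hypothetical reason string
                "matched_priority": rule.priority,
                "warn": False,
            }
    # No rule matched: fall back to the documented defaults.
    if source_realm == dest_realm:
        return {"allowed": True, "reason": "same_realm_default",
                "matched_priority": None, "warn": False}
    return {"allowed": True, "reason": "cross_realm_default_warn",
            "matched_priority": None, "warn": True}


# With no rules, a cross-realm transfer is soft-allowed with a warning.
print(evaluate_transfer([], "internal", "external"))

# An explicit catch-all deny turns the soft default into strict default-deny.
strict = [Rule(priority=1000, source_realm="*", dest_realm="*", action="deny")]
print(evaluate_transfer(strict, "internal", "external"))
```

Under these assumptions, more specific allow rules would need a lower priority number than the catch-all so that they match first.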
Operators add per-agent tools through the portal without rebuilding the
runtime. Tools are persisted in Redis at agent:<agent_id>:config.tools[]
and hydrated on the first audio frame of each new session. They share
the per-session ToolRegistry with the implicit toolset above —
collisions resolve in favour of the operator-defined tool, so a
transfer_call declared on the agent overrides the built-in.
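The collision rule amounts to a two-layer merge in which the operator layer is applied last. A minimal sketch, assuming each layer is a plain mapping from tool name to tool definition (the function name `build_session_registry` and the dictionary layout are hypothetical, not the runtime's ToolRegistry API):

```python
def build_session_registry(builtin: dict, operator_defined: dict) -> dict:
    """Merge the two layers; operator-defined tools win on a name collision."""
    registry = dict(builtin)           # start from the implicit toolset
    registry.update(operator_defined)  # later entries overwrite earlier ones
    return registry


builtin = {"transfer_call": {"source": "builtin"}, "hold_call": {"source": "builtin"}}
operator = {"transfer_call": {"source": "operator", "kind": "http"}}
registry = build_session_registry(builtin, operator)
print(registry["transfer_call"]["source"])  # prints operator
```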
| Kind | Status | What it is |
|---|---|---|
| http | shipped | Calls an arbitrary HTTP(S) endpoint. URL, headers, and body templates support {{argument}} substitution from the LLM’s tool arguments. Auth: bearer, basic, or custom header. 10s default timeout. |
| mcp | shipped | JSON-RPC 2.0 tools/call against a remote MCP server over HTTP. Each tool is declared explicitly; tools/list auto-discovery is not yet wired. Same auth shapes as http. |
| client | scaffolded | The tool is advertised to the LLM and the call is parsed, but invoke() currently returns a not_yet_plumbed error. End-to-end dispatch to an SDK-side handler lands with the QaSupervisor → MoQ migration. |
Per-tool spec shape (one entry of tools[]):
{
  "kind": "http",
  "name": "lookup_order",
  "description": "Look up an order by its ID.",
  "silent": false,
  "spec": {
    "method": "GET",
    "url": "https://api.example.com/orders/{{order_id}}",
    "headers": { "X-Tenant": "{{tenant_id}}" },
    "auth": { "type": "bearer", "token": "..." },
    "parameters": {
      "type": "object",
      "properties": {
        "order_id": { "type": "string", "description": "Order id to look up." }
      },
      "required": ["order_id"]
    },
    "timeout_ms": 10000
  }
}
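To make the {{argument}} substitution concrete, here is a rough sketch of how a template such as the `url` above could be rendered from the LLM's tool arguments. It is not the runtime's implementation: the `render_template` helper is hypothetical, and both the URL-encoding of values and the error on a missing placeholder are assumptions, since the document does not say how the runtime handles either case.

```python
import re
from urllib.parse import quote

# Matches {{name}} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, arguments: dict, url_encode: bool = False) -> str:
    """Replace each {{name}} placeholder with the matching tool argument."""
    def substitute(match):
        key = match.group(1)
        if key not in arguments:
            # Assumed behaviour: fail loudly rather than send a half-rendered request.
            raise KeyError(f"no value for placeholder {key!r}")
        value = str(arguments[key])
        # Percent-encode values placed in a URL so they cannot alter its structure.
        return quote(value, safe="") if url_encode else value

    return _PLACEHOLDER.sub(substitute, template)


url = render_template(
    "https://api.example.com/orders/{{order_id}}",
    {"order_id": "A 17"},
    url_encode=True,
)
print(url)  # prints https://api.example.com/orders/A%2017
```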
HTTP and MCP tool invocations are synchronous within the LLM turn, so a slow
upstream eats the turn budget. Keep individual tools fast (sub-second is
ideal), or chain multiple LLM turns rather than cascading tool calls
inside one turn.
For tools that need privileged access to the runtime (realm policy
evaluation, telephony RPC dispatch, recording control, supervisor
signalling), subclass host::core::Tool and register it with register_tool() in
main.cc. This path is for platform developers; tenants and operators
should use the http / mcp kinds above. See the
clutchcall-tool-calling skill for the C++ shape and the
register_tool call site.
When you bridge the call yourself (default_app=ANSWER), you handle
tool calling directly via the LLM provider’s wire format — for OpenAI
Realtime that’s a tools array in session.update, with
response.function_call_arguments.done events flowing back. The
clutchcall-tool-calling skill walks through the round-trip.
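As a rough illustration of that round-trip, the sketch below builds the JSON messages involved. The helper names are hypothetical, and the event field names reflect the OpenAI Realtime API as commonly documented (`session.update` with a `tools` array, `call_id` on the done event, a `function_call_output` conversation item followed by `response.create`); check them against the provider's current reference before relying on them. Sending the messages over the WebSocket is left out.

```python
import json


def build_session_update(tools: list) -> str:
    """Advertise function tools to the model via session.update."""
    return json.dumps({
        "type": "session.update",
        "session": {
            "tools": [
                {
                    "type": "function",
                    "name": tool["name"],
                    "description": tool["description"],
                    "parameters": tool["parameters"],  # JSON schema for the arguments
                }
                for tool in tools
            ],
            "tool_choice": "auto",
        },
    })


def build_tool_result_events(done_event: dict, result: dict) -> list:
    """Answer a response.function_call_arguments.done event with the tool output."""
    output_item = {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": done_event["call_id"],  # ties the output to the model's call
            "output": json.dumps(result),
        },
    }
    # Ask the model to continue now that the tool result is in the conversation.
    return [json.dumps(output_item), json.dumps({"type": "response.create"})]


lookup_order = {
    "name": "lookup_order",
    "description": "Look up an order by its ID.",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}
print(build_session_update([lookup_order]))
```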
PublishAgentDag
Request (PublishAgentDagRequest):
| Field | Type | Notes |
|---|---|---|
| admin_token | string | |
| tenant_id | string | |
| agent_id | string | Stable id; matches AddTrunkRequest.agent_id. |
| dag_json | string | The DAG document above. Validated server-side. |
| comment | string | Free-form changelog; surfaces in ListAgentDags. |
Response (PublishAgentDagResponse):
| Field | Type | Notes |
|---|---|---|
| status | string | "ok" or "error". |
| error_message | string | |
| version | int32 | Auto-incremented; version=1 for the first publish. |
| published_at_ms | int64 | |
Each publish creates a new version; the gateway keeps the last 50 by
default. New calls bind to the latest version at the moment of dial.
GetAgentDag
| Field | Type | Notes |
|---|---|---|
| admin_token | string | |
| tenant_id | string | |
| agent_id | string | |
| version | int32 | 0 = latest. >0 = specific historical version. |
Response (GetAgentDagResponse):
| Field | Type | Notes |
|---|---|---|
| found | bool | |
| agent_id | string | |
| version | int32 | Resolved version (in case you asked for 0). |
| dag_json | string | The DAG document. |
| published_at_ms | int64 | |
| comment | string | |
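The versioning rules for PublishAgentDag and GetAgentDag (auto-increment from 1, a bounded history, and version 0 resolving to the latest) can be modelled with a small in-memory store. This is a toy sketch for reasoning about the semantics, not the gateway's storage layer; the `DagStore` class and its method names are hypothetical.

```python
class DagStore:
    """Toy model of per-agent DAG version history."""

    def __init__(self, keep: int = 50):
        self.keep = keep    # how many versions to retain per agent (must be > 0)
        self.history = {}   # agent_id -> list of (version, dag_json), oldest first
        self.counters = {}  # agent_id -> last version number handed out

    def publish(self, agent_id: str, dag_json: str) -> int:
        """Store a new version and return its number (1 for the first publish)."""
        version = self.counters.get(agent_id, 0) + 1
        self.counters[agent_id] = version
        versions = self.history.setdefault(agent_id, [])
        versions.append((version, dag_json))
        # Trim from the front so only the newest `keep` versions remain.
        del versions[:-self.keep]
        return version

    def get(self, agent_id: str, version: int = 0) -> dict:
        """version=0 resolves to the latest; >0 asks for a specific one."""
        for stored_version, dag_json in reversed(self.history.get(agent_id, [])):
            if version == 0 or stored_version == version:
                return {"found": True, "version": stored_version, "dag_json": dag_json}
        return {"found": False}


store = DagStore(keep=2)
for body in ('{"v": "a"}', '{"v": "b"}', '{"v": "c"}'):
    store.publish("support-bot", body)
print(store.get("support-bot"))             # latest, version 3
print(store.get("support-bot", version=1))  # trimmed, so found is False
```

The response's `version` field matters for exactly the case shown in the first call: the caller asked for 0 and learns which concrete version it received.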
ListAgentDags
| Field | Type |
|---|---|
| admin_token | string |
| tenant_id | string |
Response (ListAgentDagsResponse):
| Field | Type | Notes |
|---|---|---|
| agents | vector<AgentSummary> | Latest version per agent_id. |
AgentSummary:
| Field | Type |
|---|---|
| agent_id | string |
| latest_version | int32 |
| published_at_ms | int64 |
| comment | string |
DeleteAgentDag
| Field | Type | Notes |
|---|---|---|
| admin_token | string | |
| tenant_id | string | |
| agent_id | string | |
| purge_history | bool | true = also drop version history; false = keep it. |
Returns Empty. Trunks pointing at the deleted agent_id will start
failing inbound HANDLE_AI routing with ERR_INVALID_DESTINATION until
you either rebind them to another agent or republish.