# Telephony Concepts

> Industry vocabulary used throughout this documentation.

Brand-neutral reference for the telephony terminology that appears
across the rest of the docs. The "Where this maps to ClutchCall"
section near the end links each concept area to the page that
implements it.

## Networks and origination

| Term | What it means |
| ---- | ------------- |
| **PSTN** | Public Switched Telephone Network — the global circuit-switched / IP voice network that terminates real phone numbers. |
| **VoIP** | Voice over IP — voice carried as packets over data networks instead of circuits. |
| **SIP** | Session Initiation Protocol (RFC 3261) — the signaling layer that sets up, modifies, and tears down voice sessions. |
| **RTP** | Real-time Transport Protocol — the media layer that actually carries the audio packets after SIP has set up the call. |
| **SRTP** | Secure RTP (RFC 3711). RTP with encryption and message authentication. Keys are exchanged either in the SDP (SDES) or through a DTLS-SRTP handshake on the media path. WebRTC mandates DTLS-SRTP. |
| **DTLS-SRTP** | Datagram TLS used to negotiate SRTP keys. The standard WebRTC media-encryption handshake. |
| **ICE / STUN / TURN** | NAT-traversal toolkit for peer-to-peer media. ICE picks the candidate path; STUN discovers public addresses; TURN relays media when peer-to-peer is blocked. WebRTC uses all three. |
| **SDP** | Session Description Protocol — the offer/answer body inside SIP/WebRTC that says "I speak Opus and PCMU on UDP/16384". |
| **DID** | Direct Inward Dial — a phone number that routes inbound calls into a specific extension or trunk. "Buy a DID" = rent a phone number. |
| **DDI** | Direct Dial-In. The name used for DID in the UK and much of Europe. |
| **Toll-free number (TFN)** | 800/888/etc numbers; the **callee** pays the per-minute cost. |
| **E.164** | The international number format: `+<country><number>`, max 15 digits. Always quote and store numbers in E.164. |
| **CNAM** | Caller-ID Name. The string the receiving phone displays. Distinct from the calling number. |
| **CLI / Caller-ID** | The number presented to the called party. Subject to regulatory rules (US STIR/SHAKEN, EU caller-ID spoofing laws). |
| **ANI** | Automatic Number Identification. The calling party's *billing* number, delivered to the receiving side independently of caller-ID. Used by 911 and toll-free services. |
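
The E.164 rule above is easy to check mechanically. The following is a minimal sketch, not a full number-parsing library: `is_e164` tests syntax only, and `normalize_nanp` is a hypothetical helper that assumes the input is a North American number.

```python
import re

# E.164 syntax: "+" followed by at most 15 digits, the first of which is
# non-zero (country codes never start with 0).
E164_RE = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True if `number` is syntactically valid E.164."""
    return E164_RE.fullmatch(number) is not None


def normalize_nanp(raw: str) -> str:
    """Hypothetical helper: normalise a North American number to E.164.

    Strips punctuation, then accepts 10 digits, or 11 digits starting with 1.
    """
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) == 10:
        digits = "1" + digits
    if len(digits) != 11 or not digits.startswith("1"):
        raise ValueError(f"not a NANP number: {raw!r}")
    return "+" + digits
```

A syntactic check does not prove the number is assigned or dialable; that needs a numbering-plan lookup.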

## Trunks and carriers

| Term | What it means |
| ---- | ------------- |
| **SIP trunk** | A logical connection between a voice platform and a carrier (or PBX). Has channels, codecs, auth credentials, and an IP allowlist. |
| **CPS** | Calls Per Second — pacing limit a trunk accepts before throttling. Carriers price by CPS. |
| **Channels** | Concurrent in-flight calls a trunk can hold. `CPS × average call duration (s)` ≈ steady-state channel usage. |
| **SBC** | Session Border Controller — a hardened SIP/RTP proxy at the edge of a network. Handles NAT, security, transcoding. |
| **PBX** | Private Branch eXchange — an enterprise call switch. Sits between phones and trunks. |
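| **Origination** | Placing a call from your platform toward the PSTN (your platform → PSTN). Carrier rate sheets often use the same word for inbound DID traffic, so check which direction is meant. |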
| **Termination** | The carrier's term for "delivering" your outbound call to its destination. Confusingly, also the call's hangup. |
| **Number portability (LNP / MNP)** | Ability to keep a phone number when switching carriers. Adds a routing lookup before the call leaves your network. |
| **DTMF** | Dual-Tone Multi-Frequency. The 0–9, *, # keypad tones (plus the rarely used A–D). Three signaling modes: in-band (audible), RFC 2833 / RFC 4733 (RTP telephone-events), or SIP INFO. RTP events are preferred wherever possible. |
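
The channel estimate in the table is Little's law: mean concurrency equals arrival rate times mean holding time. A minimal sketch of the arithmetic follows; the `headroom` margin is an illustrative assumption, not a platform setting.

```python
import math


def steady_state_channels(cps: float, avg_call_seconds: float) -> float:
    """Little's law: mean concurrent calls = calls per second x mean call duration."""
    return cps * avg_call_seconds


def channels_with_headroom(cps: float, avg_call_seconds: float,
                           headroom: float = 0.25) -> int:
    """Round up after adding a safety margin (assumed 25 % by default)."""
    return math.ceil(steady_state_channels(cps, avg_call_seconds) * (1 + headroom))
```

For example, 5 CPS with 90-second calls settles at about 450 concurrent channels, so a trunk sized for 450 has no room for bursts.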

## Codecs and media

| Term | What it means |
| ---- | ------------- |
| **G.711 µ-law (PCMU)** | 8 kHz, 64 kbit/s codec. North-American PSTN standard. Lossy log compression of 14-bit linear samples. |
| **G.711 A-law (PCMA)** | Same as µ-law, different log curve. EU/ROW PSTN standard. |
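| **Opus** | Modern codec spanning narrowband to fullband (8–48 kHz sampling), 6 kbit/s – 510 kbit/s. Mandatory to implement in WebRTC and the usual default there. Noticeably better than G.711 at comparable or lower bitrates. |
| **G.722** | 16 kHz wideband codec at 64 kbit/s. A common "HD voice" codec on IP desk phones and SIP trunks. |
| **AMR / AMR-WB** | Mobile-network codecs; AMR-WB (G.722.2) is what mobile operators market as "HD voice". Rare on PSTN handoff. |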
| **Linear16 / PCM16** | Uncompressed 16-bit PCM. What ML/AI services usually want. |
| **Transcoding** | Converting one codec to another mid-call. Costs CPU; introduces ≥ 20 ms of latency. |
| **Jitter** | Variation in packet inter-arrival time. Measured in ms. Receivers always need some de-jitter buffering; sustained jitter above roughly 30 ms typically starts to hurt call quality. |
| **Jitter buffer** | An adaptive queue that smooths jitter at the cost of latency. Bigger buffer = better audio, more delay. |
| **Packet loss** | % of RTP packets that never arrive. Above roughly 1 % is usually audible; beyond a few percent, speech degrades quickly unless the codec's PLC or FEC compensates. |
| **PLC** | Packet Loss Concealment — synthesizes plausible audio to mask a missing packet. Most modern codecs include it. |
| **MOS** | Mean Opinion Score, 1.0–5.0. Subjective quality. ≥ 4.0 = "toll quality". |
| **R-factor** | Objective ITU E-Model score 0–100. Drives an estimated MOS. |
| **VAD** | Voice Activity Detection. Suppresses silence frames or triggers barge-in. |
| **AEC / AGC / NS** | Acoustic Echo Cancellation / Automatic Gain Control / Noise Suppression. The three pillars of the WebRTC audio-processing pipeline. |
| **Comfort noise** | Synthetic background hiss inserted during silence to keep the line from sounding "dead". |
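
Jitter is usually reported with the running estimator defined in RFC 3550 (section 6.4.1): each packet's change in transit time is folded into a smoothed average with gain 1/16. The class below is a sketch of that estimator; the class name and the millisecond interface are illustrative choices, and RTP timestamp wraparound is deliberately ignored.

```python
class JitterEstimator:
    """Interarrival jitter per RFC 3550, reported in milliseconds."""

    def __init__(self, clock_rate_hz: int = 8000):
        self.clock_rate_hz = clock_rate_hz  # 8000 for G.711, 48000 for Opus
        self._prev_transit_ms = None
        self.jitter_ms = 0.0

    def update(self, arrival_ms: float, rtp_timestamp: int) -> float:
        """Feed one packet: local arrival time (ms) and its RTP timestamp."""
        # Relative transit time: arrival minus the sender's media clock.
        send_ms = rtp_timestamp * 1000.0 / self.clock_rate_hz
        transit_ms = arrival_ms - send_ms
        if self._prev_transit_ms is not None:
            d = abs(transit_ms - self._prev_transit_ms)
            # J = J + (|D| - J) / 16
            self.jitter_ms += (d - self.jitter_ms) / 16.0
        self._prev_transit_ms = transit_ms
        return self.jitter_ms
```

Evenly spaced packets leave the estimate at zero; a single packet arriving 16 ms late moves it to 1 ms, which shows how slowly the smoothed figure reacts to isolated spikes.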

## Transports — how the audio actually travels

The four transports any modern voice platform has to know:

| Transport | What it is | Where it's used |
| --------- | ---------- | --------------- |
| **SIP + RTP** | Classic VoIP. Signaling on SIP (TCP/UDP/TLS), media on RTP/UDP. The carrier-facing protocol. | Carrier ↔ gateway trunks. |
| **WebRTC** | Browser-native voice/video. SDP exchanged over application-defined signaling (commonly WebSocket), **DTLS-SRTP** media over UDP, with **ICE** for NAT traversal. Encryption is mandatory and media is peer-to-peer by default. | Browser softphones, click-to-dial, agent dashboards, embedded contact-centre widgets, integrations with LiveKit / Daily / Chime / Twilio Media Streams / Vapi. |
| **WebTransport** | Browser API for bidirectional streams and datagrams over HTTP/3. Runs on QUIC, but is client-server only: no peer-to-peer, no ICE. | Browser apps that want a low-overhead control plane multiplexed with media on the same QUIC connection. |
| **QUIC** | UDP-based encrypted multiplexed transport. Native SDKs use it directly to talk to the gateway. | Server-side Python / Go / Rust / Java / .NET workloads. |

WebRTC vs SIP — the rule of thumb:

- **SIP/RTP** for trunk-side (carrier ↔ gateway). The PSTN doesn't speak WebRTC.
- **WebRTC** for last-mile to humans in browsers or vendor SDKs (LiveKit, Daily, Chime, Vapi, Twilio Media Streams).
- **QUIC / WebTransport** for server-side or browser control workloads where the audio source is already-decoded PCM (e.g. a TTS pipeline) and you want minimum framing overhead.

## Call lifecycle and patterns

| Term | What it means |
| ---- | ------------- |
| **A-leg / B-leg** | A-leg = caller side; B-leg = callee side. A *bridge* connects them. |
| **Ringback tone** | The "ring ring" the caller hears before answer. Generated locally OR by the far end (early media). |
| **Early media** | Audio sent *before* the SIP `200 OK`. Used for "the number you have dialed…" intercepts. |
| **Bridge** | Connect two legs so audio flows both ways. |
| **Park** | Hold a call without bridging — keep the line open with music or silence. |
| **Music on hold (MOH)** | Audio played to a parked call. Configurable via URL or stream. |
| **Transfer (blind / attended)** | Industry terms for handing a call to another destination. *Not implemented* — out of scope on this platform. |
| **Conference** | Industry term for mixing three or more legs into one audio stream. *Not implemented* — out of scope on this platform. |
| **Whisper / Coach** | Industry terms for one-way supervisor audio. *Not implemented* — the platform's `Barge` is unrelated; see the next row. |
| **Barge (AI barge-in)** | Lets a human caller interrupt an AI agent mid-sentence. While the AI agent is speaking, the gateway listens for the caller's voice on the inbound leg; the moment it crosses the configured energy / VAD / semantic threshold, the AI's outbound audio is gated and the human is heard immediately. Without it, AI voicebots feel deaf: the human has to wait for the AI to finish before being acknowledged. Configured per-call via `auto_barge_in`, `barge_in_patience_ms`, and per-trunk `auto_bargein_mode` (`energy` / `vad` / `semantic` / `off`). |
| **Barge-in patience** | Grace window in milliseconds between detecting the human's voice and gating the AI. Too short = AI stops on a cough; too long = AI talks over the caller. Typical: 150–350 ms. |
| **IVR** | Interactive Voice Response — menus driven by DTMF or speech. |
| **ACD** | Automatic Call Distribution — routing rules that send inbound calls to queues, agents, skill groups. |
| **AMD** | Answering-Machine Detection — heuristic classifier that says "human" vs "voicemail" within the first ~2 s of audio. |
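
The interaction between detection and patience can be sketched as a small state machine. This is an illustration only, not the gateway's implementation: `BargeInGate` is a hypothetical class, it assumes fixed 20 ms frames, and it interprets patience as "caller speech must persist continuously for this long before the AI is gated".

```python
class BargeInGate:
    """Gate AI audio once caller speech has lasted `patience_ms` without a gap."""

    def __init__(self, patience_ms: int = 250, frame_ms: int = 20):
        self.patience_ms = patience_ms
        self.frame_ms = frame_ms
        self._speech_ms = 0
        self.ai_gated = False

    def on_frame(self, caller_is_speaking: bool) -> bool:
        """Feed one inbound frame's speech decision; return True once the AI is gated.

        `caller_is_speaking` would come from an energy, VAD, or semantic detector.
        """
        if caller_is_speaking:
            self._speech_ms += self.frame_ms
        else:
            self._speech_ms = 0  # a cough or click shorter than the window resets
        if self._speech_ms >= self.patience_ms:
            self.ai_gated = True
        return self.ai_gated
```

With `patience_ms=100`, four consecutive speech frames (80 ms) do not gate the AI, while a fifth does. That is the trade-off the table describes: a shorter window reacts to coughs, a longer one lets the AI talk over the caller.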

## Routing and dialplan

| Term | What it means |
| ---- | ------------- |
| **Dialplan** | The rules engine that decides what happens to a call: hang up, park, play, bridge, hand to AI. |
| **LCR** | Least-Cost Routing — pick the cheapest carrier per destination prefix. |
| **Failover routing** | Try carrier A; on `503 Service Unavailable`, try carrier B. Trip a circuit breaker after N consecutive failures. |
| **Geo-routing** | Choose a regional gateway based on caller location, latency, or compliance. |
| **Allowlist / Blocklist** | Static allow/deny lists per trunk. Often required by carriers for outbound caller-ID. |
| **Inbound rule** | What to do when a PSTN call arrives at one of your DIDs: REJECT, PLAY-AND-HANGUP, NOTIFY-AND-HANGUP, or HANDLE-AI. |
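
LCR and failover routing combine naturally: rank carriers by rate for the dialed prefix, try them in order, and skip any carrier whose breaker has tripped. The sketch below assumes a made-up `Route` record and a caller-supplied `dial` function that returns a SIP status code; none of these names come from the platform. A production breaker would also reset after a cool-down period, which is omitted here.

```python
from dataclasses import dataclass


@dataclass
class Route:
    carrier: str
    prefix: str          # destination prefix, digits only, e.g. "4420"
    rate_per_min: float


class CircuitBreaker:
    """Opens for a carrier after `max_failures` consecutive failures."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures: dict[str, int] = {}

    def is_open(self, carrier: str) -> bool:
        return self.failures.get(carrier, 0) >= self.max_failures

    def record(self, carrier: str, ok: bool) -> None:
        self.failures[carrier] = 0 if ok else self.failures.get(carrier, 0) + 1


def candidate_routes(number: str, routes: list[Route],
                     breaker: CircuitBreaker) -> list[Route]:
    """Longest matching prefix per healthy carrier, cheapest first."""
    digits = number.lstrip("+")
    best: dict[str, Route] = {}
    for r in routes:
        if not digits.startswith(r.prefix) or breaker.is_open(r.carrier):
            continue
        current = best.get(r.carrier)
        if current is None or len(r.prefix) > len(current.prefix):
            best[r.carrier] = r
    return sorted(best.values(), key=lambda r: r.rate_per_min)


def place_call(number, routes, breaker, dial):
    """Try carriers in LCR order; fail over on 503. Returns the carrier used or None."""
    for route in candidate_routes(number, routes, breaker):
        status = dial(route.carrier, number)
        breaker.record(route.carrier, ok=(status != 503))
        if status != 503:
            return route.carrier
    return None
```

Failing over only on `503` mirrors the table entry; real deployments usually treat a wider set of responses and timeouts as carrier failures.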

## Compliance vocabulary

| Term | What it means |
| ---- | ------------- |
| **STIR/SHAKEN** | US/CA framework for caller-ID attestation. A signed outbound INVITE carries an `Identity` header containing a PASSporT (a JWT) signed by your carrier or platform. The attestation level (A = full, B = partial, C = gateway) controls how trustworthy the receiving carrier marks your call. |
| **TCPA** | US Telephone Consumer Protection Act. Regulates outbound dialing — quiet hours, abandoned-call rate caps, consent requirements. |
| **Do-Not-Call (DNC)** | Federal + state lists. DNC suppression must run before every dial. |
| **GDPR / CCPA** | Privacy regimes that govern call recordings and CDR storage. Affects retention windows. |
| **HIPAA** | US health-info privacy. Healthcare deployments require a BAA + encrypted recording storage. |
| **PCI-DSS** | Card-data privacy. If your IVR collects card numbers, the IVR step must redact DTMF from recordings ("DTMF masking"). |
| **E911 / 112** | Emergency calling with the caller's location attached as PIDF-LO. Generally mandatory for services that let users dial out to the PSTN; exact obligations vary by jurisdiction. |
| **CALEA / Lawful Intercept** | Government wiretap obligations for carriers. Honoured at the upstream carrier, not at the gateway. |
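
The token in a STIR/SHAKEN `Identity` header is a PASSporT (RFC 8225, with the SHAKEN extension from RFC 8588): three base64url segments holding a header, a payload, and a signature. The sketch below only decodes the first two segments for inspection. It does not verify the signature, and the sample values (numbers, certificate URL, `origid`) are invented for illustration.

```python
import base64
import json


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop "=" padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def read_passport(token: str) -> tuple[dict, dict]:
    """Decode (NOT verify) the header and payload of a PASSporT."""
    header_b64, payload_b64, _signature = token.split(".")
    return json.loads(_b64url_decode(header_b64)), json.loads(_b64url_decode(payload_b64))


def make_unsigned_sample() -> str:
    """Build a sample token with a placeholder where the ES256 signature would be."""
    header = {"alg": "ES256", "ppt": "shaken", "typ": "passport",
              "x5u": "https://cert.example.org/passport.cer"}
    payload = {"attest": "A", "dest": {"tn": ["12155550113"]}, "iat": 1443208345,
               "orig": {"tn": "12155550112"},
               "origid": "123e4567-e89b-12d3-a456-426655440000"}
    parts = [_b64url_encode(json.dumps(p, separators=(",", ":")).encode("utf-8"))
             for p in (header, payload)]
    return ".".join(parts + ["c2lnbmF0dXJl"])  # placeholder, not a real signature
```

Reading `payload["attest"]` from the decoded token gives the attestation level (`"A"`, `"B"`, or `"C"`). Trusting it requires verifying the signature against the certificate referenced by `x5u`, which this sketch leaves out.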

## Where this maps to ClutchCall

Where each concept area above is handled on this platform:

| Concept area | Lives in |
| ------------ | -------- |
| Trunk lifecycle, codec preferences, inbound rules | [Admin → Trunks](/admin/trunks) |
| Outbound origination, bulk dialer, hangup, barge | Public RPC — see [Method IDs](/rpc/method-ids) |
| Dialplan actions (park, MOH, playback, bridge) | `ExecuteDialplan` and bucket actions |
| WebRTC vendor adapters (Browser, LiveKit, Daily, Chime, Twilio, Vapi) | [Platform → WebRTC Vendors](/platform/webrtc-vendors) |
| Compliance headers (STIR/SHAKEN, PANI, PIDF-LO) | [Platform → Regulatory](/platform/regulatory) |
| MOS, jitter, packet loss, CDRs | [Platform → Telemetry](/platform/telemetry) |

## The four numbers that bound a voice deployment

Any sizing or capacity discussion comes back to these:

1. **CPS** sustained (per trunk and aggregate).
2. **Concurrent channel** ceiling.
3. **End-to-end latency** (caller mouth → callee ear, in ms).
4. **MOS** under typical packet-loss conditions.

Codecs, dialplan, AI bridges, and WebRTC interop all hang off those
four.
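
The fourth number is usually estimated rather than surveyed: the ITU-T G.107 E-Model produces an R-factor, and a fixed polynomial maps it to an estimated MOS. A sketch of that mapping follows; the function name is an illustrative choice.

```python
def r_to_mos(r: float) -> float:
    """Map an E-Model R-factor (0-100) to an estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
    # The polynomial dips fractionally below 1.0 for very small R; clamp it.
    return max(1.0, mos)
```

An R-factor of 80 maps to a MOS of about 4.02, which lines up with the "toll quality" threshold in the codec table, and the G.107 default of 93.2 maps to about 4.41.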
