Feasibility & Business Briefing · 2025-2026

Can an AI actually call people, and is it a business worth building?

A straight answer on AI outbound phone calling: where the technology genuinely works today, what a call really costs, the legal landmines that can end the business, and how much you can charge.

The honest verdict. Yes, AI can run goal-directed outbound calls today, and the unit economics are excellent (a 3-minute call costs about $0.18 to run and the service resells for $0.75 to $2.00). The technology is not the risk. The risk is legal: as of February 2024 every AI-voice call is a regulated "artificial voice" call under the TCPA, carrying $500 to $1,500 per call in damages with no cap. Build it for structured, consented calls and it is a real, high-margin business. Point it at cold strangers and it is a lawsuit generator.

01

Feasibility: can AI hold a real phone conversation?

Yes, with real caveats. In favorable conditions, AI handles a scripted, goal-directed outbound call well enough to get work done. It cannot yet reliably replicate natural, unstructured human conversation at scale. The gap between a polished vendor demo and a production line handling thousands of calls is 15 to 25 percentage points of task completion.

~200ms: Human turn-taking gap. Above 900ms callers start talking over the agent.
1.4-1.7s: Real production median latency, versus the sub-300ms vendors advertise. p99 runs 3 to 5 seconds.
15-25pp: Task-completion drop from clean demo to real PSTN, accents, and noise.

Where it works in production

Where it still breaks

Architecture note: cascaded speech-to-text then LLM then text-to-speech dominates production in 2026 for debuggability, compliance, and cost. End-to-end speech-to-speech (OpenAI Realtime, Gemini Live) preserves tone but costs roughly 10x more and follows instructions worse.
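The cascaded shape described in the architecture note can be sketched as three stages run in sequence, each timed separately, which is where the debuggability comes from. This is a minimal sketch under stated assumptions: `fake_stt`, `fake_llm`, `fake_tts`, and `run_turn` are hypothetical stand-ins, not any vendor's API, and the only number taken from this briefing is the 900ms talk-over threshold.

```python
import time
from typing import Dict, Tuple

# Threshold quoted in this briefing: above this, callers start talking over the agent.
TALK_OVER_MS = 900


def fake_stt(audio: bytes) -> str:
    """Hypothetical stand-in for a speech-to-text call."""
    return audio.decode("utf-8")


def fake_llm(transcript: str) -> str:
    """Hypothetical stand-in for a text LLM call."""
    return f"Thanks, I heard: {transcript}"


def fake_tts(text: str) -> bytes:
    """Hypothetical stand-in for a text-to-speech call."""
    return text.encode("utf-8")


def run_turn(audio: bytes, stt=fake_stt, llm=fake_llm, tts=fake_tts) -> Tuple[bytes, Dict[str, float]]:
    """Run one cascaded turn and record per-stage latency in milliseconds."""
    timings: Dict[str, float] = {}
    value = audio
    for name, stage in (("stt", stt), ("llm", llm), ("tts", tts)):
        start = time.perf_counter()
        value = stage(value)
        timings[name] = (time.perf_counter() - start) * 1000.0
    # Sum is evaluated before the "total" key is added, so it covers the three stages only.
    timings["total"] = sum(timings.values())
    return value, timings


reply_audio, timings = run_turn(b"yes, Tuesday works")
too_slow = timings["total"] > TALK_OVER_MS
```

Because each stage is a plain function argument, a real provider client can replace a stub without touching the timing logic, and a slow turn can be traced to the stage that caused it.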

02

The stack: build, buy, or hybrid

Three shapes: a managed platform (Vapi, Bland, Retell), a self-assembled pipeline (telephony plus your own speech-to-text, LLM, and text-to-speech), or an open framework like Pipecat that gives you the pipeline plumbing without writing raw audio-socket code.

Option | Shape | Barge-in | Best for
Retell AI | Managed, custom-LLM mode | Built in (~800ms) | Fastest credible MVP. Your server returns each line over a websocket.
Vapi | Managed, swappable parts | Built in (sub-600ms) | Quick ship, tool-calling via webhooks.
Bland.ai | Managed, visual pathways | On by default | No-code flows, bring-your-own Twilio.
Twilio / Telnyx | Telephony + media-stream socket | Your code (VAD) | The transport layer under any self-build. Telnyx is ~30-50% cheaper.
Pipecat | Open framework (BSD 2-Clause) | Built-in VAD processor | Self-host at scale; swap any provider; native tool-call handlers.

Recommended path

MVP: Retell AI in custom-LLM mode. It owns all telephony and audio, fires a websocket event on every turn, and your thin server returns the next line. You get barge-in, warm transfer, analytics, and number provisioning without operating audio infrastructure.

Scale: migrate to Telnyx (telephony) + Pipecat (framework) + Deepgram Nova-3 (speech-to-text) + a text LLM like Claude Haiku + Cartesia Sonic (text-to-speech). All-in around $0.05/min, which is what makes resale at $0.15 to $0.25/min profitable. Keep your LLM logic identical across the move.
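One way to keep the conversation logic identical across the move is to isolate it in a function that knows nothing about the transport. The sketch below is illustrative only: `CallState`, `SCRIPT`, and `next_line` are hypothetical names, the fixed script stands in for an LLM call, and nothing here reflects the actual Retell or Pipecat message schema. The managed-platform websocket handler and the self-hosted pipeline would each call the same function.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CallState:
    """Per-call conversation state, independent of any telephony vendor."""
    step: int = 0
    history: List[Tuple[str, str]] = field(default_factory=list)


# Hypothetical script for an appointment-confirmation call.
SCRIPT = [
    "Hi, this is an automated assistant calling to confirm your appointment. Is now a good time?",
    "Great. Are you still able to make it on Tuesday at 3 pm?",
    "Thank you, you are confirmed. Goodbye.",
]


def next_line(state: CallState, caller_text: str) -> str:
    """Return the agent's next line given what the caller just said.

    A real build would call an LLM here; a fixed script keeps the sketch runnable.
    """
    if caller_text:
        state.history.append(("caller", caller_text))
    lowered = caller_text.lower()
    # Opt-out requests end the call regardless of script position.
    if "stop" in lowered or "do not call" in lowered:
        line = "Understood. We will not call again. Goodbye."
        state.step = len(SCRIPT)
    elif state.step < len(SCRIPT):
        line = SCRIPT[state.step]
        state.step += 1
    else:
        line = "Goodbye."
    state.history.append(("agent", line))
    return line


state = CallState()
opening = next_line(state, "")
follow_up = next_line(state, "Yes, go ahead")
```

The transport layer only has to deliver the caller's transcribed text and speak the returned string, so swapping vendors changes the adapter and leaves `next_line` untouched.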

03

What a call actually costs

Modeled at 150 words per minute of agent speech and a blended 3-minute connected call. Speech-to-text and text-LLM are rounding errors; text-to-speech and the realtime-audio LLM are where the money goes.

Per-minute, by layer. Canonical figures from current published vendor pricing.
Layer | Vendor | $/min
Telephony | Telnyx (US outbound) | $0.008
Telephony | Twilio (US outbound) | $0.014
Speech-to-text | Deepgram Nova-3 | $0.005
LLM (text path) | GPT-4o-mini text | $0.002
LLM (realtime audio) | OpenAI gpt-realtime | $0.10-0.30
Text-to-speech | Cartesia Sonic | $0.034
Text-to-speech | ElevenLabs Flash | $0.045

All-in, by stack. A phone number adds ~$1/month each; a local-presence pool of 50-100 numbers is $50-115/month.
Stack | $/min | 3-min call
Self-assembled budget (Telnyx + Deepgram + GPT-4o-mini + ElevenLabs Flash) | $0.060 | $0.18
Self-assembled premium | $0.111 | $0.33
Managed (Bland Build plan) | $0.120 | $0.36
Managed (Retell, mid) | $0.165 | $0.50
Native realtime audio (Path B, mid) | $0.165 | $0.50
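The budget-stack row can be reproduced by summing the per-layer rates from the table above. The sketch below assumes nothing beyond those published figures; the dictionary keys and function names are hypothetical labels chosen for illustration.

```python
# Per-minute rates in USD, copied from the per-layer table in this section.
RATES = {
    "telnyx": 0.008,
    "twilio": 0.014,
    "deepgram_nova3": 0.005,
    "gpt4o_mini_text": 0.002,
    "cartesia_sonic": 0.034,
    "elevenlabs_flash": 0.045,
}


def stack_cost_per_min(layers):
    """Sum the per-minute rate of every layer in the stack."""
    return round(sum(RATES[name] for name in layers), 3)


def call_cost(layers, minutes=3.0):
    """Cost of one connected call of the given length (default: the blended 3 minutes)."""
    return round(stack_cost_per_min(layers) * minutes, 2)


# The "self-assembled budget" stack from the all-in table.
BUDGET = ["telnyx", "deepgram_nova3", "gpt4o_mini_text", "elevenlabs_flash"]

budget_per_min = stack_cost_per_min(BUDGET)   # 0.06
budget_per_call = call_cost(BUDGET)           # 0.18
```

Text-to-speech accounts for $0.045 of the $0.060 budget total, which is why that layer is the first place to look when trimming cost.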

Getting a number is instant via Twilio (local $1.15/mo, toll-free $2.15/mo) or Telnyx (from $1.00/mo). For cold outreach, calls from local area codes are answered 30-60% more often than calls from toll-free numbers, but carriers scrutinize high volume from a single local line.

04

What you can charge, and the margin

Per-minute pricing is commoditizing fast (managed rates fell from $0.25/min in 2023 to $0.11-0.15 in 2026). The leverage is in outcome pricing, where the customer buys a booked meeting, not a minute.

Service pricing models versus underlying cost.
Model | Market price | Your cost | Gross margin
Per-minute markup | $0.20-0.50/min | $0.06-0.12/min | 60-75%
Per-call | $0.75-2.00/call | $0.18-0.50/call | 40-75%
Per-seat / month | $99-499/mo | by usage | 30-75%
Per-appointment booked | $50-300/appt | $15-40/appt | 80-95%+

The per-appointment math

At a realistic 5% connect rate and 10% book rate, one booked meeting takes ~200 dial attempts (1 / (0.05 x 0.10)), costing roughly $16 on the budget stack (or $30-40 on the premium stack). Sell that meeting at $100-200 and gross margin before overhead is 75-90%. The anchor: a fully-loaded human SDR costs $300-500 per booked meeting, so AI undercuts by 3x to 10x. The catch is conversion risk: if carrier spam filtering drags connect rates down or your list is weak, the margin evaporates fast.
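The per-appointment arithmetic can be checked with a few lines. This is a sketch under one loud assumption: `COST_PER_DIAL` is not a published figure but is back-derived from the paragraph above ($16 over 200 dial attempts on the budget stack). The function names are hypothetical.

```python
def dials_per_meeting(connect_rate: float, book_rate: float) -> float:
    """Expected dial attempts needed to book one meeting."""
    return 1.0 / (connect_rate * book_rate)


def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of the selling price."""
    return (price - cost) / price


# Assumption: blended cost per dial attempt, back-derived as $16 / 200 dials.
COST_PER_DIAL = 0.08

dials = dials_per_meeting(0.05, 0.10)             # about 200 attempts
cost_per_meeting = dials * COST_PER_DIAL          # about $16
margin_at_100 = gross_margin(100.0, cost_per_meeting)

# Sensitivity: halve the connect rate, as spam flagging might.
dials_flagged = dials_per_meeting(0.025, 0.10)    # about 400 attempts
margin_flagged = gross_margin(100.0, dials_flagged * COST_PER_DIAL)
```

At a $100 price the margin is about 84% at a 5% connect rate and about 68% if the connect rate halves to 2.5%, which is the conversion risk in numbers.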

06

B2B reality: the "businesses are exempt" myth

The federal B2B exemption is narrow: a live human, manually dialing a business landline. The moment you use an autodialer, an AI voice, or call a mobile number (which is nearly every business contact today), TCPA applies in full and prior express written consent is required.

2.3-2.5%: Median dial-to-meeting conversion for B2B cold calling. ~40 dials per meeting.
15% vs 25%: AI SDR versus human SDR meeting-to-opportunity conversion: a 40% relative deficit.
50-70%: Annual AI-SDR tool churn, roughly double human SDR turnover. The most honest market signal.

Where AI genuinely fits B2B: list-qualification sweeps, appointment reminders and confirmations, after-hours callback capture, later-touch follow-ups, and automatic CRM logging. Where it fails: replacing the human on a genuinely cold conversation, complex enterprise accounts, and regulated verticals. The teams that win run a hybrid: AI does research, prioritization, dialing, coaching, and logging; humans hold the conversation that matters.

07

Verdict and the MVP path

AI outbound calling is real, commercially deployed, and cheap to run. It is not human-equivalent, and for the hardest conversations it will not be for another 2 to 3 years. The unit economics are genuinely strong; the business risk is almost entirely legal and reputational, not technical.

Do

Build a narrow, consented, structured product

Appointment confirmations, reminders, inbound overflow, after-hours callback, and qualification of opted-in leads. Compliance baked in from call one: consent records, AI self-disclosure, automated opt-out, DNC scrubbing.
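"Compliance baked in from call one" can be read as a gate that every dial must pass before the call is placed. The sketch below is a minimal illustration, not legal advice and not a complete compliance system: `Contact`, `may_dial`, and `AI_DISCLOSURE` are hypothetical names, and the checks mirror only the four items listed above (consent records, AI self-disclosure, opt-out, DNC scrubbing).

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass
class Contact:
    phone: str
    has_written_consent: bool   # a stored consent record exists for this number
    opted_out: bool = False     # the person previously asked not to be called


def may_dial(contact: Contact, dnc_list: Set[str]) -> Tuple[bool, str]:
    """Return (allowed, reason). Every failed check blocks the dial."""
    if contact.opted_out:
        return False, "opted out"
    if contact.phone in dnc_list:
        return False, "on do-not-call list"
    if not contact.has_written_consent:
        return False, "no consent record"
    return True, "ok"


# Hypothetical opening line so the agent discloses that it is an AI up front.
AI_DISCLOSURE = "Hi, this is an automated AI assistant calling on behalf of {business}."

dnc = {"+15550100"}
allowed, reason = may_dial(Contact("+15550123", has_written_consent=True), dnc)
opening_line = AI_DISCLOSURE.format(business="Example Dental")
```

Returning a reason alongside the decision gives an audit trail for every blocked dial, which is the kind of record that matters if a call is later disputed.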

Avoid

Cold AI dialing of strangers

Maximum legal exposure, fastest path to spam-flagging and burned lists, and the worst conversion. This is where the $500-per-call math turns lethal.
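The per-call damages math scales linearly with volume, which a short calculation makes concrete. This sketch uses only figures quoted in this briefing ($500 to $1,500 per call, and $0.18 run cost for a 3-minute call on the budget stack); the function names are hypothetical.

```python
STATUTORY_MIN = 500    # USD per call, lower figure quoted in this briefing
STATUTORY_MAX = 1500   # USD per call, upper figure quoted in this briefing


def tcpa_exposure(calls: int) -> tuple:
    """Return the (low, high) damages range in USD for a number of unconsented calls."""
    return calls * STATUTORY_MIN, calls * STATUTORY_MAX


def exposure_to_run_cost_ratio(cost_per_call: float = 0.18) -> float:
    """How many times the minimum per-call damages exceed the cost of placing the call."""
    return STATUTORY_MIN / cost_per_call


low, high = tcpa_exposure(10_000)
ratio = exposure_to_run_cost_ratio()
```

At 10,000 unconsented calls the quoted figures imply $5 million to $15 million of exposure against roughly $1,800 of run cost, a ratio of more than 2,700 to 1 even at the minimum.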

The concrete first build