Run ElevenLabs TTS without burning characters

ElevenLabs charges per character, and the multiplier changes by model. Agents that pick the wrong model can spend 2x the credits for output the user could not tell apart in a blind test. Voice cloning has hard legal rules and the free tier has no commercial license. This skill teaches your agent the model trade-offs, the SDK shape, the official MCP tools, and the legal gotchas so the user is not surprised by their next invoice.
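The per-character arithmetic can be sketched in a few lines. This is a rough estimate only: the multipliers are copied from the model table in this document, and it assumes billed characters equal `len(text)`, which may not match how the service actually counts. `estimate_credits` is a hypothetical helper, not part of any SDK.

```python
# Credit multipliers per character, copied from the model table in this document.
CREDIT_MULTIPLIER = {
    "eleven_flash_v2_5": 0.5,
    "eleven_turbo_v2_5": 0.5,
    "eleven_multilingual_v2": 1.0,
    "eleven_v3": 1.0,
}


def estimate_credits(text: str, model_id: str) -> float:
    """Estimate credits for `text`, assuming one billed character per Python character."""
    return len(text) * CREDIT_MULTIPLIER[model_id]


script = "x" * 12_000  # stand-in for a 12,000-character script
print(estimate_credits(script, "eleven_flash_v2_5"))       # 6000.0
print(estimate_credits(script, "eleven_multilingual_v2"))  # 12000.0
```

The same script costs twice as much on a 1x model as on a 0.5x model, which is the gap an agent should justify before defaulting to the more expensive option.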

When to use ElevenLabs

Reach for ElevenLabs when the user wants a voiceover, narration, audiobook chapter, or character read; needs real-time TTS for an agent (Flash v2.5); needs a read in one of 70+ languages (v3) or 29 languages (Multilingual v2); wants to clone their own voice or one they have written permission to use; or wants to transcribe audio with word-level timestamps (Scribe).

Do not reach for ElevenLabs when the user wants music with vocals (use Suno), wants to clone a celebrity or third-party voice without consent (legal hard no, see below), or wants lip-sync video (chain to Hedra or HeyGen after generating audio).

Install

pip install elevenlabs            # Python SDK
npm install @elevenlabs/elevenlabs-js   # JS / Node SDK

Official MCP server (Claude Code, Claude Desktop, Cursor, Windsurf, OpenCode):

{
  "mcpServers": {
    "ElevenLabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp"],
      "env": { "ELEVENLABS_API_KEY": "<your-api-key>" }
    }
  }
}

The official server exposes its core audio tools plus a set of agent-platform tools. The default output directory can be overridden.
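For direct SDK use, a minimal text-to-speech call might look like the sketch below. It assumes a recent `elevenlabs` Python SDK in which `client.text_to_speech.convert` returns an iterator of audio byte chunks, and it reads the key from the same `ELEVENLABS_API_KEY` variable used in the MCP config. `synthesize_to_file` is a hypothetical helper name, and the voice ID in the usage note is a placeholder.

```python
import os


def synthesize_to_file(text, voice_id, out_path, model_id="eleven_multilingual_v2"):
    """Synthesize `text` and write the audio to `out_path` (sketch, not a reference implementation)."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from elevenlabs.client import ElevenLabs

    client = ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"])
    audio = client.text_to_speech.convert(
        voice_id=voice_id,
        text=text,
        model_id=model_id,
        output_format="mp3_44100_128",
    )
    # Assumption: `audio` is an iterator of byte chunks, written out as they arrive.
    with open(out_path, "wb") as f:
        for chunk in audio:
            f.write(chunk)
    return out_path
```

A call would look like `synthesize_to_file("Hello there.", "<your-voice-id>", "hello.mp3")`, with a real voice ID substituted. The model is passed explicitly so the cost trade-off is a visible choice in every call.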

Model selection

Model ID	Cost / char	Latency	Max chars	Languages	Use case
`eleven_flash_v2_5`	0.5x	~75 ms	40,000	32	Real-time agents, live apps, bulk narration on a budget
`eleven_turbo_v2_5`	0.5x	~250 ms	40,000	32	Legacy real-time pick, prefer Flash v2.5
`eleven_multilingual_v2`	1x	higher	10,000	29	Audiobook quality, stable reads, default for production VO
`eleven_v3`	1x	standard	5,000	70+	Character dialogue, dramatic delivery, audio tags like `[whispers]`, broadest language coverage
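The table can be turned into a small selection and chunking helper. The sketch below is one possible reading of it: the limits are copied from the "Max chars" column, while the priority order in `pick_model` and the whitespace-based splitting in `split_for_model` are illustrative assumptions, and both function names are hypothetical.

```python
# Per-request character limits, copied from the model table in this document.
MAX_CHARS = {
    "eleven_flash_v2_5": 40_000,
    "eleven_turbo_v2_5": 40_000,
    "eleven_multilingual_v2": 10_000,
    "eleven_v3": 5_000,
}


def pick_model(realtime=False, expressive=False):
    """Pick a model ID; the priority order here is an assumption, not official guidance."""
    if realtime:
        return "eleven_flash_v2_5"  # lowest latency and 0.5x cost
    if expressive:
        return "eleven_v3"  # audio tags, dramatic delivery
    return "eleven_multilingual_v2"  # stable production voiceover


def split_for_model(text, model_id):
    """Split `text` into pieces within the model's limit, breaking on spaces where possible."""
    limit = MAX_CHARS[model_id]
    chunks = []
    rest = text
    while len(rest) > limit:
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: fall back to a hard cut
        chunks.append(rest[:cut])
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks


chapter = "word " * 3000  # 15,000 characters
parts = split_for_model(chapter, pick_model(expressive=True))
print(len(parts), max(len(p) for p in parts))
```

Splitting on plain spaces is the simplest choice; a production version would more likely split on sentence or paragraph boundaries so that prosody does not break mid-sentence.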
