TTS package — ElevenLabs-compatible, Phoenix endpoint swap
This package provides a text-to-speech client that matches the ElevenLabs TTS API contract. You can point it at ElevenLabs or at a Phoenix-hosted TTS service that implements the same API shape; switching is a config change (base URL), no code change.
Note: The repo eleven-labs/api-service on GitHub is a PHP OpenAPI consumer library, not the voice TTS API. This client targets the REST TTS API at api.elevenlabs.io (and compatible backends).
Parity with ElevenLabs TTS API
| Feature | ElevenLabs API | This client |
|---|---|---|
| Sync `POST /v1/text-to-speech/:voice_id` | ✅ | ✅ `Synthesize` |
| Stream `POST /v1/text-to-speech/:voice_id/stream` | ✅ | ✅ `SynthesizeStream` |
| Voice settings (stability, similarity_boost, style, speaker_boost) | ✅ | ✅ `VoiceConfig` |
| Model (`model_id`) | ✅ | ✅ `SetModelID` / default `eleven_multilingual_v2` |
| Auth `xi-api-key` header | ✅ | ✅ |
| Output `Accept: audio/mpeg` (mp3) | ✅ | ✅ |
| Retries (5xx, backoff) | N/A | ✅ on sync |
| Visemes (lip sync) | ❌ (no phoneme API) | ✅ client-side approximation |
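The "Retries (5xx, backoff)" row can be pictured with a minimal sketch like the one below. It is not this package's implementation: `statusError`, `retryable`, and `withRetry` are hypothetical names, and the doubling delay is an assumed backoff policy.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// statusError is a hypothetical error type carrying the HTTP status code.
type statusError struct{ code int }

func (e statusError) Error() string { return fmt.Sprintf("tts: HTTP %d", e.code) }

// retryable reports whether err wraps a 5xx statusError.
func retryable(err error) bool {
	var se statusError
	return errors.As(err, &se) && se.code >= 500 && se.code <= 599
}

// withRetry runs call up to attempts times, doubling the delay after each
// retryable failure. Non-retryable errors (for example 4xx) return at once.
func withRetry(attempts int, base time.Duration, call func() ([]byte, error)) ([]byte, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		audio, err := call()
		if err == nil {
			return audio, nil
		}
		lastErr = err
		if !retryable(err) {
			return nil, err
		}
		if i < attempts-1 {
			time.Sleep(base << i) // base, 2*base, 4*base, ...
		}
	}
	return nil, lastErr
}

func main() {
	calls := 0
	audio, err := withRetry(3, 10*time.Millisecond, func() ([]byte, error) {
		calls++
		if calls < 3 {
			return nil, statusError{code: 503} // simulated transient failure
		}
		return []byte("mp3-bytes"), nil
	})
	fmt.Println(len(audio), err, calls)
}
```

Only the synchronous call is wrapped here because a stream that has already started delivering audio cannot be replayed transparently.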
Optional ElevenLabs features not used here: output_format query, optimize_streaming_latency, WebSocket streaming. For “just change endpoint” to Phoenix, the host only needs to implement the same sync + stream JSON body and return audio/mpeg.
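To make the shared contract concrete, here is a minimal sketch of how a sync request could be built with only the standard library. It is an illustration, not this package's code: `newSynthesisRequest`, `synthesisBody`, and `voiceSettings` are hypothetical names, and the sketch assumes the base URL already ends in `/v1`, as in the examples in this README.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// voiceSettings carries two of the settings listed in the parity table.
// The struct itself is hypothetical, not a type from this package.
type voiceSettings struct {
	Stability       float64 `json:"stability"`
	SimilarityBoost float64 `json:"similarity_boost"`
}

// synthesisBody mirrors the JSON body described in this README:
// text, model_id, voice_settings.
type synthesisBody struct {
	Text          string        `json:"text"`
	ModelID       string        `json:"model_id"`
	VoiceSettings voiceSettings `json:"voice_settings"`
}

// newSynthesisRequest builds the sync request. Appending "/stream" to the
// path would give the streaming variant with the same body.
func newSynthesisRequest(baseURL, voiceID, apiKey, text string) (*http.Request, error) {
	payload, err := json.Marshal(synthesisBody{
		Text:          text,
		ModelID:       "eleven_multilingual_v2",
		VoiceSettings: voiceSettings{Stability: 0.5, SimilarityBoost: 0.75},
	})
	if err != nil {
		return nil, err
	}
	endpoint := strings.TrimRight(baseURL, "/") + "/text-to-speech/" + url.PathEscape(voiceID)
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("xi-api-key", apiKey)
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "audio/mpeg")
	return req, nil
}

func main() {
	req, err := newSynthesisRequest("https://api.elevenlabs.io/v1", "voice-123", "key", "Hello world")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Because the host appears only in `baseURL`, pointing the same builder at a Phoenix URL changes nothing else in the request.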
Which TTS backend? (decision table)
| Env / condition | Backend used |
|---|---|
| `TTS_VOICE_ID` unset (or no auth) | Mock (no real synthesis) |
| `TTS_VOICE_ID` + `TTS_API_KEY` or `ELEVENLABS_*` set, `TTS_BASE_URL` unset | ElevenLabs (api.elevenlabs.io) |
| `TTS_BASE_URL` set (e.g. Phoenix) + auth + voice | Phoenix (or other compatible host) |
| `USE_PHOENIX_TTS=true` | Prefer Phoenix; use `TTS_BASE_URL` or `PHOENIX_TTS_BASE_URL` |
Auth: the default header is `xi-api-key` (ElevenLabs). For Phoenix with a Bearer token, set `TTS_AUTH_HEADER_NAME=Authorization` and `TTS_AUTH_HEADER_VALUE=Bearer <token>`.
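The decision table can be read as a small selection function. The sketch below is one possible reading, not the package's actual logic: `chooseBackend` and the `backend` type are hypothetical, the `ELEVENLABS_*` variables are left out because their exact names are not listed here, and the precedence of `TTS_BASE_URL` over `PHOENIX_TTS_BASE_URL` is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

type backend string

const (
	backendMock       backend = "mock"
	backendElevenLabs backend = "elevenlabs"
	backendPhoenix    backend = "phoenix"

	defaultBaseURL = "https://api.elevenlabs.io/v1"
)

// chooseBackend takes a lookup function shaped like os.Getenv so the
// decision can be exercised without touching the real environment.
func chooseBackend(getenv func(string) string) (backend, string) {
	// Any configured credential counts as "auth" in this sketch.
	hasAuth := getenv("TTS_API_KEY") != "" || getenv("TTS_AUTH_HEADER_VALUE") != ""
	if getenv("TTS_VOICE_ID") == "" || !hasAuth {
		return backendMock, ""
	}
	base := getenv("TTS_BASE_URL")
	// Assumed precedence: TTS_BASE_URL wins; the Phoenix-specific URL is
	// consulted only when the feature flag is on.
	if base == "" && getenv("USE_PHOENIX_TTS") == "true" {
		base = getenv("PHOENIX_TTS_BASE_URL")
	}
	if base != "" {
		return backendPhoenix, base
	}
	return backendElevenLabs, defaultBaseURL
}

func main() {
	b, base := chooseBackend(os.Getenv)
	fmt.Printf("backend=%s baseURL=%q\n", b, base)
}
```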
Using with Phoenix (swap endpoint)
- Phoenix TTS service must expose the same contract:
  - `POST /v1/text-to-speech/:voice_id` with body `{"text","model_id","voice_settings"}` → response: raw mp3
  - `POST /v1/text-to-speech/:voice_id/stream` with the same body → response: streaming mp3
  - Health: `GET /health` at the same origin (e.g. `{baseURL}/../health`) returning 2xx so `tts.Service.Health(ctx)` can be used for readiness.
- Configure the app with the Phoenix base URL (and optional auth):

  ```bash
  export TTS_BASE_URL="https://phoenix.example.com/tts/v1"
  export TTS_VOICE_ID="default-voice-id"
  # Optional: Phoenix uses Bearer token
  export TTS_AUTH_HEADER_NAME="Authorization"
  export TTS_AUTH_HEADER_VALUE="Bearer your-token"
  # Or feature flag to force Phoenix
  export USE_PHOENIX_TTS=true
  export PHOENIX_TTS_BASE_URL="https://phoenix.example.com/tts/v1"
  ```
- Health check: The client’s `Health(ctx)` calls `GET {baseURL}/../health` when the base URL is not ElevenLabs. Wire this into your readiness probe or a `/ready` endpoint if you need TTS to be up before accepting traffic.
- In code (e.g. for reuse in another project):

  ```go
  opts := tts.TTSOptions{
      BaseURL:         "https://phoenix.example.com/tts/v1",
      AuthHeaderName:  "Authorization",
      AuthHeaderValue: "Bearer token",
  }
  svc := tts.NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts)
  if err := svc.Health(ctx); err != nil {
      /* not ready */
  }
  audio, err := svc.Synthesize(ctx, "Hello world")
  ```
No code change beyond config: same interface, different base URL and optional auth header.
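The `{baseURL}/../health` form used by the health check can be resolved with the standard library, as in this minimal sketch. It takes the path literally, so `https://phoenix.example.com/tts/v1` maps to `https://phoenix.example.com/tts/health`. `healthURL` and `probe` are hypothetical helpers, not functions from this package.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/url"
	"path"
)

// healthURL replaces the last path segment of baseURL with "health".
func healthURL(baseURL string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	u.Path = path.Join(u.Path, "..", "health")
	u.RawQuery = ""
	return u.String(), nil
}

// probe issues the GET and treats any 2xx status as healthy.
func probe(ctx context.Context, client *http.Client, baseURL string) error {
	target, err := healthURL(baseURL)
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("health check: unexpected status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	target, err := healthURL("https://phoenix.example.com/tts/v1")
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	fmt.Println(target)
}
```

A readiness handler could call `probe` with a short context timeout and report not ready on any error.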
Reuse across projects
This package lives in virtual-banker and can be depended on as a Go module path (e.g. github.com/your-org/virtual-banker/backend/tts or via a shared repo). Any project that needs TTS can:
- Depend on this package.
- Use `tts.Service` and either `NewMockTTSService()` or `NewElevenLabsTTSServiceWithOptions(apiKey, voiceID, baseURL)` / `NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts)` for custom auth.
- Set `baseURL` to ElevenLabs (`""` or `https://api.elevenlabs.io/v1`) or to the Phoenix TTS base URL.
The interface (Synthesize, SynthesizeStream, GetVisemes) stays the same regardless of backend.
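A consuming project can rely on that stability by depending on a narrow interface rather than a concrete backend. The sketch below is self-contained and hypothetical: `synthesizer`, `fakeTTS`, and `announce` are illustrative names, and the `[]byte` return type of `Synthesize` is an assumption, since this README only shows the call `svc.Synthesize(ctx, "Hello world")`.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// synthesizer is a consumer-side interface covering only the method this
// caller needs. The return type is assumed, not taken from the package.
type synthesizer interface {
	Synthesize(ctx context.Context, text string) ([]byte, error)
}

// fakeTTS stands in for a mock backend in tests and local runs.
type fakeTTS struct{}

func (fakeTTS) Synthesize(ctx context.Context, text string) ([]byte, error) {
	return []byte("fake-audio:" + text), nil
}

// announce works with any backend that satisfies synthesizer, so swapping
// ElevenLabs for Phoenix (or a fake) does not touch this function.
func announce(s synthesizer, name string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return s.Synthesize(ctx, "Hello, "+name)
}

func main() {
	audio, err := announce(fakeTTS{}, "Ada")
	if err != nil {
		fmt.Println("synthesize:", err)
		return
	}
	fmt.Println(len(audio), "bytes")
}
```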