
TTS: configurable auth, Health check, Phoenix options; .env.example; Gitea CI workflow

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-02-10 16:54:10 -08:00


TTS package — ElevenLabs-compatible, Phoenix endpoint swap

This package provides a text-to-speech client that matches the ElevenLabs TTS API contract. You can point it at ElevenLabs or at a Phoenix-hosted TTS service that implements the same API shape; switching backends is a configuration change (the base URL), not a code change.

Note: The repo eleven-labs/api-service on GitHub is a PHP OpenAPI consumer library, not the voice TTS API. This client targets the REST TTS API at api.elevenlabs.io (and compatible backends).

Parity with ElevenLabs TTS API

Feature	ElevenLabs API	This client
Sync `POST /v1/text-to-speech/:voice_id`	✅	✅ `Synthesize`
Stream `POST /v1/text-to-speech/:voice_id/stream`	✅	✅ `SynthesizeStream`
Voice settings (stability, similarity_boost, style, use_speaker_boost)	✅	✅ `VoiceConfig`
Model (`model_id`)	✅	✅ `SetModelID` / default `eleven_multilingual_v2`
Auth `xi-api-key` header	✅	✅
Output `Accept: audio/mpeg` (mp3)	✅	✅
Retries (5xx, backoff)	—	✅ on sync
Visemes (lip sync)	❌ (no phoneme API)	✅ client-side approximation

Optional ElevenLabs features not used here: the output_format query parameter, optimize_streaming_latency, and WebSocket streaming. For a drop-in endpoint swap to Phoenix, the host only needs to accept the same JSON body on the sync and stream routes and return audio/mpeg.
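To make the wire contract concrete, here is a minimal sketch of the sync request a compatible backend would receive. The helper newSyncRequest and the struct names are hypothetical and not part of this package; the voice setting values are placeholders, and the base URL is assumed to have no trailing slash.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// voiceSettings mirrors the voice settings listed in the parity table.
// ElevenLabs spells the last field use_speaker_boost in JSON.
type voiceSettings struct {
	Stability       float64 `json:"stability"`
	SimilarityBoost float64 `json:"similarity_boost"`
	Style           float64 `json:"style"`
	UseSpeakerBoost bool    `json:"use_speaker_boost"`
}

// synthesizeBody is the JSON body shared by the sync and stream routes.
type synthesizeBody struct {
	Text          string        `json:"text"`
	ModelID       string        `json:"model_id"`
	VoiceSettings voiceSettings `json:"voice_settings"`
}

// newSyncRequest is a hypothetical helper (not part of this package) that
// builds the sync request against any compatible base URL.
func newSyncRequest(baseURL, voiceID, apiKey, text string) (*http.Request, error) {
	payload, err := json.Marshal(synthesizeBody{
		Text:    text,
		ModelID: "eleven_multilingual_v2", // the default named in the table
		// Placeholder values, chosen only for illustration.
		VoiceSettings: voiceSettings{Stability: 0.5, SimilarityBoost: 0.75},
	})
	if err != nil {
		return nil, err
	}
	endpoint := baseURL + "/text-to-speech/" + url.PathEscape(voiceID)
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "audio/mpeg")
	req.Header.Set("xi-api-key", apiKey)
	return req, nil
}

func main() {
	req, err := newSyncRequest("https://api.elevenlabs.io/v1", "voice123", "secret", "Hello world")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

Pointing the same helper at a Phoenix base URL changes only the first argument, which is the property the parity table is meant to guarantee.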

Which TTS backend? (decision table)

Env / condition	Backend used
`TTS_VOICE_ID` unset (or no auth)	Mock (no real synthesis)
`TTS_VOICE_ID` + `TTS_API_KEY` or `ELEVENLABS_*` set, `TTS_BASE_URL` unset	ElevenLabs (api.elevenlabs.io)
`TTS_BASE_URL` set (e.g. Phoenix) + auth + voice	Phoenix (or other compatible host)
`USE_PHOENIX_TTS=true`	Prefer Phoenix; use `TTS_BASE_URL` or `PHOENIX_TTS_BASE_URL`

Auth: the default header is xi-api-key (ElevenLabs). For a Phoenix host that expects a Bearer token, set TTS_AUTH_HEADER_NAME=Authorization and TTS_AUTH_HEADER_VALUE=Bearer <token>.
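One way to read the decision table as code is sketched below. chooseBackend is a hypothetical function, not the package's actual selection logic. It assumes that TTS_BASE_URL wins over PHOENIX_TTS_BASE_URL when both are set, that a custom TTS_AUTH_HEADER_VALUE counts as auth, and it leaves out the ELEVENLABS_* variables because their exact names are not given here.

```go
package main

import (
	"fmt"
	"os"
)

type backend string

const (
	backendMock       backend = "mock"
	backendElevenLabs backend = "elevenlabs"
	backendPhoenix    backend = "phoenix"
)

// chooseBackend is a hypothetical reading of the decision table. It returns
// the backend kind and the base URL that would be used for it.
func chooseBackend(getenv func(string) string) (backend, string) {
	voice := getenv("TTS_VOICE_ID")
	// Assumption: either an API key or a custom auth header value counts as auth.
	hasAuth := getenv("TTS_API_KEY") != "" || getenv("TTS_AUTH_HEADER_VALUE") != ""
	if voice == "" || !hasAuth {
		return backendMock, ""
	}

	baseURL := getenv("TTS_BASE_URL")
	// Assumption: the Phoenix-specific URL is only a fallback for the flag.
	if baseURL == "" && getenv("USE_PHOENIX_TTS") == "true" {
		baseURL = getenv("PHOENIX_TTS_BASE_URL")
	}
	if baseURL == "" {
		return backendElevenLabs, "https://api.elevenlabs.io/v1"
	}
	return backendPhoenix, baseURL
}

func main() {
	kind, baseURL := chooseBackend(os.Getenv)
	fmt.Printf("backend=%s baseURL=%q\n", kind, baseURL)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the selection testable with a plain map.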

Using with Phoenix (swap endpoint)

Phoenix TTS service must expose the same contract:
- POST /v1/text-to-speech/:voice_id — body: {"text","model_id","voice_settings"} → response: raw mp3
- POST /v1/text-to-speech/:voice_id/stream — same body → response: streaming mp3
- Health: GET {baseURL}/../health (one path segment above the base URL, on the same origin) returning 2xx, so tts.Service.Health(ctx) can be used for readiness.
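The three routes above can be sketched as a single handler. This is a hypothetical stand-in for a Phoenix host, not code from this package: compatHandler, probe, and the synth callback are illustrative names, and the handler is mounted as if the base URL were http://host/v1, so the health route sits at /health.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// synthesizeRequest is the JSON body both synthesis routes accept.
type synthesizeRequest struct {
	Text          string          `json:"text"`
	ModelID       string          `json:"model_id"`
	VoiceSettings json.RawMessage `json:"voice_settings"`
}

// compatHandler is a hypothetical stand-in for a Phoenix TTS host. The synth
// callback represents the real engine and is expected to return mp3 bytes.
func compatHandler(synth func(voiceID, text string) []byte) http.Handler {
	const prefix = "/v1/text-to-speech/"
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodGet && r.URL.Path == "/health" {
			w.WriteHeader(http.StatusOK)
			return
		}
		if r.Method != http.MethodPost || !strings.HasPrefix(r.URL.Path, prefix) {
			http.NotFound(w, r)
			return
		}
		// Both /:voice_id and /:voice_id/stream take the same body.
		voiceID := strings.TrimSuffix(strings.TrimPrefix(r.URL.Path, prefix), "/stream")
		if voiceID == "" || strings.Contains(voiceID, "/") {
			http.NotFound(w, r)
			return
		}
		var req synthesizeRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Text == "" {
			http.Error(w, "invalid request body", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "audio/mpeg")
		// A real stream route would write and flush chunks as they are produced.
		w.Write(synth(voiceID, req.Text))
	})
}

// probe sends one in-memory request and reports the status and content type.
func probe(h http.Handler, method, path, body string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, strings.NewReader(body)))
	return rec.Code, rec.Header().Get("Content-Type")
}

func main() {
	h := compatHandler(func(voiceID, text string) []byte { return []byte("fake-mp3") })
	fmt.Println(probe(h, "POST", "/v1/text-to-speech/demo", `{"text":"hi"}`))
	fmt.Println(probe(h, "GET", "/health", ""))
}
```

A handler like this is also useful as a local test double when a real Phoenix deployment is not available.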

Configure the app with the Phoenix base URL (and optional auth):

export TTS_BASE_URL="https://phoenix.example.com/tts/v1"
export TTS_VOICE_ID="default-voice-id"
# Optional: Phoenix uses Bearer token
export TTS_AUTH_HEADER_NAME="Authorization"
export TTS_AUTH_HEADER_VALUE="Bearer your-token"
# Or use the feature flag to prefer Phoenix
export USE_PHOENIX_TTS=true
export PHOENIX_TTS_BASE_URL="https://phoenix.example.com/tts/v1"

Health check: the client’s Health(ctx) calls GET {baseURL}/../health when the base URL does not point at ElevenLabs. Wire it into your readiness probe or a /ready endpoint if TTS must be up before the app accepts traffic.
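As a rough illustration of where {baseURL}/../health lands and how a readiness endpoint could wrap the check, consider the sketch below. healthURL and readyHandler are hypothetical helpers; whether the client itself normalizes the ".." segment before sending is an assumption, as is the 2-second timeout.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"path"
	"time"
)

// healthURL resolves {baseURL}/../health to a clean absolute URL.
func healthURL(baseURL string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", errors.New("base URL must be absolute")
	}
	// path.Join collapses the ".." element, so /tts/v1 becomes /tts/health.
	u.Path = path.Join("/", u.Path, "..", "health")
	return u.String(), nil
}

// readyHandler reports 200 when check succeeds and 503 otherwise. In an app
// using this package, check would typically be the service's Health method.
func readyHandler(check func(context.Context) error) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Assumed timeout; tune it to the probe interval of your platform.
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		if err := check(ctx); err != nil {
			http.Error(w, "tts not ready", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
}

func main() {
	resolved, err := healthURL("https://phoenix.example.com/tts/v1")
	if err != nil {
		panic(err)
	}
	fmt.Println(resolved)

	mux := http.NewServeMux()
	mux.Handle("/ready", readyHandler(func(ctx context.Context) error { return nil }))
	// http.ListenAndServe(":8080", mux) would serve it; omitted so the sketch exits.
	_ = mux
}
```

With the example base URL used in this README, the resolved health URL is under /tts rather than at the server root, which is worth checking against how the Phoenix host is actually routed.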

In code (e.g. for reuse in another project):

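// apiKey, voiceID, and ctx are assumed to be defined by the caller.
opts := tts.TTSOptions{
    BaseURL:         "https://phoenix.example.com/tts/v1",
    AuthHeaderName:  "Authorization",
    AuthHeaderValue: "Bearer token",
}
svc := tts.NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts)
if err := svc.Health(ctx); err != nil {
    return fmt.Errorf("tts backend not ready: %w", err)
}
audio, err := svc.Synthesize(ctx, "Hello world")
if err != nil {
    return fmt.Errorf("synthesize: %w", err)
}
// audio now holds the synthesized mp3 output.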

No code change beyond config: same interface, different base URL and optional auth header.

Reuse across projects

This package lives in virtual-banker and can be imported by its Go module path (e.g. github.com/your-org/virtual-banker/backend/tts) or via a shared repo. Any project that needs TTS can:

- Depend on this package.
- Use tts.Service with NewMockTTSService(), NewElevenLabsTTSServiceWithOptions(apiKey, voiceID, baseURL), or, for custom auth, NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts).
- Set baseURL to ElevenLabs ("" or https://api.elevenlabs.io/v1) or to the Phoenix TTS base URL.

The interface (Synthesize, SynthesizeStream, GetVisemes) stays the same regardless of backend.
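Because the interface is stable across backends, a consuming project can depend on a narrow interface of its own and swap in a fake for tests. The sketch below is hypothetical: the synthesizer interface, fakeSynth, and synthesizeGreeting are illustrative names, and the Synthesize signature (returning a byte slice and an error) is an assumption inferred from the svc.Synthesize(ctx, "Hello world") call shown earlier.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// synthesizer is a consumer-side interface. Its signature is an assumption,
// not the package's confirmed definition of tts.Service.
type synthesizer interface {
	Synthesize(ctx context.Context, text string) ([]byte, error)
}

// fakeSynth stands in for either backend in tests.
type fakeSynth struct{}

func (fakeSynth) Synthesize(_ context.Context, text string) ([]byte, error) {
	return []byte("audio:" + text), nil
}

// synthesizeGreeting depends only on the interface, so the same code runs
// against a mock, ElevenLabs, or a Phoenix-hosted backend.
func synthesizeGreeting(s synthesizer, name string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	audio, err := s.Synthesize(ctx, "Hello, "+name)
	if err != nil {
		return nil, fmt.Errorf("synthesize greeting: %w", err)
	}
	return audio, nil
}

func main() {
	audio, err := synthesizeGreeting(fakeSynth{}, "Ada")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes\n", len(audio))
}
```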
