virtual-banker/backend/tts/README.md
defiQUG 9839401d1d
TTS: configurable auth, Health check, Phoenix options; .env.example; Gitea CI workflow
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 16:54:10 -08:00


TTS package — ElevenLabs-compatible, Phoenix endpoint swap

This package provides a text-to-speech client that matches the ElevenLabs TTS API contract. You can point it at ElevenLabs or at a Phoenix-hosted TTS service that implements the same API shape; switching is a config change (base URL), no code change.

Note: The repo eleven-labs/api-service on GitHub is a PHP OpenAPI consumer library, not the voice TTS API. This client targets the REST TTS API at api.elevenlabs.io (and compatible backends).


Parity with ElevenLabs TTS API

| Feature | ElevenLabs API | This client |
|---|---|---|
| Sync | POST /v1/text-to-speech/:voice_id | Synthesize |
| Stream | POST /v1/text-to-speech/:voice_id/stream | SynthesizeStream |
| Voice settings | stability, similarity_boost, style, speaker_boost | VoiceConfig |
| Model | model_id | SetModelID / default eleven_multilingual_v2 |
| Auth | xi-api-key header | xi-api-key header |
| Output | Accept: audio/mpeg (mp3) | Accept: audio/mpeg (mp3) |
| Retries | | 5xx, backoff, on sync |
| Visemes (lip sync) | no phoneme API | client-side approximation |

Optional ElevenLabs features not used here: output_format query, optimize_streaming_latency, WebSocket streaming. For the endpoint swap to Phoenix to work, the host only needs to accept the same JSON body on the sync and stream routes and return audio/mpeg.
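The contract above can be sketched as one request and one stub response. This is a minimal illustration, not the package's implementation: the type ttsRequest, the functions newSynthesizeRequest, stubTTSHandler, and roundTrip, the host tts.example.test, and the voice-setting values are all hypothetical. Only the route, the JSON keys, the xi-api-key header, and the audio/mpeg media type come from this README.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// ttsRequest mirrors the JSON body keys named in this README.
// The Go type itself is illustrative, not the package's own.
type ttsRequest struct {
	Text          string             `json:"text"`
	ModelID       string             `json:"model_id"`
	VoiceSettings map[string]float64 `json:"voice_settings,omitempty"`
}

// newSynthesizeRequest builds the sync call: POST {baseURL}/text-to-speech/{voiceID}.
func newSynthesizeRequest(baseURL, voiceID, apiKey, text string) (*http.Request, error) {
	body, err := json.Marshal(ttsRequest{
		Text:    text,
		ModelID: "eleven_multilingual_v2",
		// Example values only; real settings depend on the voice.
		VoiceSettings: map[string]float64{"stability": 0.5, "similarity_boost": 0.75},
	})
	if err != nil {
		return nil, err
	}
	endpoint := strings.TrimRight(baseURL, "/") + "/text-to-speech/" + voiceID
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "audio/mpeg")
	req.Header.Set("xi-api-key", apiKey)
	return req, nil
}

// stubTTSHandler stands in for a compatible host: it checks method and path,
// decodes the body, and answers with placeholder bytes typed as audio/mpeg.
func stubTTSHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost || !strings.HasPrefix(r.URL.Path, "/v1/text-to-speech/") {
		http.NotFound(w, r)
		return
	}
	var in ttsRequest
	if err := json.NewDecoder(r.Body).Decode(&in); err != nil || in.Text == "" {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "audio/mpeg")
	w.Write([]byte("fake-mp3:" + in.Text))
}

// roundTrip passes one request straight to the stub handler (no network)
// and returns the response content type and body.
func roundTrip(text string) (string, string, error) {
	req, err := newSynthesizeRequest("https://tts.example.test/v1", "voice-123", "test-key", text)
	if err != nil {
		return "", "", err
	}
	rec := httptest.NewRecorder()
	stubTTSHandler(rec, req)
	return rec.Header().Get("Content-Type"), rec.Body.String(), nil
}

func main() {
	contentType, audio, err := roundTrip("Hello world")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(contentType, len(audio))
}
```

A stub like this is also a cheap way to check a Phoenix deployment's request parsing before real synthesis is wired in.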


Which TTS backend? (decision table)

| Env / condition | Backend used |
|---|---|
| TTS_VOICE_ID unset (or no auth) | Mock (no real synthesis) |
| TTS_VOICE_ID + TTS_API_KEY or ELEVENLABS_* set, TTS_BASE_URL unset | ElevenLabs (api.elevenlabs.io) |
| TTS_BASE_URL set (e.g. Phoenix) + auth + voice | Phoenix (or other compatible host) |
| USE_PHOENIX_TTS=true | Prefer Phoenix; use TTS_BASE_URL or PHOENIX_TTS_BASE_URL |

Auth: the default header is xi-api-key (ElevenLabs). For Phoenix with a Bearer token, set TTS_AUTH_HEADER_NAME=Authorization and TTS_AUTH_HEADER_VALUE=Bearer <token>.
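One way to read the decision table is as a small pure function over environment values. The sketch below is an interpretation, not the package's actual wiring. The backendConfig type and resolveBackend function are hypothetical. It assumes TTS_BASE_URL takes precedence over PHOENIX_TTS_BASE_URL, and that any non-mock backend needs an auth value. The ELEVENLABS_* variables are left out because their exact names are not spelled out here.

```go
package main

import (
	"fmt"
	"strings"
)

// backendConfig is a hypothetical result type for this sketch.
type backendConfig struct {
	Backend     string // "mock", "elevenlabs", or "phoenix"
	BaseURL     string
	HeaderName  string
	HeaderValue string
}

// resolveBackend applies the decision table to a map of environment values.
func resolveBackend(env map[string]string) backendConfig {
	// Auth: an explicit header name/value pair wins; otherwise fall back
	// to TTS_API_KEY sent under the default xi-api-key header.
	name, value := env["TTS_AUTH_HEADER_NAME"], env["TTS_AUTH_HEADER_VALUE"]
	if name == "" || value == "" {
		name, value = "xi-api-key", env["TTS_API_KEY"]
	}

	// No voice or no auth: mock backend, no real synthesis.
	if env["TTS_VOICE_ID"] == "" || value == "" {
		return backendConfig{Backend: "mock"}
	}

	// Assumed precedence: TTS_BASE_URL first, then PHOENIX_TTS_BASE_URL
	// when USE_PHOENIX_TTS=true.
	base := env["TTS_BASE_URL"]
	if base == "" && strings.EqualFold(env["USE_PHOENIX_TTS"], "true") {
		base = env["PHOENIX_TTS_BASE_URL"]
	}

	if base == "" {
		return backendConfig{
			Backend:     "elevenlabs",
			BaseURL:     "https://api.elevenlabs.io/v1",
			HeaderName:  name,
			HeaderValue: value,
		}
	}
	return backendConfig{Backend: "phoenix", BaseURL: base, HeaderName: name, HeaderValue: value}
}

func main() {
	cfg := resolveBackend(map[string]string{
		"TTS_VOICE_ID":          "default-voice-id",
		"TTS_BASE_URL":          "https://phoenix.example.com/tts/v1",
		"TTS_AUTH_HEADER_NAME":  "Authorization",
		"TTS_AUTH_HEADER_VALUE": "Bearer your-token",
	})
	fmt.Println(cfg.Backend, cfg.BaseURL, cfg.HeaderName)
}
```

Taking a map instead of calling os.Getenv directly keeps the selection logic easy to test with table-driven cases.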


Using with Phoenix (swap endpoint)

  1. The Phoenix TTS service must expose the same contract:

    • POST /v1/text-to-speech/:voice_id — body: {"text","model_id","voice_settings"} → response: raw mp3
    • POST /v1/text-to-speech/:voice_id/stream — same body → response: streaming mp3
    • Health: GET /health at the same origin (e.g. {baseURL}/../health) returning 2xx so tts.Service.Health(ctx) can be used for readiness.
  2. Configure the app with the Phoenix base URL (and optional auth):

    export TTS_BASE_URL="https://phoenix.example.com/tts/v1"
    export TTS_VOICE_ID="default-voice-id"
    # Optional: Phoenix uses Bearer token
    export TTS_AUTH_HEADER_NAME="Authorization"
    export TTS_AUTH_HEADER_VALUE="Bearer your-token"
    # Or feature flag to force Phoenix
    export USE_PHOENIX_TTS=true
    export PHOENIX_TTS_BASE_URL="https://phoenix.example.com/tts/v1"
    
  3. Health check: The client's Health(ctx) calls GET {baseURL}/../health when the base URL is not ElevenLabs. Wire this into your readiness probe or a /ready endpoint if you need TTS to be up before accepting traffic.

  4. In code (e.g. for reuse in another project):

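    // apiKey, voiceID, and ctx are supplied by the caller.
    opts := tts.TTSOptions{
        BaseURL:         "https://phoenix.example.com/tts/v1",
        AuthHeaderName:  "Authorization",
        AuthHeaderValue: "Bearer token",
    }
    svc := tts.NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts)
    if err := svc.Health(ctx); err != nil {
        // Backend not ready: fail the readiness check or retry later.
    }
    audio, err := svc.Synthesize(ctx, "Hello world")
    if err != nil {
        // Handle the synthesis error.
    }
    // audio now holds the synthesized result from the backend.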
    

No code change beyond config: same interface, different base URL and optional auth header.


Reuse across projects

This package lives in virtual-banker and can be depended on as a Go module path (e.g. github.com/your-org/virtual-banker/backend/tts or via a shared repo). Any project that needs TTS can:

  • Depend on this package.
  • Use tts.Service and either NewMockTTSService() or NewElevenLabsTTSServiceWithOptions(apiKey, voiceID, baseURL) / NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts) for custom auth.
  • Set baseURL to ElevenLabs ("" or https://api.elevenlabs.io/v1) or to the Phoenix TTS base URL.

The interface (Synthesize, SynthesizeStream, GetVisemes) stays the same regardless of backend.