defiQUG 4945abcf7c Docs: Phoenix TTS contract, recommendations (TTS/Gitea/Phoenix), push-all note, Gitea labels for virtual-banker

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-02-10 16:54:22 -08:00

Phoenix TTS API contract (ElevenLabs-compatible)

Last Updated: 2026-02-10
Purpose: Let virtual-banker (and other apps) switch from ElevenLabs to a Phoenix-hosted TTS service by changing only the endpoint.

Required endpoints

The Phoenix TTS service must implement the same HTTP contract as ElevenLabs for the paths below. The base path depends on the deployment (the app’s /tts or similar); the paths below use the prefix /v1.

1. Sync text-to-speech

Method: POST
Path: /v1/text-to-speech/:voice_id
Headers:
- Content-Type: application/json
- Accept: audio/mpeg
- Auth: either xi-api-key: <key> or Authorization: Bearer <token> (configurable in client)

Body (JSON):

{
  "text": "Hello world",
  "model_id": "eleven_multilingual_v2",
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.75,
    "style": 0,
    "use_speaker_boost": true
  }
}

Response: 200 OK, body = raw mp3 bytes (audio/mpeg).
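As an illustration of the sync contract above, here is a minimal client sketch using only the Python standard library. The function names (build_tts_request, synthesize), the placeholder key, and the assumption that the base URL already ends in /v1 are illustrative and are not taken from the virtual-banker client.

```python
import json
import urllib.request
from urllib.parse import quote


def build_tts_request(base_url, voice_id, text, auth_header=("xi-api-key", "YOUR_KEY")):
    """Build (but do not send) a sync text-to-speech request.

    base_url is assumed to include the /v1 prefix, e.g. https://host/tts/v1.
    auth_header is a hypothetical (name, value) pair; use
    ("Authorization", "Bearer <token>") for bearer auth.
    """
    url = f"{base_url.rstrip('/')}/text-to-speech/{quote(voice_id, safe='')}"
    body = {
        "text": text,
        "model_id": "eleven_multilingual_v2",
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.75,
            "style": 0,
            "use_speaker_boost": True,
        },
    }
    name, value = auth_header
    headers = {
        "Content-Type": "application/json",
        "Accept": "audio/mpeg",
        name: value,
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


def synthesize(request, timeout=30):
    """Send the request and return the raw mp3 bytes.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()
```

Separating request construction from sending keeps the contract (URL, headers, body) easy to check without a running service.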

2. Streaming text-to-speech

Method: POST
Path: /v1/text-to-speech/:voice_id/stream
Headers: Same as sync.
Body: Same JSON as sync.
Response: 200 OK, body = streaming mp3 (same format).
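A client might consume the streaming response in fixed-size chunks instead of buffering the whole body. The sketch below assumes the base URL ends in /v1; stream_url, copy_stream, stream_to_file, and the 4096-byte chunk size are hypothetical choices, not part of the contract.

```python
import io
import urllib.request
from urllib.parse import quote


def stream_url(base_url: str, voice_id: str) -> str:
    """Return the streaming endpoint URL for a voice."""
    return f"{base_url.rstrip('/')}/text-to-speech/{quote(voice_id, safe='')}/stream"


def copy_stream(resp: io.BufferedIOBase, sink: io.BufferedIOBase, chunk_size: int = 4096) -> int:
    """Copy resp to sink chunk by chunk; return the number of bytes copied."""
    total = 0
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:  # empty read means the server finished the stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def stream_to_file(request: urllib.request.Request, path: str, timeout: float = 30) -> int:
    """Send a prepared POST request and write the mp3 stream to path."""
    with urllib.request.urlopen(request, timeout=timeout) as resp, open(path, "wb") as out:
        return copy_stream(resp, out)
```

Because copy_stream only needs read and write, the sink could equally be an audio player pipe or an HTTP response to a browser.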

3. Health (recommended)

Method: GET
Path: /health (on the same origin as the TTS base URL, next to the version prefix, e.g. https://phoenix.example.com/tts/health if base is .../tts/v1)
Response: 200 OK (body optional; used for readiness).
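One way a client could derive the health URL from its configured base URL, following the example above, is to replace the last path segment (the version prefix) with health. This is a sketch under that assumption; health_url and is_ready are hypothetical names.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def health_url(base_url: str) -> str:
    """Replace the trailing version segment of base_url with 'health'."""
    parts = urlsplit(base_url)
    segments = [s for s in parts.path.split("/") if s]
    parent = segments[:-1]  # drop the version prefix, e.g. 'v1'
    path = "/" + "/".join(parent + ["health"])
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def is_ready(base_url: str, timeout: float = 5) -> bool:
    """Return True only if GET /health answers 200; any error counts as not ready."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Treating every network error as "not ready" suits a readiness probe, where the caller only needs a yes or no.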

Optional

Auth: If Phoenix uses a different scheme (e.g. Bearer only), clients set TTS_AUTH_HEADER_NAME / TTS_AUTH_HEADER_VALUE; the API contract does not change.
Visemes: For better lip-sync, a future endpoint could return phoneme/viseme timings; client would call it when available.
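The auth override mentioned above could be resolved on the client roughly as follows. This sketch assumes TTS_AUTH_HEADER_NAME and TTS_AUTH_HEADER_VALUE are environment variables and that the client falls back to xi-api-key when they are unset; both the fallback rule and the resolve_auth_header name are assumptions, not the documented behaviour of the virtual-banker client.

```python
import os


def resolve_auth_header(env=None, api_key=""):
    """Return the auth header dict to attach to TTS requests.

    env defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    name = env.get("TTS_AUTH_HEADER_NAME")
    value = env.get("TTS_AUTH_HEADER_VALUE")
    if name and value:
        # Explicit override, e.g. Authorization: Bearer <token>
        return {name: value}
    # Assumed default: the ElevenLabs-style key header
    return {"xi-api-key": api_key}
```

With this shape, moving from ElevenLabs to a Bearer-only Phoenix deployment is a configuration change and the request-building code stays untouched.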

Reference

Virtual-banker TTS client: virtual-banker/backend/tts (see backend/tts/README.md).
ElevenLabs TTS API: Text-to-speech, Stream.
