Phoenix TTS API contract (ElevenLabs-compatible)
Last Updated: 2026-02-10
Purpose: So virtual-banker (and other apps) can “just change endpoint” from ElevenLabs to a Phoenix-hosted TTS service.
Required endpoints
The Phoenix TTS service must implement the same HTTP contract as ElevenLabs for these paths (base path is the app’s /tts or similar; below uses prefix /v1).
1. Sync text-to-speech
- Method: POST
- Path: /v1/text-to-speech/:voice_id
- Headers:
  - Content-Type: application/json
  - Accept: audio/mpeg
  - Auth: either xi-api-key: <key> or Authorization: Bearer <token> (configurable in client)
- Body (JSON):
    {
      "text": "Hello world",
      "model_id": "eleven_multilingual_v2",
      "voice_settings": {
        "stability": 0.5,
        "similarity_boost": 0.75,
        "style": 0,
        "use_speaker_boost": true
      }
    }
- Response: 200 OK, body = raw mp3 bytes (audio/mpeg).
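As an illustration of the sync contract, here is a minimal Python sketch that builds such a request using only the standard library. The function names build_tts_request and synthesize, the base URL, and the choice of the xi-api-key header are assumptions made for this example; they are not taken from the virtual-banker client. The base URL is assumed to already include the /v1 prefix.

```python
import json
import urllib.parse
import urllib.request


def build_tts_request(base_url, voice_id, text, api_key,
                      model_id="eleven_multilingual_v2"):
    """Build (but do not send) a sync text-to-speech request.

    base_url is assumed to include the /v1 prefix,
    e.g. https://phoenix.example.com/tts/v1 (illustrative value).
    """
    voice = urllib.parse.quote(voice_id, safe="")
    url = f"{base_url.rstrip('/')}/text-to-speech/{voice}"
    payload = {
        "text": text,
        "model_id": model_id,
        # Same voice_settings values as the example body in this contract.
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.75,
            "style": 0,
            "use_speaker_boost": True,
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
            "xi-api-key": api_key,  # or Authorization: Bearer <token>
        },
    )


def synthesize(request, timeout=30):
    """Send the request and return the raw mp3 bytes of the response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()


# Usage (performs a network call, so it is left commented out):
# req = build_tts_request("https://phoenix.example.com/tts/v1",
#                         "some-voice-id", "Hello world", "my-key")
# mp3_bytes = synthesize(req)
```

Keeping request construction separate from sending means that switching from ElevenLabs to Phoenix only changes the base_url argument.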
2. Streaming text-to-speech
- Method: POST
- Path: /v1/text-to-speech/:voice_id/stream
- Headers: Same as sync.
- Body: Same JSON as sync.
- Response: 200 OK, body = streaming mp3 (same format).
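A client might consume the streaming response by reading it in chunks instead of buffering the whole body. The sketch below is one way to do that in Python; stream_url and copy_stream are hypothetical helper names, and the 4096-byte chunk size is an arbitrary choice. The copy loop works with any file-like object that has read(n), which includes the response returned by urllib.request.urlopen.

```python
import io
import urllib.parse


def stream_url(base_url, voice_id):
    """Return the streaming endpoint URL; base_url is assumed to end in /v1."""
    voice = urllib.parse.quote(voice_id, safe="")
    return f"{base_url.rstrip('/')}/text-to-speech/{voice}/stream"


def copy_stream(resp, sink, chunk_size=4096):
    """Copy a file-like response body to sink chunk by chunk.

    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Local demonstration with an in-memory stand-in for the HTTP response.
fake_response = io.BytesIO(b"\x00" * 10000)
out = io.BytesIO()
copied = copy_stream(fake_response, out)
```

In a real client, sink could be an audio player's input pipe or an open file, so playback can start before synthesis finishes.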
3. Health (recommended)
- Method: GET
- Path: /health (at same origin as the TTS base URL, e.g. https://phoenix.example.com/tts/health if base is .../tts/v1)
- Response: 200 OK (body optional; used for readiness).
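The example above places /health next to the /v1 prefix rather than under it. A sketch of how a client could derive that URL and probe readiness follows. The helper names health_url and is_ready are hypothetical, and the rule "drop a trailing /v1, then append /health" is an assumption inferred from the single example in this contract.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def health_url(base_url):
    """Derive the health URL from the TTS base URL.

    Assumption: a trailing /v1 segment is replaced by /health, so
    .../tts/v1 becomes .../tts/health.
    """
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if path.endswith("/v1"):
        path = path[: -len("/v1")]
    return urlunsplit((parts.scheme, parts.netloc, path + "/health", "", ""))


def is_ready(base_url, timeout=5):
    """Return True if GET /health answers 200 OK, False on any error."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, connection failures, and timeouts.
        return False
```

The response body is ignored, matching the contract's note that it is optional.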
Optional
- Auth: If Phoenix uses a different scheme (e.g. Bearer only), clients set TTS_AUTH_HEADER_NAME / TTS_AUTH_HEADER_VALUE; no API change.
- Visemes: For better lip-sync, a future endpoint could return phoneme/viseme timings; client would call it when available.
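The configurable auth header could be resolved on the client side roughly as follows. TTS_AUTH_HEADER_NAME and TTS_AUTH_HEADER_VALUE come from this contract; the fallback variable TTS_API_KEY and the function name auth_headers are hypothetical and exist only for this sketch, which does not reflect how the virtual-banker client is actually implemented.

```python
import os


def auth_headers(env=None):
    """Return the auth header dict for TTS requests.

    Prefers the explicit TTS_AUTH_HEADER_NAME / TTS_AUTH_HEADER_VALUE pair;
    otherwise falls back to xi-api-key from TTS_API_KEY (hypothetical name).
    """
    env = os.environ if env is None else env
    name = env.get("TTS_AUTH_HEADER_NAME")
    value = env.get("TTS_AUTH_HEADER_VALUE")
    if name and value:
        return {name: value}
    api_key = env.get("TTS_API_KEY")  # assumption, not part of the contract
    if api_key:
        return {"xi-api-key": api_key}
    return {}


# Example: a Bearer-only Phoenix deployment.
bearer = auth_headers({
    "TTS_AUTH_HEADER_NAME": "Authorization",
    "TTS_AUTH_HEADER_VALUE": "Bearer abc123",
})
```

Because only the header name and value change, the request path and body stay identical across both auth schemes.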
Reference
- Virtual-banker TTS client: virtual-banker/backend/tts (see backend/tts/README.md).
- ElevenLabs TTS API: Text-to-speech, Stream.