# Phoenix TTS API contract (ElevenLabs-compatible)

**Last Updated:** 2026-02-10
**Purpose:** So that virtual-banker (and other apps) can “just change the endpoint” to move from ElevenLabs to a Phoenix-hosted TTS service.

---

## Required endpoints

The Phoenix TTS service **must** implement the same HTTP contract as ElevenLabs for the paths below. The base path is the app’s `/tts` or similar; the paths below use the `/v1` prefix.

### 1. Sync text-to-speech

- **Method:** `POST`
- **Path:** `/v1/text-to-speech/:voice_id`
- **Headers:**
  - `Content-Type: application/json`
  - `Accept: audio/mpeg`
  - Auth: either `xi-api-key: <key>` or `Authorization: Bearer <token>` (configurable in client)
- **Body (JSON):**
  ```json
  {
    "text": "Hello world",
    "model_id": "eleven_multilingual_v2",
    "voice_settings": {
      "stability": 0.5,
      "similarity_boost": 0.75,
      "style": 0,
      "use_speaker_boost": true
    }
  }
  ```
- **Response:** `200 OK`, body = raw **mp3** bytes (`audio/mpeg`).
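
The sync contract above can be sketched as a small client helper. This is only an illustration, not the virtual-banker client: the function names `build_tts_request` and `synthesize` are hypothetical, the base URL is assumed to already include the `/v1` prefix, and auth is assumed to use the `xi-api-key` header.

```python
import json
import urllib.request
from urllib.parse import quote


def build_tts_request(base_url: str, voice_id: str, text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a sync TTS request. Hypothetical helper."""
    body = {
        "text": text,
        "model_id": "eleven_multilingual_v2",
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.75,
            "style": 0,
            "use_speaker_boost": True,
        },
    }
    # Assumption: base_url already ends with the /v1 prefix, e.g. ".../tts/v1".
    url = f"{base_url.rstrip('/')}/text-to-speech/{quote(voice_id, safe='')}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
            "xi-api-key": api_key,  # assumption: xi-api-key auth rather than Bearer
        },
    )


def synthesize(req: urllib.request.Request, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw mp3 bytes; non-2xx responses raise HTTPError."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

Keeping request construction separate from sending means the same builder can be pointed at ElevenLabs or at Phoenix by changing only `base_url`.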

### 2. Streaming text-to-speech

- **Method:** `POST`
- **Path:** `/v1/text-to-speech/:voice_id/stream`
- **Headers:** Same as sync.
- **Body:** Same JSON as sync.
- **Response:** `200 OK`, body = **streaming** mp3 (same format).
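
A client can consume the streaming response incrementally instead of buffering the whole file. The sketch below assumes the same headers and JSON body as the sync endpoint; `stream_url`, `iter_chunks`, and `stream_to_file` are hypothetical names, and the 4096-byte chunk size is an arbitrary choice.

```python
import urllib.request
from typing import BinaryIO, Iterator
from urllib.parse import quote


def stream_url(base_url: str, voice_id: str) -> str:
    """URL of the streaming endpoint; base_url is assumed to include the /v1 prefix."""
    return f"{base_url.rstrip('/')}/text-to-speech/{quote(voice_id, safe='')}/stream"


def iter_chunks(resp: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield the response body piece by piece until the server closes the stream."""
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream_to_file(req: urllib.request.Request, path: str, timeout: float = 30.0) -> int:
    """Send a prepared POST request and write mp3 chunks to disk as they arrive."""
    written = 0
    with urllib.request.urlopen(req, timeout=timeout) as resp, open(path, "wb") as out:
        for chunk in iter_chunks(resp):
            out.write(chunk)
            written += len(chunk)
    return written
```

In a real player, each chunk would more likely be handed to an audio decoder than written to a file; the loop itself stays the same.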

### 3. Health (recommended)

- **Method:** `GET`
- **Path:** `/health`, on the same origin as the TTS base URL and next to the version prefix (e.g. `https://phoenix.example.com/tts/health` if the base is `.../tts/v1`)
- **Response:** `200 OK` (body optional; used for readiness checks).
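
One way a client might derive the health URL and use it as a readiness probe is sketched below. It assumes, based only on the example above, that the health path is found by dropping a trailing `/v1` from the base URL and appending `/health`; `health_url` and `is_ready` are hypothetical names.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def health_url(tts_base_url: str) -> str:
    """Map ".../tts/v1" to ".../tts/health" (assumed convention, see lead-in)."""
    parts = urlsplit(tts_base_url)
    path = parts.path.rstrip("/")
    if path.endswith("/v1"):
        path = path[: -len("/v1")]
    return urlunsplit((parts.scheme, parts.netloc, f"{path}/health", "", ""))


def is_ready(tts_base_url: str, timeout: float = 2.0) -> bool:
    """Return True only if GET /health answers 200; any error counts as not ready."""
    try:
        with urllib.request.urlopen(health_url(tts_base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```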

---

## Optional

- **Auth:** If Phoenix uses a different scheme (e.g. Bearer only), clients set `TTS_AUTH_HEADER_NAME` / `TTS_AUTH_HEADER_VALUE`; no API change.
- **Visemes:** For better lip-sync, a future endpoint could return phoneme/viseme timings; client would call it when available.
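
The configurable auth header from the **Auth** item could be read on the client side roughly as follows. The two variable names come from this document; the helper `auth_headers` is hypothetical, and returning no header when either variable is unset is an assumption, since the actual client's fallback is not specified here.

```python
import os
from typing import Dict, Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the auth header from TTS_AUTH_HEADER_NAME / TTS_AUTH_HEADER_VALUE."""
    name = env.get("TTS_AUTH_HEADER_NAME", "").strip()
    value = env.get("TTS_AUTH_HEADER_VALUE", "").strip()
    if not name or not value:
        # Assumption: send no auth header when the pair is incomplete.
        return {}
    return {name: value}


# Example: a Bearer-only Phoenix deployment.
bearer_env = {
    "TTS_AUTH_HEADER_NAME": "Authorization",
    "TTS_AUTH_HEADER_VALUE": "Bearer <token>",
}
print(auth_headers(bearer_env))
```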

---

## Reference

- Virtual-banker TTS client: `virtual-banker/backend/tts` (see `backend/tts/README.md`).
- ElevenLabs TTS API: [Text-to-speech](https://elevenlabs.io/docs/api-reference/text-to-speech), [Stream](https://elevenlabs.io/docs/api-reference/text-to-speech/stream).