WSS /api/v1/generate/ws

The WebSocket endpoint delivers generated audio as a stream of binary PCM chunks, enabling lower-latency playback compared to the HTTP endpoint.

Supported models: higgs2p5, chatterbox, qwen3-tts

Protocol

The WebSocket session follows a three-step handshake: open the connection, send an auth message, then send the TTS request.
1. Connect

Open a WebSocket connection to wss://api-web.eigenai.com/api/v1/generate/ws.
2. Authenticate

Send a JSON auth message immediately after connecting:
{
  "token": "YOUR_API_KEY",
  "model": "higgs2p5"
}
3. Send TTS request

Send a JSON message with your synthesis parameters:
{
  "text": "Hello, streaming audio world!",
  "voice": "Linda"
}
The server then sends:
  • Binary frames — raw PCM audio chunks (16-bit, 24 kHz, mono)
  • {"type": "complete"} — JSON frame signaling end of stream
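
The stated format (16-bit, 24 kHz, mono) fixes how much playback time each binary frame carries, which is useful when sizing a playback buffer. The sketch below works through that arithmetic; the function name `pcm_duration_seconds` is illustrative and not part of the API, and the sizes of the chunks the server actually sends are not specified here.

```python
# Format constants taken from the frame description above.
SAMPLE_RATE_HZ = 24_000   # 24 kHz
BYTES_PER_SAMPLE = 2      # 16-bit samples
CHANNELS = 1              # mono


def pcm_duration_seconds(num_bytes: int) -> float:
    """Return the playback time represented by num_bytes of raw PCM."""
    bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return num_bytes / bytes_per_second


print(pcm_duration_seconds(48_000))  # 1.0 (one second of audio)
print(pcm_duration_seconds(4_800))   # 0.1
```

In other words, one second of audio in this format is 48,000 bytes, so summing the lengths of the received binary frames gives the total duration of the stream.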

Parameters

Parameters in the TTS request JSON match those of the HTTP endpoint for each model. See Generate Audio for the full parameter list per model.
Model        Key parameters
higgs2p5     text, voice, voice_id, voice_url, voice_settings, sampling
chatterbox   text, language_id, voice_id, audio_prompt_file, exaggeration, temperature
qwen3-tts    text, voice, voice_id, voice_url, language, instructions, voice_settings
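
As a rough illustration of how the table can be used, the sketch below builds the TTS request JSON and flags parameter names that are not listed for the chosen model. The helper `build_tts_request` is hypothetical, and treating the table as a complete allow-list is an assumption: it lists only key parameters, so consult Generate Audio before rejecting anything in real code.

```python
import json

# Key parameters per model, copied from the table above.
# Assumption: used here as an allow-list, although the table may not be exhaustive.
KEY_PARAMS = {
    "higgs2p5": {"text", "voice", "voice_id", "voice_url",
                 "voice_settings", "sampling"},
    "chatterbox": {"text", "language_id", "voice_id", "audio_prompt_file",
                   "exaggeration", "temperature"},
    "qwen3-tts": {"text", "voice", "voice_id", "voice_url", "language",
                  "instructions", "voice_settings"},
}


def build_tts_request(model: str, **params) -> str:
    """Serialize a TTS request, rejecting names not listed for the model."""
    if model not in KEY_PARAMS:
        raise ValueError(f"unsupported model: {model!r}")
    unknown = sorted(set(params) - KEY_PARAMS[model])
    if unknown:
        raise ValueError(f"parameters not listed for {model}: {unknown}")
    return json.dumps(params)


print(build_tts_request("higgs2p5", text="Hello, streaming audio world!", voice="Linda"))
```

Passing a chatterbox-only name such as `exaggeration` with `model="higgs2p5"` raises `ValueError` locally, before anything is sent over the socket.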

Examples

import asyncio
import websockets
import json

API_KEY = "YOUR_API_KEY"
WS_URL = "wss://api-web.eigenai.com/api/v1/generate/ws"

async def stream_audio():
    async with websockets.connect(WS_URL) as ws:
        # Protocol step 2: authenticate (step 1, connecting, happens above)
        await ws.send(json.dumps({"token": API_KEY, "model": "higgs2p5"}))

        # Protocol step 3: send the TTS request
        await ws.send(json.dumps({"text": "Hello, streaming audio world!", "voice": "Linda"}))

        # Receive audio: binary frames are PCM chunks, text frames are JSON
        with open("output.pcm", "wb") as f:
            async for message in ws:
                if isinstance(message, bytes):
                    f.write(message)
                else:
                    data = json.loads(message)
                    if data.get("type") == "complete":
                        print("Stream complete")
                        break

asyncio.run(stream_audio())
The binary frames contain raw PCM audio: 16-bit signed integers, 24 kHz sample rate, mono channel. Raw PCM has no header, so supply these parameters explicitly when reading or playing the data, for example with soundfile (Python) or AudioContext (browser).
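
Because output.pcm from the example above has no header, most players will not open it directly. One option, sketched here with Python's standard `wave` module, is to wrap the bytes in a WAV container. The function name `pcm_to_wav` is illustrative, and the sketch assumes the samples are little-endian (the byte order WAV uses), which this page does not state.

```python
import wave


def pcm_to_wav(pcm_path: str, wav_path: str, sample_rate: int = 24_000,
               channels: int = 1, sample_width: int = 2) -> int:
    """Wrap raw PCM in a WAV header and return the number of frames written.

    Defaults match the stated format: 24 kHz, mono, 16-bit (2 bytes).
    Assumption: samples are little-endian, as WAV requires.
    """
    with open(pcm_path, "rb") as src:
        pcm = src.read()
    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(sample_width)
        dst.setframerate(sample_rate)
        dst.writeframes(pcm)
    return len(pcm) // (channels * sample_width)
```

Calling `pcm_to_wav("output.pcm", "output.wav")` after the stream completes should produce a file that ordinary audio players can open.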