POST /api/v1/generate
Content-Type: multipart/form-data
Parameter support differs by model. Check the Model Library for model-specific compatibility.

Authentication

Send your API key in the Authorization header as a Bearer token.
Authorization: Bearer YOUR_API_KEY
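
To keep the key out of shell history and scripts, the header can be built from an environment variable. The sketch below is one way to do that; the variable name EIGEN_API_KEY and the helper auth_header are assumptions made for this example, not names defined by the API.

```shell
# Assumption: the API key lives in an environment variable named EIGEN_API_KEY
# (a hypothetical name chosen for this sketch).

# Print the Authorization header; fail if the key is unset or empty.
auth_header() {
  if [ -z "${EIGEN_API_KEY:-}" ]; then
    echo "EIGEN_API_KEY is not set" >&2
    return 1
  fi
  printf 'Authorization: Bearer %s' "$EIGEN_API_KEY"
}

# Usage (not executed here; requires a valid key and network access):
#   curl -X POST https://api-web.eigenai.com/api/v1/generate \
#     -H "$(auth_header)" \
#     -F "model=whisper_v3_turbo" \
#     -F "file=@/path/to/audio.mp3"
```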

Audio Transcription (ASR)

Supported model: Whisper V3 Turbo (model=whisper_v3_turbo)

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Required | Must be whisper_v3_turbo. |
| file | file | Required | Audio file to transcribe (MP3, WAV, M4A, OGG, WebM). |
| language | string | Optional | Spoken language code (default en). Supports 99 languages. |
| response_format | string | Optional | Output format: json or text. |

Example

curl -X POST https://api-web.eigenai.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "model=whisper_v3_turbo" \
  -F "file=@/path/to/audio.mp3" \
  -F "language=en" \
  -F "response_format=json"
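
Since response_format accepts only json or text, a script can check the value before sending the request. The following is a minimal sketch; valid_asr_format is a hypothetical helper name, and the output file name transcript.txt is only an illustration.

```shell
# Return success only for the two response_format values listed above.
# valid_asr_format is a hypothetical helper for this sketch.
valid_asr_format() {
  case "${1:-}" in
    json|text) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not executed here; requires a valid key and network access):
#   fmt=text
#   valid_asr_format "$fmt" && curl -X POST https://api-web.eigenai.com/api/v1/generate \
#     -H "Authorization: Bearer YOUR_API_KEY" \
#     -F "model=whisper_v3_turbo" \
#     -F "file=@/path/to/audio.mp3" \
#     -F "response_format=$fmt" \
#     --output transcript.txt
```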

Text-to-Speech (TTS)

Three models are available. All accept multipart/form-data and return a WAV audio file by default. For real-time streaming over WebSocket, see Stream Audio. To upload a voice reference for cloning, see Upload Voice Reference.

Higgs Audio V2.5 (model=higgs2p5)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Required | Must be higgs2p5. |
| text | string | Required | Text to convert to speech. |
| voice | string | Optional | Voice preset (e.g. Linda, Jack). |
| voice_reference_file | file | Optional | Audio file for voice cloning (WAV, MP3). |
| voice_id | string | Optional | Stored voice ID returned by Upload Voice Reference. |
| voice_url | string | Optional | External URL to a voice reference audio sample. |
| voice_name | string | Optional | Name of a saved voice from the voice library. |
| voice_settings | string | Optional | JSON string with voice settings. Supports speed (default 1.0). |
| sampling | string | Optional | JSON string with sampling controls: temperature (default 1.0), top_p (default 0.95), top_k (default 50). |
| stream | boolean | Optional | false = return WAV file (default); true = HTTP SSE streaming. |
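
Because voice_settings and sampling are JSON strings sent as form fields, it can help to build them in one place. The sketch below assumes the JSON keys are exactly the names listed above (speed, temperature, top_p, top_k); json_voice_settings and json_sampling are hypothetical helper names.

```shell
# Compose the JSON strings for the voice_settings and sampling fields.
# Both helpers are hypothetical; defaults mirror the values listed above.

json_voice_settings() {
  # $1 = speed (default 1.0)
  printf '{"speed": %s}' "${1:-1.0}"
}

json_sampling() {
  # $1 = temperature, $2 = top_p, $3 = top_k
  printf '{"temperature": %s, "top_p": %s, "top_k": %s}' \
    "${1:-1.0}" "${2:-0.95}" "${3:-50}"
}

# Usage (not executed here; requires a valid key and network access):
#   curl -X POST https://api-web.eigenai.com/api/v1/generate \
#     -H "Authorization: Bearer YOUR_API_KEY" \
#     -F "model=higgs2p5" \
#     -F "text=Hello from Higgs Audio." \
#     -F "voice=Jack" \
#     --form-string "voice_settings=$(json_voice_settings 1.2)" \
#     --form-string "sampling=$(json_sampling 0.7 0.9 40)" \
#     --output speech.wav
```

The usage comment passes the JSON with curl's --form-string rather than -F, so that curl treats the value literally and does not interpret characters such as a leading @ or a ;type= suffix.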

ChatterBox Voice Twin (model=chatterbox)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Required | Must be chatterbox. |
| text | string | Required | Text to convert to speech (≤ 1,000 characters recommended). |
| language_id | string | Optional | Language code (e.g. en, zh, es, ja). Supports 23 languages. Default en. |
| audio_prompt_file | file | Optional | Voice reference clip for voice cloning (WAV/MP3/M4A/OGG, max 30 s). |
| voice_id | string | Optional | Stored voice ID returned by Upload Voice Reference. |
| preset_url | string | Optional | URL to a voice preset audio sample. |
| exaggeration | number | Optional | Expressiveness: 0.0 = subtle, 0.5 = balanced, 1.0+ = highly animated (default 0.5). |
| temperature | number | Optional | Sampling temperature (default 0.8). |
| diffusion_steps | number | Optional | Quality vs. latency. Higher = better quality, slower (default 5). |
| max_tokens | integer | Optional | Upper bound on generated tokens (default 3000). |
| top_p | number | Optional | Nucleus sampling ceiling (default 1.0). |
| min_p | number | Optional | Min-p sampling floor (default 0.05). |
| repetition_penalty | number | Optional | Penalizes repeated tokens (default 1.2). |
| seed | integer | Optional | Seed for reproducible generation (null = random). |
| stream | boolean | Optional | false = return WAV file (default); true = HTTP SSE streaming. |
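
The text field carries a recommended limit of 1,000 characters, which a script can check before calling the API. This is a rough sketch; within_text_limit is a hypothetical helper, and the reference clip path in the usage comment is a placeholder.

```shell
# Return success if the text is within the recommended 1,000-character limit.
# within_text_limit is a hypothetical helper. Note: ${#1} counts characters in
# bash under a UTF-8 locale, but some shells (for example dash) count bytes.
within_text_limit() {
  [ "${#1}" -le 1000 ]
}

# Usage (not executed here; requires a valid key and network access):
#   text="Hello, this voice was cloned from a short reference clip."
#   within_text_limit "$text" && curl -X POST https://api-web.eigenai.com/api/v1/generate \
#     -H "Authorization: Bearer YOUR_API_KEY" \
#     -F "model=chatterbox" \
#     -F "text=$text" \
#     -F "audio_prompt_file=@/path/to/reference.wav" \
#     -F "exaggeration=0.7" \
#     -F "seed=42" \
#     --output speech.wav
```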

Qwen3 TTS (model=qwen3-tts)

Supports named speakers (CustomVoice mode) or voice cloning (Base mode). voice cannot be combined with voice_id or voice_url.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Required | Must be qwen3-tts. |
| text | string | Required | Text to synthesize. |
| voice | string | Optional | Named speaker for CustomVoice mode: Vivian, Serena, Uncle_Fu, Dylan, Eric, Ryan, Aiden, Ono_Anna, Sohee. Cannot be used with voice_id or voice_url. |
| voice_id | string | Optional | Stored voice ID for Base mode (from Upload Voice Reference). |
| voice_url | string | Optional | External URL to voice reference audio (Base mode). |
| voice_settings | string | Optional | JSON string with voice settings. Supports speed (default 1.0). |
| language | string | Optional | Auto, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish (default Auto). |
| instructions | string | Optional | Style/emotion control (e.g. "speak cheerfully"). |
| response_format | string | Optional | Output format: wav (default), pcm, mp3, flac, aac, opus. |
| stream | boolean | Optional | false = return audio file (default); true = HTTP SSE streaming. |
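
The rule that voice cannot be combined with voice_id or voice_url can be enforced client-side before a request is sent. The sketch below shows one way to do this; check_qwen3_voice_args is a hypothetical helper, and how the server itself responds to a conflicting request is not described here.

```shell
# Reject the combination the Qwen3 TTS parameters mark as incompatible:
# voice (CustomVoice mode) together with voice_id or voice_url (Base mode).
# check_qwen3_voice_args is a hypothetical helper name.
check_qwen3_voice_args() {
  # $1 = voice, $2 = voice_id, $3 = voice_url (pass "" for unused fields)
  if [ -n "${1:-}" ] && { [ -n "${2:-}" ] || [ -n "${3:-}" ]; }; then
    echo "voice cannot be combined with voice_id or voice_url" >&2
    return 1
  fi
  return 0
}

# Usage (not executed here; requires a valid key and network access):
#   check_qwen3_voice_args "Vivian" "" "" && \
#   curl -X POST https://api-web.eigenai.com/api/v1/generate \
#     -H "Authorization: Bearer YOUR_API_KEY" \
#     -F "model=qwen3-tts" \
#     -F "text=Welcome back, it is good to see you." \
#     -F "voice=Vivian" \
#     -F "language=English" \
#     -F "instructions=speak cheerfully" \
#     -F "response_format=mp3" \
#     --output speech.mp3
```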

TTS Example

curl -X POST https://api-web.eigenai.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "model=higgs2p5" \
  -F "text=Hello, this is a test of the text-to-speech system." \
  -F "voice=Linda" \
  --output speech.wav