POST /api/v1/chat/completions
Content-Type: application/json
Parameter support varies by model. Check the Model Library for model-specific compatibility.
Authentication
Send your API key in the Authorization header as a Bearer token.
Authorization: Bearer YOUR_API_KEY
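As a minimal sketch, the header above could be built in Python as follows. The helper name `auth_headers` and the environment variable `EIGENAI_API_KEY` are illustrative assumptions, not names defined by the API.

```python
import os


def auth_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token."""
    # EIGENAI_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("EIGENAI_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly also works.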
Parameters
Common
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Required | The model ID used to generate the response, like gpt-oss-120b. Find supported models in the Model Library. |
| messages | array | Required | A list of messages comprising the conversation so far. Depending on the model you use, different message types (modalities) are supported, like text, images, and video. |
Conditional
The following parameters are not supported by every model. Check the Model Library for model-specific compatibility.
Generation Controls
Common tuning knobs for output length and randomness (availability varies by model).
| Name | Type | Required | Description |
|---|---|---|---|
| temperature | number | Optional | Sampling temperature, between 0 and 2. Defaults to 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. We generally recommend altering this or top_p, but not both. |
| max_tokens | integer | Optional | The maximum number of tokens that can be generated in the chat completion. Use it to cap the cost of generated text. |
| top_p | number | Optional | Nucleus sampling, an alternative to sampling with temperature: the model considers only the tokens that make up the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% of probability mass are considered. Defaults to 1. We generally recommend altering this or temperature, but not both. |
| top_k | integer | Optional | Top-k sampling keeps only the k most probable next tokens and redistributes the probability mass among them, so k sets the number of candidate tokens at each generation step. Must be between 0 and 100. |
| min_p | number | Optional | Minimum probability threshold for token selection. Only tokens with probability >= min_p are considered. This is an alternative to top_p and top_k sampling. Must be between 0 and 1. |
| repetition_penalty | number | Optional | Applies a penalty to repeated tokens to discourage or encourage repetition. A value of 1.0 means no penalty, allowing free repetition. Values above 1.0 penalize repetition, reducing the likelihood of repeating tokens. Values between 0.0 and 1.0 reward repetition, increasing the chance of repeated tokens. For a good balance, a value of 1.2 is often recommended. Must be between 0 and 100. |
| reasoning_effort | string | Optional | Constrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Defaults to medium. |
| separate_reasoning | boolean | Optional | Emit structured reasoning separately from the final answer. |
| chat_template_kwargs.thinking | boolean | Optional | Set to true to request chain-of-thought output. |
| chat_template_kwargs.enable_thinking | boolean | Optional | Enable or disable chain-of-thought thinking mode. Defaults to true. |
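The documented ranges can be checked client-side before a request is sent. The sketch below is one possible approach, not part of the API: the helper `generation_controls` and the constants are hypothetical, the bounds are copied from the table, and how the server reacts to out-of-range values is not described here. It also assumes that the dotted name `chat_template_kwargs.enable_thinking` denotes a nested object in the JSON body.

```python
REASONING_EFFORTS = {"none", "minimal", "low", "medium", "high", "xhigh"}

# (min, max) bounds copied from the parameter table above.
RANGES = {
    "temperature": (0, 2),
    "top_k": (0, 100),
    "min_p": (0, 1),
    "repetition_penalty": (0, 100),
}


def generation_controls(**params):
    """Validate generation controls client-side and return them as a dict."""
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(
                    f"{name} must be between {low} and {high}, got {value}"
                )
    effort = params.get("reasoning_effort")
    if effort is not None and effort not in REASONING_EFFORTS:
        raise ValueError(f"unsupported reasoning_effort: {effort!r}")
    return dict(params)


controls = generation_controls(temperature=0.2, max_tokens=500, reasoning_effort="low")
# Assumption: the dotted parameter name maps to a nested JSON object.
controls["chat_template_kwargs"] = {"enable_thinking": False}
```

The resulting `controls` dict can be merged into the request body next to `model` and `messages`. Parameters a model does not support may still be rejected or ignored, so check the Model Library first.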
Streaming
Receive partial outputs incrementally.
| Name | Type | Required | Description |
|---|---|---|---|
| stream | boolean | Optional | Set to true to stream output via server-sent events (SSE). Defaults to false. |
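A rough sketch of consuming such a stream in Python follows. This page does not describe the event payload, so the sketch assumes the common OpenAI-style convention: each event is a `data: <json>` line, text fragments sit under `choices[0].delta.content`, and the stream ends with `data: [DONE]`. Verify these assumptions against real output before relying on them. The helper name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str)."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments, and non-data SSE fields.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        # Assumed chunk shape: choices[0].delta.content
        choices = chunk.get("choices") or [{}]
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written sample events in the assumed format, not real API output.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints: Hello
```

With a live response, the same function could be fed the decoded lines of the HTTP body as they arrive.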
Some models accept mixed text and image inputs when a message's content is an array of parts.
| Name | Type | Required | Description |
|---|---|---|---|
| messages[].content | string or array | Optional | For vision requests, content can be an array of parts like { type: "video_url" } and { type: "image_url" }. |
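The two content forms can be sketched with a few small Python helpers. The helper names are hypothetical; the text and image_url part shapes mirror the vision example on this page. The payload shape of a video_url part is not shown here, so it is left out.

```python
def text_part(text):
    """Build a text content part."""
    return {"type": "text", "text": text}


def image_part(url):
    """Build an image_url content part pointing at a URL."""
    return {"type": "image_url", "image_url": {"url": url}}


def user_message(*parts):
    """Build a user message; a single str is kept as plain string content."""
    if len(parts) == 1 and isinstance(parts[0], str):
        return {"role": "user", "content": parts[0]}
    return {"role": "user", "content": list(parts)}


plain = user_message("Explain quantum computing in simple terms.")
vision = user_message(
    text_part("Describe the image."),
    image_part("https://example.com/image.png"),
)
```

Here `plain["content"]` is a string and `vision["content"]` is a list of two parts; either message can be placed in the `messages` array.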
Examples
Basic request
Send a list of messages and receive a single response.
# Select a model in the Model Library: https://api-web.eigenai.com/model-library
curl -X POST https://api-web.eigenai.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 2000,
    "stream": false
  }'
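The same request can be sketched in Python using only the standard library. The helper names `build_request` and `send` are illustrative, not part of any SDK. The response schema is not described in this section, so `send` returns the parsed JSON as-is.

```python
import json
import urllib.request

API_URL = "https://api-web.eigenai.com/api/v1/chat/completions"


def build_request(api_key, model, messages, **options):
    """Build (but do not send) a chat completions request."""
    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_request(
    "YOUR_API_KEY",
    "YOUR_MODEL",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    temperature=0.7,
    max_tokens=2000,
    stream=False,
)
# result = send(request)  # uncomment after substituting a real key and model ID
```

Separating request construction from sending makes the payload easy to inspect or log before any network call is made.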
Vision input (text + image)
Send multi-part message content with text and image URLs.
# Select a model in the Model Library: https://api-web.eigenai.com/model-library
curl -X POST https://api-web.eigenai.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe the image."},
          {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}}
        ]
      }
    ],
    "max_tokens": 500,
    "stream": false
  }'