POST /api/v1/chat/completions
Content-Type: application/json
Parameter support can differ depending on the model used to generate the response. Check the Model Library for model-specific compatibility.

Authentication

Send your API key in the Authorization header as a Bearer token.
Authorization: Bearer YOUR_API_KEY
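As a sketch, here is how the same authenticated request could be built with Python's standard library. The endpoint URL is the one shown in the examples below; YOUR_API_KEY and YOUR_MODEL are placeholders.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; use your real key

payload = {
    "model": "YOUR_MODEL",  # placeholder; pick one from the Model Library
    "messages": [{"role": "user", "content": "Hello"}],
}

# Build the request with the Bearer token in the Authorization header.
req = urllib.request.Request(
    "https://api-web.eigenai.com/api/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the request.
```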

Parameters

Common

model (string, required): The model ID used to generate the response, such as gpt-oss-120b. Find supported models in the Model Library.

messages (array, required): A list of messages comprising the conversation so far. Depending on the model you use, different message types (modalities) are supported, such as text, images, and video.

Conditional

The following parameters are not supported by every model. Check the Model Library for model-specific compatibility.

Generation Controls

Common tuning knobs for output length and randomness (availability varies by model).
temperature (number, optional): What sampling temperature to use, between 0 and 2 (defaults to 1). Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. We generally recommend altering this or top_p, but not both.

max_tokens (integer, optional): The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via the API.

top_p (number, optional): An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens comprising the top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. Defaults to 1. We generally recommend altering this or temperature, but not both.

top_k (integer, optional): Top-k sampling filters to the k most probable next tokens and redistributes the probability mass among only those k tokens. The value of k controls the number of candidates for the next token at each generation step. Must be between 0 and 100.

min_p (number, optional): Minimum probability threshold for token selection. Only tokens with probability >= min_p are considered. An alternative to top_p and top_k sampling. Must be between 0 and 1.

repetition_penalty (number, optional): Applies a penalty to repeated tokens to discourage or encourage repetition. A value of 1.0 means no penalty, allowing free repetition. Values above 1.0 penalize repetition, reducing the likelihood of repeated tokens; values between 0.0 and 1.0 reward repetition, increasing it. A value of 1.2 is often recommended as a good balance. Must be between 0 and 100.

reasoning_effort (string, optional): Constrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning. Defaults to medium.

separate_reasoning (boolean, optional): Emit structured reasoning separately from the final answer.

chat_template_kwargs.thinking (boolean, optional): Set to true to request chain-of-thought output.
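To illustrate how top_k, top_p, and min_p each narrow the candidate set, here is a simplified sketch (not the server's actual implementation) that applies the three filters to a toy next-token distribution:

```python
def filter_candidates(probs, top_k=0, top_p=1.0, min_p=0.0):
    """Return tokens that survive the sampling filters.

    probs: dict mapping token -> probability, assumed to sum to 1.
    Illustrative only; real samplers work on logits and re-normalize.
    """
    # Sort tokens by probability, highest first.
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # top_k: keep only the k most probable tokens.
    if top_k:
        items = items[:top_k]

    # top_p (nucleus): keep the smallest prefix whose mass reaches top_p.
    if top_p < 1.0:
        kept, mass = [], 0.0
        for tok, p in items:
            kept.append((tok, p))
            mass += p
            if mass >= top_p:
                break
        items = kept

    # min_p: drop tokens whose probability falls below the threshold.
    if min_p > 0.0:
        items = [(tok, p) for tok, p in items if p >= min_p]

    return [tok for tok, _ in items]
```

For example, with probabilities {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}, top_p=0.75 keeps only "a" and "b", since their combined mass already reaches 75%.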

Streaming

Receive partial outputs incrementally.
stream (boolean, optional): Set to true to stream output via server-sent events (SSE). Defaults to false.
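With stream=true, the response arrives as SSE "data:" lines. The sketch below parses such lines into JSON events; the chunk schema and the [DONE] end-of-stream sentinel are assumptions based on common SSE streaming conventions, not guarantees of this API.

```python
import json


def iter_sse_data(lines):
    """Yield parsed JSON payloads from 'data: ...' SSE lines.

    lines: an iterable of decoded text lines from the response body.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(data)
```

Each yielded event can then be inspected for the incremental content it carries.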

Vision Input

Some models accept mixed text + image inputs by using an array for content on a message.
messages[].content (string | array, optional): For vision requests, content can be an array of parts such as {"type": "video_url"} and {"type": "image_url"}.
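A small helper sketch for building such a mixed-content message; the part shapes mirror the vision curl example below, and the helper name is hypothetical:

```python
def vision_message(text, image_url):
    """Build a user message mixing a text part and an image_url part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
```

The returned dict can be placed directly into the messages array of a request payload.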

Examples

Basic request

Send a list of messages and receive a single response.
# Select a model in the Model Library: https://api-web.eigenai.com/model-library

curl -X POST https://api-web.eigenai.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 2000,
    "stream": false
  }'

Vision input (text + image)

Send multi-part message content with text and image URLs.
# Select a model in the Model Library: https://api-web.eigenai.com/model-library

curl -X POST https://api-web.eigenai.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe the image."},
          {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}}
        ]
      }
    ],
    "max_tokens": 500,
    "stream": false
  }'