The Fine-tuning page lets you train a base model on your own dataset and deploy the resulting checkpoints as inference endpoints. Two training modes are available, SFT (supervised fine-tuning) and Image Editing; a third, Agent RL, is coming soon.
The jobs list
The Fine-tuning page shows a table of all your training jobs.
| Column | Description |
|---|---|
| Fine-tuning jobs | Job name and ID. Click the copy icon to copy the job ID. |
| Status | Current state: Queued, Running, Completed, or Failed. |
| Base model | The model used as the starting point for training. |
| Dataset | The training dataset filename and ID. |
| Create time | When the job was submitted. |
| Actions | Additional controls when available (e.g., cancel a running job). |
Use the Search box to filter by job name, ID, dataset, or creator. Use the Training types and Status dropdowns to narrow the list further.
Job details
Click any job in the list to open its detail page.
Configuration
Shows the full configuration used for the job:
- Status, Training mode, Base model, Training dataset, Evaluation dataset
- Batch size, Learning rate, Epochs, Queue position
- Created / Started / Completed timestamps
Training metrics
The charts update in real time as training progresses.
| Chart | Description |
|---|---|
| Loss | Training loss over steps. Includes min/max values, data point count, and EMA smoothing control. |
| Gradient Norm | Gradient norm over steps. |
| Learning Rate | Learning rate schedule over steps. |
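
To illustrate what the EMA smoothing control on the Loss chart does, here is a minimal sketch of exponential moving average smoothing. The exact formula and default weight used by the dashboard are not documented here, so the function name `ema_smooth` and the `alpha` parameter are assumptions made for illustration only.

```python
def ema_smooth(values, alpha=0.6):
    """Smooth a sequence with an exponential moving average.

    alpha is the weight kept from the previous smoothed value
    (hypothetical parameter; the dashboard's actual setting may differ).
    alpha=0 returns the raw values; values closer to 1 smooth more.
    """
    smoothed = []
    prev = None
    for v in values:
        # The first point has no history, so it is passed through unchanged.
        prev = v if prev is None else alpha * prev + (1 - alpha) * v
        smoothed.append(prev)
    return smoothed


if __name__ == "__main__":
    raw_loss = [1.0, 0.0, 1.0, 0.0]  # made-up values, not real training data
    print(ema_smooth(raw_loss, alpha=0.5))
```

A higher weight makes the overall trend easier to read but hides short spikes, so it is worth comparing the smoothed curve against the raw one when diagnosing instability.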
The latest values are shown as a summary line above each chart (e.g., loss 0.2466 • grad 1.591 • lr 2.00e-6 @ step 124). Source data comes from training.log.
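
If you copy a summary line into notes or a script, it can be split into numeric fields. The sketch below assumes the line follows the example format above exactly (fields separated by `•`, step introduced by `@ step`); `parse_summary` is a hypothetical helper, not part of any EigenAI tooling.

```python
import re

# Pattern modeled on the example: "loss 0.2466 • grad 1.591 • lr 2.00e-6 @ step 124"
SUMMARY_RE = re.compile(
    r"loss\s+(?P<loss>\S+)\s+•\s+grad\s+(?P<grad>\S+)\s+•\s+"
    r"lr\s+(?P<lr>\S+)\s+@\s+step\s+(?P<step>\d+)"
)


def parse_summary(line):
    """Return the summary fields as numbers, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "loss": float(m["loss"]),
        "grad": float(m["grad"]),
        "lr": float(m["lr"]),
        "step": int(m["step"]),
    }


if __name__ == "__main__":
    print(parse_summary("loss 0.2466 • grad 1.591 • lr 2.00e-6 @ step 124"))
```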
Model checkpoints
After training completes, EigenAI saves one checkpoint per epoch in HuggingFace format.
| Field | Description |
|---|---|
| Epoch N | Checkpoint label (e.g., Epoch 1 through Epoch 5). |
| HuggingFace | Format of the saved weights. |
| Files / Size / Step | Number of files, total size, and the training step at which the checkpoint was saved. |
Each checkpoint has two buttons:
- Details — View the full list of files in the checkpoint.
- Deploy — Create an inference deployment directly from this checkpoint to test the training results. See Deployments for details.
Additional files
| File | Description |
|---|---|
| checkpoint_status.json | Metadata about checkpoint state. |
| training.log | Full training log file. |
Click Download next to either file to save it locally.
Logs
The Logs section displays the last 200 lines of real-time training output. Click Refresh to fetch the latest lines.
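
To get the same last-200-lines view offline, you can apply a bounded buffer to a downloaded copy of training.log. This is a minimal sketch: `last_lines` is a hypothetical helper, and the local path `training.log` assumes you saved the file under that name in the current directory.

```python
import os
from collections import deque


def last_lines(lines, n=200):
    """Return the final n items of any iterable of lines.

    deque(maxlen=n) discards older lines as new ones arrive,
    so memory use stays bounded even for a very large log.
    """
    return list(deque(lines, maxlen=n))


if __name__ == "__main__":
    path = "training.log"  # assumed local filename of the downloaded log
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            for line in last_lines(f, n=200):
                print(line, end="")
```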