The Fine-tuning page lets you train a base model on your own dataset and deploy the resulting checkpoints as inference endpoints. The page lists three training modes: SFT (supervised fine-tuning), Image Editing, and Agent RL (coming soon).

The jobs list

The Fine-tuning page shows a table of all your training jobs.

| Column | Description |
| --- | --- |
| Fine-tuning jobs | Job name and ID. Click the copy icon to copy the job ID. |
| Status | Current state: Queued, Running, Completed, or Failed. |
| Base model | The model used as the starting point for training. |
| Dataset | The training dataset filename and ID. |
| Create time | When the job was submitted. |
| Actions | Additional controls when available (e.g., cancel a running job). |

Use the Search box to filter by job name, ID, dataset, or creator. Use the Training types and Status dropdowns to narrow the list further.

Job details

Click any job in the list to open its detail page.

Configuration

Shows the full configuration used for the job:
  • Status, Training mode, Base model, Training dataset, Evaluation dataset
  • Batch size, Learning rate, Epochs, Queue position
  • Created / Started / Completed timestamps

Training metrics

The charts update in real time as training progresses.

| Chart | Description |
| --- | --- |
| Loss | Training loss over steps. Includes min/max values, data point count, and EMA smoothing control. |
| Gradient Norm | Gradient norm over steps. |
| Learning Rate | Learning rate schedule over steps. |

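The Loss chart's EMA smoothing control is not described in more detail on this page. As a rough illustration of what exponential-moving-average smoothing does to a loss curve, here is a minimal Python sketch. The function name `ema_smooth` and the `alpha` parameter are hypothetical and may not match how EigenAI computes or parameterizes the smoothed curve.

```python
def ema_smooth(values, alpha=0.6):
    """Return an exponentially smoothed copy of `values`.

    `alpha` (hypothetical parameter) is the weight given to the previous
    smoothed value; 0 means no smoothing, values near 1 smooth heavily.
    """
    smoothed = []
    prev = None
    for v in values:
        # The first point has no history, so it is used as-is.
        prev = v if prev is None else alpha * prev + (1 - alpha) * v
        smoothed.append(prev)
    return smoothed


losses = [1.0, 0.5, 0.5]
print(ema_smooth(losses, alpha=0.5))  # [1.0, 0.75, 0.625]
```

A higher smoothing weight makes the trend easier to read but delays the curve's reaction to sudden changes in the raw loss.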
The latest values are shown as a summary line above each chart (e.g., loss 0.2466 • grad 1.591 • lr 2.00e-6 @ step 124). Source data comes from training.log.
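If you want to track these values outside the page, a summary line can be turned into numbers with a small parser. The sketch below assumes the exact layout of the example above (`loss … • grad … • lr … @ step …`); the helper name `parse_summary` is hypothetical, and the actual line format in the UI or in training.log may differ.

```python
import re

# Pattern modeled on the example summary line; an assumption, not a documented format.
SUMMARY_RE = re.compile(
    r"loss\s+(?P<loss>\S+)\s+•\s+grad\s+(?P<grad>\S+)"
    r"\s+•\s+lr\s+(?P<lr>\S+)\s+@\s+step\s+(?P<step>\d+)"
)


def parse_summary(line):
    """Parse a summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "loss": float(m["loss"]),
        "grad": float(m["grad"]),
        "lr": float(m["lr"]),
        "step": int(m["step"]),
    }


print(parse_summary("loss 0.2466 • grad 1.591 • lr 2.00e-6 @ step 124"))
# {'loss': 0.2466, 'grad': 1.591, 'lr': 2e-06, 'step': 124}
```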

Model checkpoints

After training completes, EigenAI saves one checkpoint per epoch in HuggingFace format.

| Field | Description |
| --- | --- |
| Epoch N | Checkpoint label (e.g., Epoch 1 through Epoch 5). |
| HuggingFace | Format of the saved weights. |
| Files / Size / Step | Number of files, total size, and the training step at which the checkpoint was saved. |

Each checkpoint has two buttons:
  • Details — View the full list of files in the checkpoint.
  • Deploy — Create an inference deployment directly from this checkpoint to test the training results. See Deployments for details.

Additional files


| File | Description |
| --- | --- |
| checkpoint_status.json | Metadata about checkpoint state. |
| training.log | Full training log file. |

Click Download next to either file to save it locally.

Logs

The Logs section displays the last 200 lines of real-time training output. Click Refresh to fetch the latest lines.
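To get a similar view offline, you can read the last 200 lines of a downloaded training.log yourself. This is a minimal sketch, assuming the log is a UTF-8 text file saved locally; the helper name `tail_lines` and the file path are hypothetical, and this is not how the page itself fetches logs.

```python
from collections import deque


def tail_lines(path, n=200):
    """Return the last `n` lines of a text file without loading it all into memory."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        # A bounded deque keeps only the most recent `n` lines as the file is read.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


# Example usage (assumes training.log is in the current directory):
# for line in tail_lines("training.log"):
#     print(line)
```

Because the deque discards older lines as it goes, memory use stays proportional to `n` even for very long logs.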