Batch Settings

When creating a new batch, you can configure several settings. Most of them concern the choice of model, files, and prompt; this page focuses on the queue and dispatcher settings.

Model settings are hyperparameters passed directly to the LLM client and control how the model generates output. If you are unfamiliar with LLM hyperparameters, refer to your provider's documentation.
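As a rough illustration, model settings might look like the snippet below. The keys shown (temperature, top_p, max_tokens) are common provider hyperparameters used only as an example; the exact names and accepted values depend on your provider, so check its documentation.

```python
# A minimal sketch of model settings forwarded unchanged to the LLM client.
# Key names follow common provider conventions and are illustrative only.
model_settings = {
    "temperature": 0.2,   # lower values make generations more deterministic
    "top_p": 0.9,         # nucleus-sampling cutoff
    "max_tokens": 1024,   # upper bound on tokens generated per response
}
```

The remaining settings, which control the queue and dispatcher, are listed in the table below.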

| Setting | Explanation |
| --- | --- |
| `max_task_per_minute` | The maximum number of API requests sent per minute. Set this to match your provider's RPM limit. |
| `max_parallel_tasks` | The number of API requests that can run at the same time. In most cases, keep this at 1. |
| `max_retries` | The number of times the entire batch is retried after a failure. Once this limit is reached, the batch is marked as failed. |
| `retries_per_failed_task` | The number of times a single task is retried before it is marked as failed. |
| `queue_batch` | If set to `true`, the batch waits until all other running batches on the same endpoint finish before starting. If set to `false`, the batch starts immediately. Running batches in parallel increases the total RPM on the endpoint, which is likely to cause rate limit errors on commercial providers. |
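
Putting these together, a set of queue and dispatcher settings might look like the sketch below. Only the setting names come from the table above; wrapping them in a plain Python dictionary and the example values are assumptions made for illustration.

```python
# A minimal sketch of the queue and dispatcher settings described above.
# The values are examples; tune them to your provider's limits.
dispatcher_settings = {
    "max_task_per_minute": 60,      # stay at or below the provider's RPM limit
    "max_parallel_tasks": 1,        # concurrent requests; 1 is the safe default
    "max_retries": 2,               # full-batch retries before the batch is marked failed
    "retries_per_failed_task": 3,   # per-task retries before the task is marked failed
    "queue_batch": True,            # wait for other batches on the endpoint to finish
}
```

With `max_task_per_minute` at 60 and `max_parallel_tasks` at 1, the dispatcher issues at most one request at a time and at most 60 per minute, keeping a single running batch within a 60 RPM provider limit.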