Error Handling

ScaleLLM returns standard HTTP status codes along with detailed error messages in the response body.

Error Response Format

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid model specified: gpt-4",
    "code": "model_not_found"
  }
}

Error Codes

Status  Type                    Description
400     invalid_request_error   Malformed request
401     authentication_error    Invalid API key
403     permission_error        Key lacks permission
404     not_found_error         Model or endpoint not found
429     rate_limit_error        Rate limit exceeded
500     server_error            Internal server error
503     service_unavailable     Provider temporarily unavailable
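
If you call the REST API directly, you can branch on the status code and parse the error body shown above. A minimal sketch using Python's requests library; the key and model name are placeholders:

import requests

resp = requests.post(
    "https://api.scalellm.dev/v1/chat/completions",
    headers={"Authorization": "Bearer sk_your_key"},
    json={
        "model": "claude-sonnet-4.5",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)

if resp.status_code != 200:
    # Error responses carry a JSON body in the format documented above
    err = resp.json()["error"]
    print(f"{resp.status_code} {err['type']}: {err['message']}")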

Common Errors

Invalid API Key (401)

{
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key provided"
  }
}
Fix: Check that your API key starts with sk_ and is still active.

Model Not Found (404)

{
  "error": {
    "type": "not_found_error",
    "message": "Model 'gpt-4' not found",
    "code": "model_not_found"
  }
}
Fix: Use a valid model name like claude-sonnet-4.5.
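
Because the API works with the OpenAI SDK, you can discover valid model names by listing models. This sketch assumes ScaleLLM supports the standard OpenAI-compatible /v1/models endpoint:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.scalellm.dev/v1",
    api_key="sk_your_key"
)

# Print every model id the key can access
for model in client.models.list():
    print(model.id)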

Rate Limit (429)

{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded"
  }
}
Fix: Wait and retry with exponential backoff.
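
One way to implement backoff with the OpenAI SDK is sketched below; the create_with_backoff helper and its retry parameters are illustrative, not part of the SDK:

import time
import random

from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.scalellm.dev/v1",
    api_key="sk_your_key"
)

def create_with_backoff(max_retries=5, **kwargs):
    # Retry on 429s, doubling the wait each attempt and adding jitter
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt + random.random())

response = create_with_backoff(
    model="claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Hello"}]
)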

Handling Errors

from openai import OpenAI, APIError, RateLimitError, AuthenticationError

client = OpenAI(
    base_url="https://api.scalellm.dev/v1",
    api_key="sk_your_key"
)

try:
    response = client.chat.completions.create(
        model="claude-sonnet-4.5",
        messages=[{"role": "user", "content": "Hello"}]
    )
except AuthenticationError:
    # 401: key is missing, malformed, or revoked
    print("Invalid API key")
except RateLimitError:
    # 429: back off before retrying
    print("Rate limited - retry later")
except APIError as e:
    # Catch-all for other API errors; must come after the specific subclasses
    print(f"API error: {e.message}")

Using Fallbacks

Configure fallback models to handle provider errors automatically:
curl https://api.scalellm.dev/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Hello"}],
    "fallback_models": ["gemini-3-pro-preview"]
  }'
If Claude is unavailable, ScaleLLM automatically tries Gemini.
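
The fallback_models field is ScaleLLM-specific, so when using the OpenAI SDK it has to be passed through extra_body rather than as a named argument; a sketch:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.scalellm.dev/v1",
    api_key="sk_your_key"
)

response = client.chat.completions.create(
    model="claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Hello"}],
    # Non-standard fields go in extra_body so the SDK forwards them verbatim
    extra_body={"fallback_models": ["gemini-3-pro-preview"]}
)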