
Messages

Create messages using the Anthropic-compatible API format.

Endpoint

POST https://api.scalellm.dev/v1/messages

Examples

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.scalellm.dev",
    api_key="sk_your_key"
)

message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)

print(message.content[0].text)

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID (e.g., claude-sonnet-4.5) |
| messages | array | Yes | Array of message objects |
| max_tokens | integer | Yes | Maximum tokens to generate |
| system | string | No | System prompt |
| temperature | number | No | Sampling temperature (0-1) |
| top_p | number | No | Nucleus sampling parameter |
| top_k | integer | No | Top-k sampling parameter |
| stream | boolean | No | Enable streaming responses |
| stop_sequences | array | No | Custom stop sequences |
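
The optional sampling parameters map directly onto keyword arguments in the Python SDK. A minimal sketch with illustrative values (not tuned recommendations):

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.scalellm.dev",
    api_key="sk_your_key"
)

# Required fields plus a few optional sampling controls; values are illustrative.
message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    temperature=0.7,                  # 0-1; lower is more deterministic
    top_p=0.9,                        # nucleus sampling cutoff
    stop_sequences=["END"],           # stop generating if this string appears
    messages=[
        {"role": "user", "content": "Summarize nucleus sampling in one sentence."}
    ]
)

print(message.content[0].text)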

Message Object

| Field | Type | Description |
| --- | --- | --- |
| role | string | Either user or assistant |
| content | string or array | Text content or content blocks |
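
For plain text, the two content forms are interchangeable; the array form uses explicit content blocks in the standard Anthropic shape:

# Shorthand: content as a plain string
short_form = {"role": "user", "content": "Hello, Claude!"}

# Equivalent: content as an array of content blocks
block_form = {"role": "user", "content": [{"type": "text", "text": "Hello, Claude!"}]}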

Response

{
  "id": "msg_01XYZ...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "claude-sonnet-4.5",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 15
  }
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique message ID |
| type | string | Always message |
| role | string | Always assistant |
| content | array | Array of content blocks |
| model | string | Model used |
| stop_reason | string | Why generation stopped (end_turn, max_tokens, stop_sequence) |
| usage | object | Token usage statistics |
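
These fields are exposed as attributes on the SDK's response object. A quick sketch, continuing the first example above (message is the object returned by client.messages.create):

print(message.id)                  # e.g. "msg_01XYZ..."
print(message.stop_reason)         # "end_turn", "max_tokens", or "stop_sequence"
print(message.usage.input_tokens)  # tokens in the request
print(message.usage.output_tokens) # tokens generated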

System Prompts

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.scalellm.dev",
    api_key="sk_your_key"
)

message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    system="You are a helpful coding assistant.",
    messages=[
        {"role": "user", "content": "Write a Python hello world."}
    ]
)

print(message.content[0].text)

Multi-turn Conversations

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.scalellm.dev",
    api_key="sk_your_key"
)

message = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is 2+2?"},
        {"role": "assistant", "content": "2+2 equals 4."},
        {"role": "user", "content": "And what is that times 10?"}
    ]
)

print(message.content[0].text)
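
To continue a conversation, append the assistant's reply to the message list before the next request. A sketch of that pattern:

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.scalellm.dev",
    api_key="sk_your_key"
)

history = [{"role": "user", "content": "What is 2+2?"}]

first = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=history
)

# Carry the assistant's reply forward, then add the next user turn.
history.append({"role": "assistant", "content": first.content[0].text})
history.append({"role": "user", "content": "And what is that times 10?"})

second = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=history
)

print(second.content[0].text)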

Streaming

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.scalellm.dev",
    api_key="sk_your_key"
)

with client.messages.stream(
    model="claude-sonnet-4.5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Count to 10."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Streaming responses are returned as Server-Sent Events:

event: message_start
data: {"type":"message_start","message":{"id":"msg_01XYZ...","type":"message","role":"assistant",...}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"1"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":", 2"}}

...

event: message_stop
data: {"type":"message_stop"}

Available Models

| Model | Description |
| --- | --- |
| claude-opus-4.5 | Most capable, best for complex tasks |
| claude-sonnet-4.5 | Balanced performance and speed |
| claude-haiku-4.5 | Fastest, best for simple tasks |

Headers

| Header | Required | Description |
| --- | --- | --- |
| Authorization | Yes | Bearer sk_your_key |
| Content-Type | Yes | application/json |
| anthropic-version | Recommended | API version (e.g., 2023-06-01) |
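
For clients that do not use the SDK, the same request can be sent over plain HTTP with these headers. A minimal sketch using the requests library; the payload follows the Request Body table:

import requests

response = requests.post(
    "https://api.scalellm.dev/v1/messages",
    headers={
        "Authorization": "Bearer sk_your_key",
        "Content-Type": "application/json",
        "anthropic-version": "2023-06-01",
    },
    json={
        "model": "claude-sonnet-4.5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello, Claude!"}],
    },
)

data = response.json()
print(data["content"][0]["text"])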