Chat

Generate text using Google’s native Gemini API format.

Endpoint

POST https://api.scalellm.dev/v1beta/models/{model}:generateContent

Examples

```python
import google.generativeai as genai

# Point the official Gemini SDK at the ScaleLLM REST endpoint.
genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel("gemini-3-pro-preview")

response = model.generate_content("Hello, how are you?")

print(response.text)
```

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| `contents` | array | Yes | Array of content objects |
| `generationConfig` | object | No | Generation configuration |
| `systemInstruction` | object | No | System instruction for the model |
| `safetySettings` | array | No | Safety filter settings |
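As a sketch, a complete request body combining these fields might look like the following. The field names come from the tables in this section; the specific values (model prompt, temperature, stop sequence) are illustrative only.

```python
import json

# Illustrative request body for POST /v1beta/models/{model}:generateContent.
# Field names follow the parameter tables above; values are examples only.
body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Hello, how are you?"}]}
    ],
    "systemInstruction": {"parts": [{"text": "You are a helpful assistant."}]},
    "generationConfig": {
        "temperature": 0.7,
        "maxOutputTokens": 256,
        "stopSequences": ["END"],
    },
}

payload = json.dumps(body)
print(payload)
```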

Content Object

| Field | Type | Description |
|---|---|---|
| `role` | string | `user` or `model` (optional for single-turn) |
| `parts` | array | Array of part objects (text, image, etc.) |

Part Object

| Field | Type | Description |
|---|---|---|
| `text` | string | Text content |
| `inlineData` | object | Base64-encoded media data |
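A minimal sketch of building an `inlineData` part for binary media, assuming the standard Gemini part schema (`mimeType` plus base64-encoded `data`). The bytes here are a placeholder, not a real image.

```python
import base64

# Placeholder bytes standing in for real JPEG data.
image_bytes = b"\xff\xd8\xff\xe0fake-jpeg-bytes"

# An inlineData part carries base64-encoded media plus its MIME type.
part = {
    "inlineData": {
        "mimeType": "image/jpeg",
        "data": base64.b64encode(image_bytes).decode("ascii"),
    }
}

# Decoding the data field recovers the original bytes.
assert base64.b64decode(part["inlineData"]["data"]) == image_bytes
```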

Generation Config

| Field | Type | Description |
|---|---|---|
| `temperature` | number | Sampling temperature (0–2) |
| `topP` | number | Nucleus sampling parameter |
| `topK` | integer | Top-k sampling parameter |
| `maxOutputTokens` | integer | Maximum tokens to generate |
| `stopSequences` | array | Stop sequences |

Response

```json
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Hello! I'm doing well, thank you for asking. How can I help you today?"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 6,
    "candidatesTokenCount": 18,
    "totalTokenCount": 24
  }
}
```
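When working with the raw REST response rather than the SDK, the generated text and token usage can be pulled out by indexing into this structure. A short sketch using a hardcoded response of the same shape:

```python
import json

# A response with the same shape as the example above, hardcoded for illustration.
raw = """{
  "candidates": [
    {"content": {"parts": [{"text": "Hello!"}], "role": "model"},
     "finishReason": "STOP", "index": 0}
  ],
  "usageMetadata": {"promptTokenCount": 6, "candidatesTokenCount": 18,
                    "totalTokenCount": 24}
}"""

resp = json.loads(raw)

# The generated text lives in the first candidate's first part.
text = resp["candidates"][0]["content"]["parts"][0]["text"]
total = resp["usageMetadata"]["totalTokenCount"]
print(text, total)
```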

System Instructions

```python
import google.generativeai as genai

genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel(
    "gemini-3-pro-preview",
    system_instruction="You are a helpful coding assistant."
)

response = model.generate_content("Write a Python hello world.")

print(response.text)
```

Multi-turn Conversations

```python
import google.generativeai as genai

genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel("gemini-3-pro-preview")
chat = model.start_chat()

response = chat.send_message("What is 2+2?")
print(response.text)

response = chat.send_message("And what is that times 10?")
print(response.text)
```
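On the wire, the chat history above is sent as the `contents` array with alternating `user` and `model` roles. A minimal sketch of that serialized form (the model's reply text is illustrative):

```python
# Sketch of how a two-turn chat serializes into the contents array.
contents = [
    {"role": "user", "parts": [{"text": "What is 2+2?"}]},
    {"role": "model", "parts": [{"text": "2 + 2 is 4."}]},
    {"role": "user", "parts": [{"text": "And what is that times 10?"}]},
]

# Roles alternate, starting and ending with "user".
roles = [c["role"] for c in contents]
print(roles)
```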

Vision (Image Input)

```python
import google.generativeai as genai
import PIL.Image

genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel("gemini-3-pro-preview")

image = PIL.Image.open("image.jpg")
response = model.generate_content(["What is in this image?", image])

print(response.text)
```

Available Models

| Model | Description |
|---|---|
| `gemini-3-pro-preview` | Latest Gemini with 1M-token context |
| `gemini-3-flash` | Ultra-fast and cost-effective |

Headers

| Header | Required | Description |
|---|---|---|
| `x-goog-api-key` | Yes | Your ScaleLLM API key (`sk_your_key`) |
| `Content-Type` | Yes | `application/json` |
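Putting the endpoint, headers, and request body together, a raw REST call can be sketched without the SDK using the standard library. This only constructs the request; `urllib.request.urlopen(req)` would actually send it.

```python
import json
import urllib.request

API_KEY = "sk_your_key"
MODEL = "gemini-3-flash"
url = f"https://api.scalellm.dev/v1beta/models/{MODEL}:generateContent"

body = json.dumps({"contents": [{"parts": [{"text": "Hello"}]}]}).encode("utf-8")

# Build the POST request with the headers from the table above.
req = urllib.request.Request(
    url,
    data=body,
    headers={"x-goog-api-key": API_KEY, "Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would send the call; here we only build it.
print(req.full_url)
```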