Chat

Generate text using Google’s native Gemini API format.

Endpoint

POST https://api.scalellm.dev/v1beta/models/{model}:generateContent

Examples

```python
import google.generativeai as genai

# Point the official Gemini SDK at the ScaleLLM REST endpoint.
genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel("gemini-3-pro-preview")

response = model.generate_content("Hello, how are you?")

print(response.text)
```

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| `contents` | array | Yes | Array of content objects |
| `generationConfig` | object | No | Generation configuration |
| `systemInstruction` | object | No | System instruction for the model |
| `safetySettings` | array | No | Safety filter settings |
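As a sketch, a complete request body combining these fields might look like the following. The field names come from the tables in this section; the specific values (model prompt, temperature, stop sequence) are illustrative only.

```python
import json

# Illustrative request body for POST /v1beta/models/{model}:generateContent.
# Field names follow the parameter tables above; values are examples only.
body = {
    "contents": [
        {"role": "user", "parts": [{"text": "Hello, how are you?"}]}
    ],
    "systemInstruction": {"parts": [{"text": "You are a helpful assistant."}]},
    "generationConfig": {
        "temperature": 0.7,
        "maxOutputTokens": 256,
        "stopSequences": ["END"],
    },
}

payload = json.dumps(body)
print(payload)
```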

Content Object

| Field | Type | Description |
|---|---|---|
| `role` | string | `user` or `model` (optional for single-turn) |
| `parts` | array | Array of part objects (text, image, etc.) |

Part Object

| Field | Type | Description |
|---|---|---|
| `text` | string | Text content |
| `inlineData` | object | Base64-encoded media data |
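A minimal sketch of building an `inlineData` part for binary media, assuming the standard Gemini part schema (`mimeType` plus base64-encoded `data`). The bytes here are a placeholder, not a real image.

```python
import base64

# Placeholder bytes standing in for real JPEG data.
image_bytes = b"\xff\xd8\xff\xe0fake-jpeg-bytes"

# An inlineData part carries base64-encoded media plus its MIME type.
part = {
    "inlineData": {
        "mimeType": "image/jpeg",
        "data": base64.b64encode(image_bytes).decode("ascii"),
    }
}

# Decoding the data field recovers the original bytes.
assert base64.b64decode(part["inlineData"]["data"]) == image_bytes
```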

Generation Config

| Field | Type | Description |
|---|---|---|
| `temperature` | number | Sampling temperature (0–2) |
| `topP` | number | Nucleus sampling parameter |
| `topK` | integer | Top-k sampling parameter |
| `maxOutputTokens` | integer | Maximum tokens to generate |
| `stopSequences` | array | Stop sequences |

Response

```json
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Hello! I'm doing well, thank you for asking. How can I help you today?"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 6,
    "candidatesTokenCount": 18,
    "totalTokenCount": 24
  }
}
```
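When working with the raw REST response rather than the SDK, the generated text and token usage can be pulled out by indexing into this structure. A short sketch using a hardcoded response of the same shape:

```python
import json

# A response with the same shape as the example above, hardcoded for illustration.
raw = """{
  "candidates": [
    {"content": {"parts": [{"text": "Hello!"}], "role": "model"},
     "finishReason": "STOP", "index": 0}
  ],
  "usageMetadata": {"promptTokenCount": 6, "candidatesTokenCount": 18,
                    "totalTokenCount": 24}
}"""

resp = json.loads(raw)

# The generated text lives in the first candidate's first part.
text = resp["candidates"][0]["content"]["parts"][0]["text"]
total = resp["usageMetadata"]["totalTokenCount"]
print(text, total)
```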

System Instructions

```python
import google.generativeai as genai

genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel(
    "gemini-3-pro-preview",
    system_instruction="You are a helpful coding assistant."
)

response = model.generate_content("Write a Python hello world.")

print(response.text)
```

Multi-turn Conversations

```python
import google.generativeai as genai

genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel("gemini-3-pro-preview")
chat = model.start_chat()

response = chat.send_message("What is 2+2?")
print(response.text)

response = chat.send_message("And what is that times 10?")
print(response.text)
```
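On the wire, the chat history above is sent as the `contents` array with alternating `user` and `model` roles. A minimal sketch of that serialized form (the model's reply text is illustrative):

```python
# Sketch of how a two-turn chat serializes into the contents array.
contents = [
    {"role": "user", "parts": [{"text": "What is 2+2?"}]},
    {"role": "model", "parts": [{"text": "2 + 2 is 4."}]},
    {"role": "user", "parts": [{"text": "And what is that times 10?"}]},
]

# Roles alternate, starting and ending with "user".
roles = [c["role"] for c in contents]
print(roles)
```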

Vision (Image Input)

```python
import google.generativeai as genai
import PIL.Image

genai.configure(
    api_key="sk_your_key",
    transport="rest",
    client_options={"api_endpoint": "api.scalellm.dev"}
)

model = genai.GenerativeModel("gemini-3-pro-preview")

image = PIL.Image.open("image.jpg")
response = model.generate_content(["What is in this image?", image])

print(response.text)
```

Available Models

| Model | Description |
|---|---|
| `gemini-3-pro-preview` | Latest Gemini with 1M-token context |
| `gemini-3-flash` | Ultra-fast and cost-effective |

Headers

| Header | Required | Description |
|---|---|---|
| `x-goog-api-key` | Yes | Your ScaleLLM API key (`sk_your_key`) |
| `Content-Type` | Yes | `application/json` |
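Putting the endpoint, headers, and request body together, a raw REST call can be sketched without the SDK using the standard library. This only constructs the request; `urllib.request.urlopen(req)` would actually send it.

```python
import json
import urllib.request

API_KEY = "sk_your_key"
MODEL = "gemini-3-flash"
url = f"https://api.scalellm.dev/v1beta/models/{MODEL}:generateContent"

body = json.dumps({"contents": [{"parts": [{"text": "Hello"}]}]}).encode("utf-8")

# Build the POST request with the headers from the table above.
req = urllib.request.Request(
    url,
    data=body,
    headers={"x-goog-api-key": API_KEY, "Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would send the call; here we only build it.
print(req.full_url)
```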