Chat Completions

Create chat completions using the OpenAI-compatible /v1/chat/completions endpoint.

Endpoint

POST https://api.ultramaxo.tech/v1/chat/completions

Request Parameters

Parameter    Type     Required  Description
model        string   Yes       The model ID to use (e.g. deepseek-v4-flash)
messages     array    Yes       Array of message objects, each with a role and content
stream       boolean  No        If true, the response is streamed via SSE
temperature  number   No        Sampling temperature (0-2, default: 1)
max_tokens   number   No        Maximum number of tokens to generate
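
The parameters above can be assembled into a request body with a small helper. This is a minimal sketch, not part of any official client: build_payload is a hypothetical name, the client-side checks are illustrative, and treating model and messages as the only mandatory fields is an assumption based on the usual OpenAI convention.

```python
from typing import Any, Optional


def build_payload(
    model: str,
    messages: list[dict[str, str]],
    stream: bool = False,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble a chat completions request body (hypothetical helper)."""
    # Assumption: model and messages are the only mandatory fields.
    if not model:
        raise ValueError("model must be a non-empty string")
    if not messages:
        raise ValueError("messages must contain at least one message")
    # The documented sampling range is 0-2.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")

    payload: dict[str, Any] = {"model": model, "messages": messages}
    # Optional fields are only included when set, so server defaults apply otherwise.
    if stream:
        payload["stream"] = True
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


payload = build_payload(
    "deepseek-v4-flash",
    [{"role": "user", "content": "Hello"}],
    temperature=0.7,
    max_tokens=1024,
)
```

Leaving unset options out of the body, rather than sending null, keeps the request minimal and lets the documented defaults (such as temperature 1) take effect.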

Message Roles

Role       Description
system     Sets the behavior and context of the assistant
user       The user's message or question
assistant  Previous assistant responses (for multi-turn conversations)
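
For a multi-turn conversation, the earlier assistant reply is sent back as an assistant message ahead of the next user message. The sketch below shows one way to maintain that history; make_message and continue_conversation are hypothetical helper names, not part of the API.

```python
VALID_ROLES = {"system", "user", "assistant"}


def make_message(role: str, content: str) -> dict[str, str]:
    """Build one message object, rejecting roles not listed above."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {"role": role, "content": content}


def continue_conversation(
    history: list[dict[str, str]],
    assistant_reply: str,
    next_user_message: str,
) -> list[dict[str, str]]:
    """Return a new history with the last reply and the next user turn appended."""
    return history + [
        make_message("assistant", assistant_reply),
        make_message("user", next_user_message),
    ]


history = [
    make_message("system", "You are a helpful assistant."),
    make_message("user", "Hello"),
]
# After the first response arrives, feed it back together with the follow-up question.
history = continue_conversation(
    history,
    "Hello! How can I help you today?",
    "Explain quantum computing in simple terms.",
)
```

The resulting list is what goes into the messages field of the next request.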

Example Request

curl https://api.ultramaxo.tech/v1/chat/completions \
  -H "Authorization: Bearer ux_sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
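
The same request can be sketched in Python using only the standard library. The function names make_request and create_chat_completion are hypothetical, and the 60-second timeout is an arbitrary choice; the URL, headers, and body mirror the curl command above.

```python
import json
import urllib.request

API_URL = "https://api.ultramaxo.tech/v1/chat/completions"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a chat completion."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_chat_completion(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = make_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


payload = {
    "model": "deepseek-v4-flash",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    "temperature": 0.7,
    "max_tokens": 1024,
}
request = make_request("ux_sk_YOUR_KEY", payload)
# To actually send it: result = create_chat_completion("ux_sk_YOUR_KEY", payload)
```

Building the request separately from sending it makes the headers and body easy to inspect before any network call is made.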

Streaming

Set "stream": true in the request body to receive the response as Server-Sent Events (SSE). Each chunk carries a delta, that is, the next fragment of the assistant's reply rather than the full message.

{
  "model": "deepseek-v4-flash",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": true
}
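
A client has to reassemble the streamed deltas into the full reply. The sketch below assumes the stream follows the usual OpenAI chunk conventions, which this page does not spell out: each event is a "data: " line holding a JSON chunk, the text sits in choices[].delta.content, and the stream ends with "data: [DONE]". The sample lines are hand-written for illustration, not captured output, and iter_deltas is a hypothetical name.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by an SSE stream of chat chunks."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed OpenAI-style end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Illustrative stream; the exact chunk fields are an assumption.
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "!"}, "finish_reason": null}]}',
    "",
    "data: [DONE]",
]
reply = "".join(iter_deltas(sample_stream))
```

In a real client, the lines would come from the HTTP response body, decoded as UTF-8 and read line by line, instead of from a list.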

Response Format

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "deepseek-v4-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 12,
    "total_tokens": 22
  }
}
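
Reading the reply out of this structure can be sketched as follows. The summarize helper is a hypothetical name, and the consistency check assumes total_tokens is always the sum of prompt_tokens and completion_tokens, as it is in the sample above (10 + 12 = 22).

```python
import json

# The sample response shown above, as it would arrive in the HTTP body.
response_text = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "deepseek-v4-flash",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 10, "completion_tokens": 12, "total_tokens": 22}
}
"""


def summarize(response: dict) -> dict:
    """Pull out the fields most clients need from a non-streaming response."""
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage["total_tokens"],
        # Assumption: total_tokens == prompt_tokens + completion_tokens.
        "usage_consistent": usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"],
    }


summary = summarize(json.loads(response_text))
```

A finish_reason of "stop" indicates the model ended the reply on its own; checking it before using the content helps catch replies cut short by max_tokens.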