# Chat Completions
Create chat completions using the OpenAI-compatible /v1/chat/completions endpoint.
## Endpoint

```
POST https://api.ultramaxo.tech/v1/chat/completions
```
## Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✓ | The model ID to use (e.g. deepseek-v4-flash) |
| messages | array | ✓ | Array of message objects with role and content |
| stream | boolean | — | If true, responses are streamed via SSE |
| temperature | number | — | Sampling temperature (0-2, default: 1) |
| max_tokens | number | — | Maximum tokens to generate |
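Only model and messages are required; a minimal request body can omit all of the optional parameters:

```json
{
  "model": "deepseek-v4-flash",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```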
## Message Roles
| Role | Description |
|---|---|
| system | Sets the behavior and context of the assistant |
| user | The user's message or question |
| assistant | Previous assistant responses, used for multi-turn conversations (see the example below) |
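To continue a conversation, resend the full history, including earlier assistant replies, in the messages array. A sketch of a multi-turn request body (the conversation content is illustrative):

```json
{
  "model": "deepseek-v4-flash",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is an API?"},
    {"role": "assistant", "content": "An API is an interface that lets programs talk to each other."},
    {"role": "user", "content": "Give me a concrete example."}
  ]
}
```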
## Example Request
```bash
curl https://api.ultramaxo.tech/v1/chat/completions \
  -H "Authorization: Bearer ux_sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
```
## Streaming

Set stream: true to receive responses as Server-Sent Events (SSE). Each chunk contains a delta of the response.
```json
{
  "model": "deepseek-v4-flash",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": true
}
```
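A streaming request with curl might look like the sketch below; the -N flag disables curl's output buffering so events print as they arrive. Assuming standard OpenAI-compatible SSE framing, each event is a data: line whose choices[0].delta carries the incremental content, with a final data: [DONE] sentinel; the exact chunk shape and terminator are assumptions based on the endpoint's OpenAI compatibility, not specified here.

```bash
# Stream a completion; -N turns off curl's buffering so SSE events appear as they arrive.
curl -N https://api.ultramaxo.tech/v1/chat/completions \
  -H "Authorization: Bearer ux_sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```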
## Response Format

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "deepseek-v4-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 12,
    "total_tokens": 22
  }
}
```
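The assistant's reply is in choices[0].message.content. For quick command-line testing, the response can be piped through jq to pull out just the text; this is a convenience sketch and assumes jq is installed.

```bash
# Extract only the assistant's reply text from the JSON response (requires jq).
curl -s https://api.ultramaxo.tech/v1/chat/completions \
  -H "Authorization: Bearer ux_sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }' | jq -r '.choices[0].message.content'
```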