API Reference

Cutad API Docs

Complete reference for the Cutad AI API Gateway: OpenAI-compatible endpoints, streaming support, and production-ready infrastructure.

Authentication

All API requests require a Bearer token in the Authorization header. Obtain your API key from the dashboard.

HTTP Header
Authorization: Bearer cutad-YOUR_API_KEY

Note: API keys are prefixed with cutad-. Keep your key secret: never expose it in client-side code or public repositories.
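
One way to follow the note above is to read the key from the environment instead of hard-coding it. The sketch below assumes an environment variable named CUTAD_API_KEY, which is a hypothetical name chosen for illustration and not something this reference defines. It checks the documented cutad- prefix before building the header.

```python
import os

KEY_PREFIX = "cutad-"  # documented key prefix


def load_api_key(env=os.environ, var_name="CUTAD_API_KEY"):
    """Read the API key from the environment and sanity-check its prefix.

    CUTAD_API_KEY is a hypothetical variable name used only in this sketch.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    if not key.startswith(KEY_PREFIX):
        raise ValueError(f"API key should start with {KEY_PREFIX!r}")
    return key


def auth_headers(api_key):
    """Build the Authorization header expected on every request."""
    return {"Authorization": f"Bearer {api_key}"}
```

Passing a plain dict as env makes the helper easy to exercise without touching the real environment.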

Error Response (Invalid Key)

JSON (401 Unauthorized)
{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

Base URL

https://mimo.lokerin.net

All endpoints are relative to this base URL. The API is OpenAI-compatible — use the OpenAI SDK with a custom base_url.

Chat Completions

POST /v1/chat/completions

Create a chat completion. Supports both standard and streaming responses.

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID: cutad-agent or cutad-agent-pro |
| messages | array | Yes | Array of message objects with role and content |
| stream | boolean | No | Enable streaming responses. Default: false |
| temperature | float | No | Sampling temperature, 0–2. Default: 1 |
| max_tokens | integer | No | Maximum tokens in the response |
| top_p | float | No | Nucleus sampling, 0–1. Default: 1 |
| stop | string or array | No | Stop sequences |

Message Object

| Field | Type | Description |
| --- | --- | --- |
| role | string | system, user, or assistant |
| content | string | The message content |
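
The parameter ranges listed above can be checked on the client before a request is sent. This is a minimal sketch: build_chat_payload is a hypothetical helper name, the non-empty messages check is an assumption, and the server remains the authority on what it accepts.

```python
VALID_MODELS = {"cutad-agent", "cutad-agent-pro"}
VALID_ROLES = {"system", "user", "assistant"}


def build_chat_payload(model, messages, stream=False, temperature=None,
                       max_tokens=None, top_p=None, stop=None):
    """Return a request body dict for POST /v1/chat/completions."""
    if model not in VALID_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not messages:
        # Assumption: an empty conversation is not a useful request.
        raise ValueError("messages must be a non-empty list")
    for msg in messages:
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"invalid role: {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            raise ValueError("message content must be a string")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    payload = {"model": model, "messages": messages, "stream": stream}
    # Optional fields are sent only when set, so server defaults apply otherwise.
    optional = {"temperature": temperature, "max_tokens": max_tokens,
                "top_p": top_p, "stop": stop}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```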

List Models

GET /v1/models

Returns a list of available models.

Response

JSON (200 OK)
{
  "object": "list",
  "data": [
    { "id": "cutad-agent", "object": "model", "owned_by": "cutad" },
    { "id": "cutad-agent-pro", "object": "model", "owned_by": "cutad" }
  ]
}
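
A client might use this response to confirm that a model ID exists before calling it. The sketch below works on the raw response text; model_ids and is_model_available are hypothetical helper names.

```python
import json


def model_ids(response_text):
    """Extract model IDs from a GET /v1/models response body."""
    body = json.loads(response_text)
    if body.get("object") != "list":
        raise ValueError("unexpected response shape")
    return [m["id"] for m in body.get("data", []) if m.get("object") == "model"]


def is_model_available(response_text, model):
    """Return True if the given model ID appears in the listing."""
    return model in model_ids(response_text)
```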

Streaming

Set "stream": true in your request to receive Server-Sent Events (SSE). Each chunk is sent as a line that starts with data: followed by a JSON object.

Stream Chunk Format

SSE
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

Tip: The stream ends with data: [DONE]. Always check for this sentinel value when consuming the stream.
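
The chunk format can be consumed with a small line-based parser. The sketch below assumes each chunk arrives on a single data: line and that the lines have already been decoded to text; iter_stream_text is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield content fragments from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # sentinel: the stream is complete
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the full assistant message; the final chunk with an empty delta contributes nothing.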

Available Models

| Model ID | Description | Rate Limit | Streaming |
| --- | --- | --- | --- |
| cutad-agent | General-purpose AI agent. Fast responses, reliable output. | 15 req/min | Yes |
| cutad-agent-pro | Enhanced reasoning, complex tasks, longer context. Recommended for production. | 20 req/min | Yes |

Request Format

Send JSON requests with Content-Type: application/json.

HTTP Request
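
As an illustration of the required headers, the following sketch prepares a chat completion request with Python's standard library without sending it. build_chat_request is a hypothetical helper name; the URL is the documented base URL joined with the documented endpoint path.

```python
import json
import urllib.request

BASE_URL = "https://mimo.lokerin.net"


def build_chat_request(api_key, payload):
    """Prepare (but do not send) a POST /v1/chat/completions request."""
    return urllib.request.Request(
        url=BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),  # JSON body as bytes
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen. Because rate-limited requests may be queued for up to 2 minutes, a timeout somewhat above that is a reasonable choice.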

Response Format

Successful responses return JSON in the OpenAI-compatible format.

JSON (200 OK)
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "cutad-agent",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses qubits..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 64,
    "total_tokens": 89
  }
}
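
Most callers only need the assistant text, the finish reason, and the token count from this structure. A minimal sketch, with summarize_completion as a hypothetical helper name:

```python
import json


def summarize_completion(response_text):
    """Pull the commonly used fields out of a chat completion response."""
    body = json.loads(response_text)
    choice = body["choices"][0]  # first (index 0) choice
    usage = body.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
    }
```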

Error Codes

The API uses standard HTTP status codes. Error responses include a JSON body with details.

400 Bad Request: Invalid parameters or malformed JSON
401 Unauthorized: Missing or invalid API key
403 Forbidden: Key lacks permission for this resource
404 Not Found: Endpoint or model does not exist
429 Rate Limited: Request queued, waiting for available slot
500 Internal Error: Server-side issue, retry later
503 Service Unavailable: Queued request timed out waiting for a slot (see Rate Limits)

Error Response Format

JSON (error)
{
  "error": {
    "message": "Human-readable error description",
    "type": "error_type",
    "code": "error_code"
  }
}

Common Error Types

| Type | Code | Description |
| --- | --- | --- |
| authentication_error | invalid_api_key | API key is invalid or expired |
| invalid_request_error | missing_model | No model specified in the request |
| invalid_request_error | invalid_model | Specified model does not exist |
| rate_limit_error | rate_limited | Request queued, waiting for available slot |
| server_error | internal_error | Unexpected server-side failure |
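
Because every error shares the same JSON envelope, a client can turn it into a single exception type. The sketch below is one possible shape; CutadAPIError and error_from_response are hypothetical names and not part of any official SDK.

```python
import json


class CutadAPIError(Exception):
    """Hypothetical exception type for this sketch."""

    def __init__(self, status, message, error_type=None, code=None):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.error_type = error_type
        self.code = code


def error_from_response(status, body_text):
    """Build a CutadAPIError from an HTTP status and a raw response body."""
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = None  # body was not JSON at all
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}  # fall back when the documented envelope is missing
    return CutadAPIError(
        status,
        err.get("message", "unknown error"),
        err.get("type"),
        err.get("code"),
    )
```

Callers can then branch on error.code (for example invalid_api_key) rather than matching on message text.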

Rate Limits

Rate limits are applied per API key, per model. When the limit is reached, requests are automatically queued and processed once a slot becomes available (max wait: 2 minutes).

cutad-agent: 15 requests / minute
cutad-agent-pro: 20 requests / minute

Auto-Queue: When you exceed the rate limit, your request is automatically queued and processed as soon as a slot opens up, typically within 60 seconds. You don't need retry logic for rate limiting; just wait for the response. If no slot becomes available within 2 minutes, you'll receive a 503 timeout error.

Rate Limit Headers

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests allowed per window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds until a slot is available (on 503 timeout) |
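
These headers let a client see how close it is to the limit before requests start queuing. The sketch below assumes the three X-RateLimit-* values are integer strings and that the reset timestamp is in seconds; rate_limit_status is a hypothetical helper name.

```python
import time


def rate_limit_status(headers, now=None):
    """Summarize the X-RateLimit-* headers from a response.

    `headers` is any mapping of header names to string values.
    """
    now = time.time() if now is None else now
    lowered = {k.lower(): v for k, v in headers.items()}  # names are case-insensitive
    reset_at = int(lowered["x-ratelimit-reset"])
    return {
        "limit": int(lowered["x-ratelimit-limit"]),
        "remaining": int(lowered["x-ratelimit-remaining"]),
        "reset_in": max(0.0, reset_at - now),  # seconds until the window resets
    }
```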

Code Examples

Basic Request

cURL

Streaming Request

cURL (streaming)

List Models

cURL

Using the OpenAI SDK

Python
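
A minimal sketch of calling the gateway through the OpenAI Python SDK (the third-party openai package, version 1 or later). The base URL with a /v1 suffix is an assumption derived from the documented base URL and endpoint paths; make_client and ask are hypothetical helper names.

```python
BASE_URL = "https://mimo.lokerin.net/v1"  # assumption: SDK base_url includes /v1


def make_client(api_key):
    """Create an OpenAI SDK client pointed at the gateway.

    The import is done lazily so the rest of the sketch does not
    require the openai package to be installed.
    """
    from openai import OpenAI
    return OpenAI(api_key=api_key, base_url=BASE_URL)


def ask(client, prompt, model="cutad-agent"):
    """Send one user message and return the assistant's reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Typical use would be ask(make_client(key), "Explain quantum computing"), with key loaded from a secure location rather than written into source code.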

Streaming with Python

Python (streaming)
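
With the OpenAI Python SDK, passing stream=True returns an iterator of chunks, and the SDK handles the data: [DONE] sentinel itself. In this sketch, client is assumed to be an openai.OpenAI instance created with this gateway's base URL and your API key; stream_reply is a hypothetical helper name.

```python
def stream_reply(client, prompt, model="cutad-agent"):
    """Yield text fragments as they arrive from a streaming completion."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # The final chunk carries an empty delta with finish_reason set,
        # so its content is None and is skipped here.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

Printing each fragment with print(text, end="", flush=True) displays the reply as it is generated.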

Error Handling

Python (errors)
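
One way to act on the status codes is to retry only failures that may be transient and fail fast on the rest. The sketch below is transport-agnostic: send is a hypothetical callable returning (status, headers, body), and the retryable set is an assumption. It includes 500 and 503 as described in this reference, plus 429 as a defensive choice.

```python
import time

RETRYABLE = {429, 500, 503}  # assumption: statuses worth retrying


def call_with_retries(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or a non-retryable status is returned.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status < 400:
            return status, body
        if status not in RETRYABLE or attempt == max_attempts:
            raise RuntimeError(f"request failed with HTTP {status}: {body}")
        # Prefer the server's Retry-After hint; otherwise back off exponentially.
        retry_after = headers.get("Retry-After")
        delay = float(retry_after) if retry_after else base_delay * 2 ** (attempt - 1)
        sleep(delay)
```

Client errors such as 400, 401, 403, and 404 raise immediately, since repeating the same request would not change the outcome.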

Fetch API

JavaScript

Streaming with JavaScript

JavaScript (SSE)

Node.js SDK

Node.js