API Reference

Cutad API Docs

Complete reference for the Cutad AI API Gateway. It covers authentication, the OpenAI-compatible endpoints, streaming, error codes, and rate limits.

Authentication

All API requests require a Bearer token in the Authorization header. Obtain your API key from the dashboard.

HTTP Header

Authorization: Bearer cutad-<your-api-key>

Note: API keys are prefixed with cutad-. Keep your key secret — never expose it in client-side code or public repositories.
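As a minimal sketch of the header described above, the helper below builds the Authorization header in Python. The function name `auth_headers` and the early prefix check are illustrative assumptions, not part of the Cutad API; the key shown is a placeholder.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers that authenticate a request."""
    # The docs state that keys start with "cutad-". Checking this early
    # catches an unset or mistyped key before any request is sent.
    if not api_key.startswith("cutad-"):
        raise ValueError("expected an API key starting with 'cutad-'")
    return {"Authorization": f"Bearer {api_key}"}


# "cutad-example-key" is a placeholder, not a real key.
headers = auth_headers("cutad-example-key")
```

In real code, read the key from an environment variable or a secrets manager rather than hard-coding it.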

Error Response (Invalid Key)

JSON (401)

{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

Base URL


https://mimo.lokerin.net

All endpoints are relative to this base URL. The API is OpenAI-compatible — use the OpenAI SDK with a custom base_url.

Chat Completions

POST /v1/chat/completions

Create a chat completion. Supports both standard and streaming responses.

Request Body

Parameter	Type	Required	Description
model	string	required	Model ID: `cutad-agent` or `cutad-agent-pro`
messages	array	required	Array of message objects with `role` and `content`
stream	boolean	optional	Enable streaming responses. Default: `false`
temperature	float	optional	Sampling temperature, 0–2. Default: `1`
max_tokens	integer	optional	Maximum tokens in the response
top_p	float	optional	Nucleus sampling, 0–1. Default: `1`
stop	string or array	optional	Stop sequences

Message Object

Field	Type	Description
role	string	`system`, `user`, or `assistant`
content	string	The message content
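The request body above can be assembled with the Python standard library alone. This is a sketch under stated assumptions: `build_chat_request` is a hypothetical helper, the range check mirrors the documented 0 to 2 temperature range, and the request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://mimo.lokerin.net"


def build_chat_request(api_key, messages, model="cutad-agent",
                       temperature=None, max_tokens=None):
    """Build (but do not send) a POST /v1/chat/completions request."""
    for message in messages:
        if message.get("role") not in {"system", "user", "assistant"}:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {"model": model, "messages": messages}
    # Optional parameters are left out unless set, so server defaults apply.
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens

    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "cutad-example-key",  # placeholder key
    [{"role": "user", "content": "Explain quantum computing briefly."}],
    temperature=0.7,
)
```

Passing `request` to `urllib.request.urlopen` would send it; building it separately makes the payload easy to inspect first.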

List Models

GET /v1/models

Returns a list of available models.

Response

JSON (200 OK)

{
  "object": "list",
  "data": [
    { "id": "cutad-agent", "object": "model", "owned_by": "cutad" },
    { "id": "cutad-agent-pro", "object": "model", "owned_by": "cutad" }
  ]
}
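A short sketch of reading this response in Python: the helper name `model_ids` is an assumption for illustration, and the sample body is the response shown above.

```python
import json


def model_ids(body: str) -> list[str]:
    """Extract the model IDs from a GET /v1/models response body."""
    payload = json.loads(body)
    if payload.get("object") != "list":
        raise ValueError("unexpected response shape")
    return [entry["id"] for entry in payload["data"]]


sample_body = """
{
  "object": "list",
  "data": [
    { "id": "cutad-agent", "object": "model", "owned_by": "cutad" },
    { "id": "cutad-agent-pro", "object": "model", "owned_by": "cutad" }
  ]
}
"""
ids = model_ids(sample_body)
```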

Streaming

Set "stream": true in your request to receive Server-Sent Events (SSE). Each chunk arrives as a line that starts with the prefix "data: ", followed by a JSON object.

Stream Chunk Format

SSE

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",
      "choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",
      "choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",
      "choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

Tip: The stream ends with data: [DONE]. Always check for this sentinel value when consuming the stream.
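The consuming loop can be sketched as follows. It assumes each event arrives as a single `data:` line (the chunks above are wrapped only for display) and that `lines` is any iterable of decoded text lines; `iter_deltas` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments from a stream of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything else
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # sentinel: the stream is complete
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample_stream = [
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",'
    '"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",'
    '"choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",'
    '"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
reply = "".join(iter_deltas(sample_stream))
```

The final chunk carries an empty delta and a finish_reason, so the loop skips it and stops at the sentinel.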

Available Models

Model ID	Description	Rate Limit	Streaming
cutad-agent	General-purpose AI agent. Fast responses, reliable output.	15 req/min	Yes
cutad-agent-pro	Enhanced reasoning, complex tasks, longer context. Recommended for production.	20 req/min	Yes

Request Format

Send JSON requests with Content-Type: application/json.

HTTP Request

Response Format

Successful responses return JSON in the OpenAI-compatible format.

JSON (200 OK)

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "cutad-agent",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses qubits..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 64,
    "total_tokens": 89
  }
}
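To show which fields a client typically reads from this response, here is a small sketch. The `Completion` dataclass and `parse_completion` helper are illustrative assumptions; the sample dictionary is a trimmed copy of the response above.

```python
from dataclasses import dataclass


@dataclass
class Completion:
    text: str
    finish_reason: str
    total_tokens: int


def parse_completion(response: dict) -> Completion:
    """Pull the reply text and token usage out of a decoded response."""
    choice = response["choices"][0]  # the sample contains a single choice
    return Completion(
        text=choice["message"]["content"],
        finish_reason=choice["finish_reason"],
        total_tokens=response["usage"]["total_tokens"],
    )


sample_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "cutad-agent",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant",
                    "content": "Quantum computing uses qubits..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 25, "completion_tokens": 64, "total_tokens": 89},
}
completion = parse_completion(sample_response)
```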

Error Codes

The API uses standard HTTP status codes. Error responses include a JSON body with details.

Status	Meaning	Description
400	Bad Request	Invalid parameters or malformed JSON
401	Unauthorized	Missing or invalid API key
403	Forbidden	Key lacks permission for this resource
404	Not Found	Endpoint or model does not exist
429	Rate Limited	Request queued, waiting for available slot
500	Internal Error	Server-side issue, retry later
503	Service Unavailable	Queue timeout: no slot became available within 2 minutes

Error Response Format

JSON (Error)

{
  "error": {
    "message": "Human-readable error description",
    "type": "error_type",
    "code": "error_code"
  }
}

Common Error Types

Type	Code	Description
authentication_error	`invalid_api_key`	API key is invalid or expired
invalid_request_error	`missing_model`	No model specified in the request
invalid_request_error	`invalid_model`	Specified model does not exist
rate_limit_error	`rate_limited`	Request queued — waiting for available slot
server_error	`internal_error`	Unexpected server-side failure
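The error body can be turned into a Python exception along these lines. This is a sketch, not an official client: `CutadApiError` and `error_from_response` are hypothetical names, and the fallback for non-JSON bodies is an assumption about how a client might want to behave.

```python
import json


class CutadApiError(Exception):
    """Carries the status code plus the documented error fields."""

    def __init__(self, status, message, error_type, code):
        super().__init__(f"{status} {error_type}/{code}: {message}")
        self.status = status
        self.message = message
        self.error_type = error_type
        self.code = code


def error_from_response(status: int, body: str) -> CutadApiError:
    """Build an exception from an HTTP status and a raw response body."""
    try:
        err = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        err = None
    if not isinstance(err, dict):
        # Body was not the documented JSON shape; keep the raw text.
        return CutadApiError(status, body.strip() or "no error body",
                             "unknown", "unknown")
    return CutadApiError(status, err.get("message", ""),
                         err.get("type", "unknown"),
                         err.get("code", "unknown"))


sample_error = ('{"error": {"message": "Invalid API key", '
                '"type": "authentication_error", "code": "invalid_api_key"}}')
error = error_from_response(401, sample_error)
```

Branching on `error.code` rather than on the message text keeps client logic stable if the wording changes.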

Rate Limits

Rate limits are applied per API key, per model. When the limit is reached, requests are automatically queued and processed once a slot becomes available (max wait: 2 minutes).

cutad-agent: 15 requests / minute

cutad-agent-pro: 20 requests / minute

Auto-Queue: When you exceed the rate limit, your request is queued automatically and processed as soon as a slot opens (within 60 seconds). You do not need to implement retry logic for rate limits; just wait for the response. If the request is still queued after 2 minutes, you'll receive a 503 timeout error.

Rate Limit Headers

Header	Description
X-RateLimit-Limit	Maximum requests allowed per window
X-RateLimit-Remaining	Requests remaining in current window
X-RateLimit-Reset	Unix timestamp when the window resets
Retry-After	Seconds until slot available (on 503 timeout)
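These headers can be read as in the sketch below. The `RateLimit` dataclass and both helper names are assumptions for illustration, and `Retry-After` is treated as a whole number of seconds, as the table describes. The lookups use the header names exactly as listed; a plain dict is case-sensitive, while the header objects returned by most HTTP libraries are not.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimit:
    limit: int       # maximum requests per window
    remaining: int   # requests left in the current window
    reset_at: int    # Unix timestamp when the window resets

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimit]:
    """Return the rate-limit state, or None if headers are missing or invalid."""
    try:
        return RateLimit(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_at=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None


def retry_after_seconds(status: int, headers: Mapping[str, str]) -> Optional[int]:
    """Return the suggested wait after a 503 queue timeout, if provided."""
    if status != 503:
        return None
    try:
        return int(headers["Retry-After"])
    except (KeyError, ValueError):
        return None


sample_headers = {
    "X-RateLimit-Limit": "15",
    "X-RateLimit-Remaining": "3",
    "X-RateLimit-Reset": "1700000060",
}
state = parse_rate_limit(sample_headers)
```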

Code Examples

Basic Request

cURL

Streaming Request

cURL (streaming)

List Models

cURL

Using the OpenAI SDK

Python
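A minimal sketch with the third-party `openai` package (v1 or later), offered as one plausible setup rather than the official example. It assumes `base_url` must include the `/v1` prefix, because the SDK appends paths such as `/chat/completions` to it, and it assumes the key is stored in an environment variable named `CUTAD_API_KEY`; both the variable name and the helper names are hypothetical.

```python
import os

# Assumption: the SDK appends "/chat/completions", so "/v1" goes here.
SDK_BASE_URL = "https://mimo.lokerin.net/v1"


def sdk_client_kwargs(api_key: str) -> dict:
    """Arguments for openai.OpenAI(...) that point the SDK at the gateway."""
    return {"api_key": api_key, "base_url": SDK_BASE_URL}


def ask(prompt: str, model: str = "cutad-agent") -> str:
    """Send one user message and return the assistant's reply text."""
    # Imported here so the module loads even without the package installed.
    from openai import OpenAI  # pip install openai

    client = OpenAI(**sdk_client_kwargs(os.environ["CUTAD_API_KEY"]))
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(ask("Explain quantum computing briefly."))
```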

Streaming with Python

Python (streaming)
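One way to consume a streamed reply through the `openai` SDK is sketched below. `stream_reply` expects a client already configured for the gateway, and `collect_text` works on any iterable of chunk-like objects, which lets the stand-in chunks at the end exercise it offline. Both function names are assumptions for illustration.

```python
from types import SimpleNamespace


def collect_text(chunks) -> str:
    """Join the delta.content pieces from an iterable of stream chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # the final chunk carries no content
            parts.append(piece)
    return "".join(parts)


def stream_reply(client, prompt: str, model: str = "cutad-agent") -> str:
    """Request a streamed completion and return the assembled text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # the SDK then returns an iterator of chunks
    )
    return collect_text(stream)


def _fake_chunk(content):
    # Stand-in with the same attribute layout as an SDK chunk object.
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


offline_result = collect_text(
    [_fake_chunk("Hello"), _fake_chunk(" world"), _fake_chunk(None)]
)
```

To display text as it arrives, print each `piece` inside the loop instead of collecting it.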

Error Handling

Python (errors)
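With the standard library, a failed request surfaces as `urllib.error.HTTPError`, whose body holds the JSON error object. The sketch below assumes a 150-second timeout so that a request queued for up to 2 minutes is not cut off by the client; that value and the helper names are illustrative choices, not documented settings.

```python
import io
import json
import urllib.error
import urllib.request


def describe_http_error(exc: urllib.error.HTTPError) -> str:
    """Format an HTTPError using the API's JSON error body when present."""
    raw = exc.read().decode("utf-8", errors="replace")
    try:
        err = json.loads(raw)["error"]
        return f"HTTP {exc.code}: {err['message']} ({err['type']}, {err['code']})"
    except (ValueError, KeyError, TypeError):
        return f"HTTP {exc.code}: {raw or exc.reason}"


def send(request: urllib.request.Request, timeout: float = 150.0) -> dict:
    """Send a prepared request and return the decoded JSON response."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        raise RuntimeError(describe_http_error(exc)) from exc


# Offline demonstration with a hand-built 401 response.
fake_body = (b'{"error": {"message": "Invalid API key", '
             b'"type": "authentication_error", "code": "invalid_api_key"}}')
fake_error = urllib.error.HTTPError(
    "https://mimo.lokerin.net/v1/chat/completions",
    401, "Unauthorized", {}, io.BytesIO(fake_body),
)
description = describe_http_error(fake_error)
```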

Fetch API

JavaScript

Streaming with JavaScript

JavaScript (SSE)

Node.js SDK

Node.js