Route requests to Claude, GPT, Gemini, Grok, DeepSeek and more through a single endpoint. Use it from our web chat, Claude Code, Codex, or any OpenAI-compatible client.
4 formats
OpenAI · Anthropic · Gemini · Responses
Auto failover
Retries silently
$ export ANTHROPIC_BASE_URL="https://api.kiosai.com"
$ export ANTHROPIC_API_KEY="kios_..."
$ claude # works instantly

One key. Every model.
Drop-in support for all major providers — no vendor lock-in.
Claude
Opus, Sonnet, Haiku
GPT
GPT-5, o1, o3, o4
Gemini
2.5 Pro, Flash
Grok
Grok-4
DeepSeek
V3, R1
Kimi
K2
GLM
4.5
Qwen
3, QwQ
How it works
Sign up, generate a Kios key from settings. Copy it once — it never appears again.
Set your base URL to our gateway. Works with Claude Code, Codex, OpenAI SDK, raw curl — anything.
export ANTHROPIC_BASE_URL="https://api.kiosai.com" export ANTHROPIC_API_KEY="kios_..."
Use any model name you want. We route, retry, and translate behind the scenes. Pay only for what you use.
What we offer
Kios AI handles routing, failover, billing, and observability — so you can ship instead of plumbing.
Speaks Anthropic Messages, OpenAI Chat Completions, Gemini, and the OpenAI Responses API. Translated internally so any client just works.
Drop-in for Claude Code, Codex, and any OpenAI-compatible client. Set two environment variables and you're done — no SDK changes.
Generate images with GPT-Image, DALL-E, Imagen, or Flux. Upload a reference and tell the model what to change — works via multipart upload.
Hundreds of upstream sources behind every model. Rate-limited sources auto-recover after 8 hours. Failed sources get pruned automatically.
Live progress bar with reset countdown right in your settings. Plan defaults with admin overrides. No surprises, no hidden caps.
Conversations saved across reloads and devices. Images, context, and model selection all persist — pick up exactly where you left off.
Server-sent events forwarded with sub-5s keep-alive. Token usage captured mid-stream. No idle timeouts on long generations.
API keys SHA-256 hashed in storage. Source credentials AES-256-GCM encrypted at rest. Session cookies, never tokens in URLs.
Configure `claude-sonnet-4` → tries newest version first, falls back to older. One model name, automatic upgrades when new versions ship.
Pricing
Pick the plan that fits. Upgrade, downgrade, or cancel anytime.
For trying things out
For daily power users
For teams and heavy usage
Prices are placeholders — final tiers and limits will be announced soon.
FAQ
Everything you need to know about Kios AI.
Kios AI is a universal LLM gateway. Instead of juggling separate API keys and SDKs for Claude, GPT, Gemini, and other providers, you get one endpoint and one key that routes to whichever model you ask for.
Any OpenAI-compatible client works out of the box. We also speak Anthropic Messages (so Claude Code works directly), Google Gemini, and the OpenAI Responses API (Codex). Just point your client's base URL at our gateway.
Contact us to get an account, then generate a key from your settings page. Set two environment variables (base URL + API key) and you're live — whether you use Claude Code, Codex, curl, or our web chat.
Requests return a clear 'Kios AI token limit exceeded' message. Your usage resets automatically based on your plan's schedule — you can see a live countdown and progress bar in settings.
We automatically retry the next available source. Rate-limited sources auto-recover after a cooldown. Permanently broken sources get removed. You never see a dead endpoint.
Yes — text-to-image generation works with GPT-Image, DALL-E, and other supported image models. You can also upload a reference image and ask the model to modify it.
API keys are SHA-256 hashed. Source credentials are AES-256-GCM encrypted at rest. Sessions use httpOnly cookies stored in Valkey. We never log prompt content.
Yes. Our web chat at /chat supports any text or image model, persistent conversations, markdown rendering, and streaming responses. No API key management needed in the browser.