Rate limits
Per-key and per-workspace limits, how to read them, and how to back off cleanly.
GenieOS enforces two layers of rate limits:
- Per-key: a token bucket on the API key. Defaults to 30 requests per second with a burst of 60; the per-key rate varies by plan (see Plan limits).
- Per-workspace: a sliding window across all keys in the workspace. The default cap is 600 requests per minute, lifted by plan tier.
Both layers return 429 Too Many Requests when tripped. The headers tell
you exactly where you are.
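To make the per-key numbers concrete, here is a minimal client-side sketch of a token bucket using the default figures above (30 requests per second, burst of 60). The `TokenBucket` class and its `try_acquire` method are hypothetical names for illustration; this is not GenieOS's server-side implementation, only a model of how a bucket with those parameters behaves.

```python
import time


class TokenBucket:
    """Illustrative token bucket: refills `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate=30.0, burst=60, clock=time.monotonic):
        self.rate = rate            # sustained requests per second
        self.burst = burst          # maximum tokens the bucket can hold
        self.clock = clock          # injectable clock, handy for tests
        self.tokens = float(burst)  # start with a full bucket
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time that has passed, never beyond the burst size.
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Under this model a full bucket absorbs 60 back-to-back requests, after which throughput settles at the refill rate of 30 per second.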
The headers
Every authenticated response carries:
X-RateLimit-Limit: 30
X-RateLimit-Remaining: 24
X-RateLimit-Reset: 2026-04-20T12:34:00Z
X-RateLimit-Scope: key
When you trip a limit:
HTTP/1.1 429 Too Many Requests
Retry-After: 0.6
X-RateLimit-Limit: 30
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-04-20T12:34:00Z
X-RateLimit-Scope: key
Content-Type: application/json
{
  "error": {
    "type": "rate_limit_exceeded",
    "code": "key_rate_limit",
    "message": "API key exceeded its rate limit. Retry after 0.6s.",
    "retry_after_seconds": 0.6,
    "request_id": "req_01JABC..."
  }
}
The X-RateLimit-Scope header is key or workspace so you know which
layer pushed back.
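If you are calling the API over raw HTTP, the 429 response above can be reduced to two facts: which layer pushed back and how long to wait. The sketch below assumes you already have the status code and a header mapping from your HTTP client; `RateLimitHit`, `parse_rate_limit_hit`, and the one-second fallback are hypothetical, not part of either SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitHit:
    scope: str          # "key" or "workspace", taken from X-RateLimit-Scope
    retry_after: float  # seconds to wait, taken from Retry-After


def parse_rate_limit_hit(
    status: int,
    headers: Mapping[str, str],
    fallback_delay: float = 1.0,  # assumption: used only when Retry-After is unusable
) -> Optional[RateLimitHit]:
    """Return details of a 429 response, or None for any other status."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        delay = max(0.0, float(lowered.get("retry-after")))
    except (TypeError, ValueError):
        # Missing or non-numeric header: fall back rather than retry immediately.
        delay = fallback_delay
    return RateLimitHit(scope=lowered.get("x-ratelimit-scope", "unknown"), retry_after=delay)
```

A caller would sleep for `hit.retry_after` seconds before retrying, and could log `hit.scope` to tell per-key pressure apart from per-workspace pressure.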
What the SDKs do
Both SDKs:
- Honour Retry-After exactly. No fixed-step backoff that ignores the header and hammers a recovering tier.
- Apply jittered exponential backoff (250ms / 500ms / 1s / 2s / 4s) for network / 5xx errors.
- Cap at 5 attempts by default; configurable.
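The backoff schedule in the list above can be sketched as follows. The doubling from 250ms and the five-step cap come from this page; the jitter formula (scaling each delay down by a random fraction) is an assumption, since the exact jitter the SDKs apply is not specified here. `backoff_delays` is a hypothetical helper, not an SDK function.

```python
import random


def backoff_delays(initial=0.25, max_retries=5, jitter=0.5, rng=random.random):
    """Return one delay in seconds per retry: initial * 2**attempt, randomly scaled down."""
    delays = []
    for attempt in range(max_retries):
        base = initial * (2 ** attempt)  # 0.25, 0.5, 1, 2, 4 with the defaults
        # Assumed jitter: shrink the delay by up to `jitter` (50% by default)
        # so that many clients do not retry in lockstep.
        factor = 1.0 - jitter * rng()
        delays.append(base * factor)
    return delays
```

This schedule is meant for network and 5xx failures only; for a 429, the delay in Retry-After takes precedence.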
import { GenieOS } from 'genieos';

const mg = new GenieOS({
  // Defaults shown.
  maxRetries: 5,
  initialBackoffMs: 250,
});

from genieos import GenieOS

mg = GenieOS(max_retries=5, initial_backoff_seconds=0.25)

Sending bursts safely
If you have a one-off batch — daily digest at 09:00, post-deploy
announcement, founder-update blast — use the batch endpoint rather
than a tight for loop hitting /transactional/send 50,000 times:
POST /v1/transactional/batch
The batch endpoint:
- Accepts up to 1,000 recipients per call.
- Counts as a single rate-limit hit per request, not per recipient.
- Has its own per-workspace concurrency limit (4 in flight by default).
See Transactional for the request shape.
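As a rough sketch of what this looks like on the client side, the helper below splits a recipient list into chunks of 1,000 and keeps at most 4 batch calls in flight, matching the limits listed above. `send_batch` is a hypothetical callable that you would implement to POST one chunk to the batch endpoint; the request shape is deliberately left out here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence

BATCH_SIZE = 1000  # documented maximum recipients per batch call
MAX_IN_FLIGHT = 4  # documented default per-workspace concurrency limit


def chunked(items: Sequence[str], size: int = BATCH_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_in_batches(
    recipients: Sequence[str],
    send_batch: Callable[[Sequence[str]], object],  # hypothetical: performs one batch call
    max_in_flight: int = MAX_IN_FLIGHT,
) -> List[object]:
    """Send every chunk, running at most `max_in_flight` calls concurrently."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # pool.map preserves chunk order in the returned results.
        return list(pool.map(send_batch, chunked(recipients)))
```

With 50,000 recipients this makes 50 batch calls instead of 50,000 individual sends, and each call counts as one rate-limit hit.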
Plan limits
| Plan | Per-key (req/s) | Per-workspace (req/min) | Notes |
|---|---|---|---|
| Sandbox | 60 | unlimited | Sandbox keys are not rate-limited |
| Hobby | 30 | 600 | |
| Pro | 60 | 1,800 | |
| Scale | 120 | 6,000 | |
| Enterprise | bespoke | bespoke | Set in your contract |
If you’re bumping into a wall and a higher tier doesn’t fit your shape, email hello@genieos.pro. We negotiate custom rate windows for queue-y workloads.
Tips
- Spread, don’t burst. A 600-message broadcast at 09:00 spread over 60s is friendlier than 600 calls in 1 second.
- Honour Retry-After. Hammering during the cooldown re-arms the bucket and extends the lockout. The SDKs wait it out for you; raw HTTP clients must do the same.
- Watch X-RateLimit-Remaining. When it dips to 20% of Limit, you’re in the warning zone: log it, slow down, scale out.
- One key per worker fleet. If your N web dynos all share a single key, the per-key bucket becomes your hard ceiling. Per-fleet keys parallelise correctly.
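Two of the tips above reduce to simple arithmetic, sketched here with hypothetical helper names. `spread_offsets` computes evenly spaced send times for a broadcast, and `in_warning_zone` applies the 20% threshold to the X-RateLimit-Remaining and X-RateLimit-Limit values from a response.

```python
from typing import List


def spread_offsets(count: int, window_seconds: float) -> List[float]:
    """Evenly spaced offsets (seconds from the start) for `count` calls within a window."""
    if count <= 0:
        return []
    interval = window_seconds / count
    return [i * interval for i in range(count)]


def in_warning_zone(remaining: int, limit: int, threshold: float = 0.2) -> bool:
    """True once the remaining budget is at or below `threshold` of the limit."""
    return limit > 0 and remaining <= limit * threshold
```

For the 600-message example, `spread_offsets(600, 60)` spaces calls 0.1 seconds apart, which is 10 requests per second and well under the default per-key rate.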
Sandbox keys are unlimited on purpose
Tests and CI shouldn’t fail because the test runner spun up too many parallel jobs. Sandbox keys skip both rate limits. Don’t use them in production.