Rate Limits
Claude API rate limits — 429 handling, Retry-After, concurrency protection, and throughput guidance.
Status code: 429 Too Many Requests. Header: Retry-After (seconds). Scope: Per-key + global protection. Strategy: Exponential backoff with jitter.
Rate Limiting
The ClaudeStore gateway applies protective limits to preserve service stability under mixed traffic patterns. These limits are not published as a fixed tier table: enforcement depends on request shape, concurrency, and current gateway load. When you exceed a limit, the API returns a 429 status code along with a Retry-After header.
What is limited
| Scope | What it means |
|---|---|
| Per API key | A key can be throttled independently of other keys. |
| Per-key concurrency | Too many simultaneous long-running requests can trigger 429. |
| Global gateway protection | The gateway can shed load to protect service stability. |
| Attachment-heavy traffic | Large multimodal turns may be treated more strictly than plain text traffic. |
We do not publish a stable numeric RPM/TPM contract for every request shape. If you need sustained higher throughput, contact support with your expected traffic pattern.
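Because too many simultaneous long-running requests can trigger a 429, one option is to cap in-flight requests on the client side. The following is a minimal sketch using asyncio, not an official client feature. The helper name run_bounded and the default limit of 4 are assumptions chosen for illustration; the right limit depends on your own traffic.

```python
import asyncio

MAX_IN_FLIGHT = 4  # assumed value for illustration; tune for your own traffic


async def run_bounded(jobs, limit=MAX_IN_FLIGHT):
    """Run zero-argument coroutine functions with at most `limit` in flight.

    `jobs` is a list of callables that each return a coroutine, for example
    a function that sends one API request. Results come back in input order.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each job waits here until one of the `limit` slots is free.
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Lowering the limit when 429 responses start to appear is a simple way to reduce parallelism without changing the rest of the application.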
Response headers
- retry-after: seconds to wait before retrying after a 429
Do not rely on undocumented rate-limit headers as a stable public contract.
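As an illustration, the retry-after value can be read defensively so that a missing or malformed header does not break the client. This is a sketch under assumptions: the function name retry_after_seconds is hypothetical, headers are assumed to be available as a plain mapping of names to strings, and any value that is not a non-negative number of seconds falls back to a default.

```python
import math


def retry_after_seconds(headers, default=None):
    """Return the retry-after value in seconds, or `default` if unusable.

    `headers` is assumed to be a mapping of header names to string values.
    Header names are matched case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() != "retry-after":
            continue
        try:
            seconds = float(value)
        except (TypeError, ValueError):
            return default  # not a number, so ignore it
        if math.isfinite(seconds) and seconds >= 0:
            return seconds
        return default  # negative, NaN, or infinite values are ignored
    return default  # header not present
```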
Handling 429 Responses
- Implement exponential backoff with jitter (start at 1s, double each retry)
- Respect Retry-After headers when present
- Queue requests in your application layer and reduce parallelism when needed
- Be especially conservative with long-running streams and multimodal turns
- Use caching to reduce avoidable repeat traffic
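The first two recommendations above can be sketched as a small retry loop. This is an illustrative sketch, not an official client. The send callable is hypothetical and is assumed to return a (status, headers, body) tuple with lowercase header names. The 60 second cap, the limit of 5 retries, and the choice of jitter (a random delay between half and all of the current backoff step) are assumptions, not documented values.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        return retry_after  # the server's hint takes priority
    ceiling = min(cap, base * 2 ** attempt)  # 1s, 2s, 4s, ... up to cap
    # Jitter: pick a random point in the upper half of the current step
    # so that many clients do not retry at the same instant.
    return random.uniform(ceiling / 2, ceiling)


def call_with_retries(send, max_retries=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or retries run out.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # no retries left, so do not sleep again
        try:
            retry_after = float(headers["retry-after"])
        except (KeyError, TypeError, ValueError):
            retry_after = None  # header missing or unusable
        sleep(backoff_delay(attempt, retry_after))
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

Passing sleep as a parameter keeps the loop easy to test without real waiting. An asynchronous client would follow the same pattern with asyncio.sleep.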