Rate Limiting
Control request rates and set quotas to manage costs and prevent abuse.
Configure Rate Limits
Set rate limits per API key, user, or organization:
// Rate limit configuration
{
  "limits": {
    "requests_per_minute": 60,
    "requests_per_day": 10000,
    "tokens_per_minute": 100000,
    "tokens_per_day": 1000000
  },
  "scope": "api_key",
  "action": "reject"
}
Limit Types
Request Limits
Limit the number of API calls per time window.
Token Limits
Limit total tokens consumed per time window.
Cost Limits
Set spending caps to control costs.
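To make the request and token limit types concrete, here is a minimal sketch of a fixed-window counter that applies both to one scope key. It is an illustration only, not the gateway's actual enforcement logic: the `FixedWindowLimiter` class, its `check` method, and the `retryAfterMs` field are hypothetical names, and only the `requests_per_minute` and `tokens_per_minute` fields come from the configuration shown earlier.

```javascript
// Illustrative fixed-window limiter; NOT the gateway's real implementation.
// Only the limit field names are taken from the configuration example.
class FixedWindowLimiter {
  constructor(limits, windowMs = 60_000) {
    this.limits = limits;
    this.windowMs = windowMs;
    this.windows = new Map(); // scope key -> { start, requests, tokens }
  }

  // Decide whether one request that would consume `tokens` fits in the current window.
  check(key, tokens, now = Date.now()) {
    let w = this.windows.get(key);
    if (!w || now - w.start >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      w = { start: now, requests: 0, tokens: 0 };
      this.windows.set(key, w);
    }
    const overRequests = w.requests + 1 > this.limits.requests_per_minute;
    const overTokens = w.tokens + tokens > this.limits.tokens_per_minute;
    if (overRequests || overTokens) {
      // Rejected requests are not counted against the window.
      return { allowed: false, retryAfterMs: w.start + this.windowMs - now };
    }
    w.requests += 1;
    w.tokens += tokens;
    return { allowed: true, retryAfterMs: 0 };
  }
}

// Small limits so the behaviour is easy to follow; timestamps are passed explicitly.
const limiter = new FixedWindowLimiter({ requests_per_minute: 2, tokens_per_minute: 100 });
console.log(limiter.check('key-1', 40, 0).allowed);     // true
console.log(limiter.check('key-1', 70, 1000).allowed);  // false: 40 + 70 exceeds 100 tokens
console.log(limiter.check('key-1', 50, 2000).allowed);  // true
console.log(limiter.check('key-1', 1, 3000).allowed);   // false: would be a third request
console.log(limiter.check('key-1', 1, 60000).allowed);  // true: a new window has started
```

A fixed window is the simplest scheme to reason about. A real service may use a sliding window or token bucket instead, so treat the reset timing here as an assumption.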
Handling Rate Limit Errors
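try {
  const response = await client.gateway.chat({...});
  // Use the response here.
} catch (error) {
  if (error.code === 'RATE_LIMITED') {
    // The request was rejected by a rate limit; retryAfter says how long to wait.
    console.log('Retry after:', error.retryAfter);
  } else {
    // Not a rate limit error; let the caller handle it.
    throw error;
  }
}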
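Instead of only logging the delay, a caller can wait and try again. The sketch below wraps any async call in a retry loop that honours `error.retryAfter`. The helper name `withRateLimitRetry` and its options are hypothetical, and the code assumes `retryAfter` is a number of seconds, which the example above does not specify. Confirm the unit for your client before relying on it.

```javascript
// Hypothetical helper, not part of the client library.
// ASSUMPTION: error.retryAfter is expressed in seconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRateLimitRetry(fn, { maxRetries = 3, fallbackDelayMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const rateLimited = Boolean(error) && error.code === 'RATE_LIMITED';
      // Rethrow anything that is not a rate limit error, or once retries are exhausted.
      if (!rateLimited || attempt >= maxRetries) throw error;
      // Prefer the reported delay; otherwise back off exponentially.
      const delayMs = Number.isFinite(error.retryAfter)
        ? error.retryAfter * 1000
        : fallbackDelayMs * 2 ** attempt;
      await sleep(delayMs);
    }
  }
}

// Usage, where `request` stands in for your own chat parameters:
//   const response = await withRateLimitRetry(() => client.gateway.chat(request));
```

Capping the number of retries keeps a persistently limited key from looping forever. With the `"action": "reject"` setting shown earlier, the final error still reaches the caller once the retries run out.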