Rate Limiting

Control request rates and set quotas to manage costs and prevent abuse.

Configure Rate Limits

Set rate limits per API key, user, or organization:

// Rate limit configuration
{
  "limits": {
    "requests_per_minute": 60,
    "requests_per_day": 10000,
    "tokens_per_minute": 100000,
    "tokens_per_day": 1000000
  },
  "scope": "api_key",
  "action": "reject"
}
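
To make the request limits above concrete, here is a minimal sketch of how a limit such as requests_per_minute: 60 with scope "api_key" and action "reject" could be enforced. It uses a fixed-window counter, which is an assumption: the gateway's actual algorithm is not described here. FixedWindowLimiter, check, and retryAfterMs are hypothetical names, not part of the gateway API.

```javascript
// Hypothetical sketch: fixed-window request counter, one window per scope key.
// Not the gateway's implementation; names and behavior are illustrative only.
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;       // e.g. requests_per_minute
    this.windowMs = windowMs; // e.g. 60_000 for a one-minute window
    this.windows = new Map(); // scope key -> { start, count }
  }

  // Records one request for `key` at time `now` (milliseconds).
  // Returns { allowed, retryAfterMs }; a rejected request is not counted.
  check(key, now = Date.now()) {
    let w = this.windows.get(key);
    if (!w || now - w.start >= this.windowMs) {
      // Start a fresh window once the previous one has elapsed.
      w = { start: now, count: 0 };
      this.windows.set(key, w);
    }
    if (w.count >= this.limit) {
      // Mirrors "action": "reject": refuse and report when the window resets.
      return { allowed: false, retryAfterMs: w.start + this.windowMs - now };
    }
    w.count += 1;
    return { allowed: true, retryAfterMs: 0 };
  }
}

// 60 requests per minute, tracked separately for each API key.
const perKeyLimiter = new FixedWindowLimiter(60, 60_000);
```

A fixed window is the simplest scheme to reason about, but it allows short bursts at window boundaries. Sliding-window or token-bucket schemes smooth that out at the cost of more state.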

Limit Types

Request Limits

Limit the number of API calls per time window, as with requests_per_minute and requests_per_day in the configuration above.

Token Limits

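Limit the total tokens consumed per time window, as with tokens_per_minute and tokens_per_day in the configuration above. A single request can consume many tokens, so a token limit can be reached well before the request limit.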
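
As an illustration of how a token limit differs from a request limit, the sketch below tracks tokens spent in the current window rather than the number of calls. TokenBudget and tryConsume are hypothetical names. How the gateway actually counts tokens and resets windows is an assumption here.

```javascript
// Hypothetical sketch: per-window token budget. Not the gateway's implementation.
class TokenBudget {
  constructor(limit, windowMs) {
    this.limit = limit;       // e.g. tokens_per_minute
    this.windowMs = windowMs; // window length in milliseconds
    this.start = null;        // start time of the current window
    this.used = 0;            // tokens spent in the current window
  }

  // Tries to spend `tokens` at time `now` (milliseconds).
  // Returns true and records the usage if it fits, otherwise returns false.
  tryConsume(tokens, now = Date.now()) {
    if (this.start === null || now - this.start >= this.windowMs) {
      this.start = now;
      this.used = 0;
    }
    if (this.used + tokens > this.limit) return false;
    this.used += tokens;
    return true;
  }
}

// 100000 tokens per minute, matching tokens_per_minute in the configuration.
const budget = new TokenBudget(100000, 60_000);
```

With this budget, two requests of 60000 tokens each inside the same minute would exceed the limit even though only two requests were made.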

Cost Limits

Set spending caps to control costs.

Handling Rate Limit Errors

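try {
  const response = await client.gateway.chat({...});
} catch (error) {
  if (error.code === 'RATE_LIMITED') {
    // The request was rejected by a rate limit; retryAfter indicates how long to wait.
    console.log('Retry after:', error.retryAfter);
  } else {
    // Re-throw anything that is not a rate limit error so it is not silently swallowed.
    throw error;
  }
}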
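
Building on the error handling above, a caller will often want to wait and retry rather than only log. The following is a minimal sketch of such a wrapper. withRateLimitRetry, maxRetries, and sleep are hypothetical names, and the code assumes error.retryAfter is a number of seconds, which the text above does not confirm.

```javascript
// Resolves after `ms` milliseconds.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls `fn` and retries it when it fails with
// error.code === 'RATE_LIMITED'. Assumes error.retryAfter is in seconds;
// if it is missing, falls back to exponential backoff (1s, 2s, 4s, ...).
async function withRateLimitRetry(fn, { maxRetries = 3, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up on other errors, or once the retry budget is spent.
      if (error?.code !== 'RATE_LIMITED' || attempt >= maxRetries) throw error;
      const waitSeconds = Number.isFinite(error.retryAfter)
        ? error.retryAfter
        : 2 ** attempt;
      await sleep(waitSeconds * 1000);
    }
  }
}
```

Under these assumptions, the call from the example above could be wrapped as withRateLimitRetry(() => client.gateway.chat({...})). Capping the number of retries keeps a persistently limited key from retrying indefinitely.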