Rate Limits
Rate limits protect the API from excessive usage and ensure fair access for all consumers.
Current Limits
| Endpoint | Limit |
|---|---|
| POST /api/v1/search/query | 60 requests / minute (default) |
| Other endpoints | Varies by endpoint |
Search and tool calls share one per-organization requests/minute budget. The default is 60; your organization may have a custom limit — contact support to adjust it.
Response Headers
Every response includes rate limit headers so you can track your usage:
- X-RateLimit-Limit: Maximum requests allowed per window.
- X-RateLimit-Remaining: Requests remaining in the current window.
- X-RateLimit-Reset: Unix timestamp when the window resets.
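As a minimal sketch of reading these headers on the client side, the helper below parses them into a small structure. It assumes the header values are plain integers and that X-RateLimit-Reset is expressed in seconds; `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names used for illustration, not part of the API.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    """Hypothetical container for the three rate limit headers."""
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_at: int    # X-RateLimit-Reset (assumed: Unix time in seconds)

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the window resets, never negative."""
        current = time.time() if now is None else now
        return max(0.0, float(self.reset_at) - current)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the parsed status, or None if a header is missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```

A caller could check `status.remaining` after each response and pause for `status.seconds_until_reset()` once it reaches zero.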
Exceeding the Limit
When you exceed the rate limit, the API returns 429 Too Many Requests with a Retry-After header that indicates how long to wait before retrying.
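This page does not state which format the Retry-After value uses. The HTTP specification allows either a number of seconds or an HTTP date, so the sketch below handles both. `retry_after_seconds` is a hypothetical helper name, and the code is an illustration under that assumption rather than the API's documented behavior.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value to a wait time in seconds.

    Returns None when the value is neither delay-seconds nor an HTTP date.
    """
    value = value.strip()
    if value.isdigit():
        # Form 1: a non-negative integer number of seconds.
        return float(value)
    try:
        # Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```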
Best Practices
- Respect Retry-After: wait the indicated time before retrying.
- Use exponential backoff: on repeated 429s, double your wait time between retries.
- Cache results: avoid redundant requests for the same query.
- Monitor your usage: check X-RateLimit-Remaining to proactively throttle before hitting limits.
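The first two practices can be combined in a small retry loop. The sketch below is one possible shape, not a prescribed client: `call_with_backoff`, the `send` callable, and the `(status, headers, body)` tuple are hypothetical, and Retry-After is assumed to carry a number of seconds. The wait doubles after each 429 and is never shorter than the server's Retry-After value.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts:
            return response
        # Cap the backoff delay, then honor Retry-After if it asks for longer.
        wait = min(delay, max_delay)
        retry_after = next(
            (v for k, v in headers.items() if k.lower() == "retry-after"), None
        )
        if retry_after is not None and retry_after.strip().isdigit():
            wait = max(wait, float(retry_after))
        sleep(wait)
        delay *= 2  # exponential backoff: double the wait on each repeated 429
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. Adding random jitter to `wait` is a common refinement when many clients retry at once.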