# Rate Limits

The API enforces rate limits to ensure fair usage and platform stability.

## Current Limits

| Resource | Limit | Window |
|---|---|---|
| API requests | 600 | per minute |
| Upload initiations | 100 | per minute |
| Analysis triggers | 50 | per minute |
## Rate Limit Headers

All API responses include rate limit information:

```http
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 598
X-RateLimit-Reset: 1704280560
```
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed per window |
| `X-RateLimit-Remaining` | Requests remaining in current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
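For example, a client can read these headers to decide how close it is to the limit. A minimal sketch, assuming the `requests` library (the URL and auth scheme below are illustrative, not confirmed by this page):

```python
import time

import requests

response = requests.get(
    "https://api.scribesight.com/v1/projects",  # illustrative endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)

remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
reset_at = int(response.headers.get("X-RateLimit-Reset", 0))
seconds_until_reset = max(0, reset_at - int(time.time()))
print(f"{remaining} requests left; window resets in {seconds_until_reset}s")
```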
## Rate Limit Exceeded Response

When you exceed the rate limit, you'll receive a `429 Too Many Requests` response:
```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Too many requests. Please retry after 60 seconds.",
    "details": {
      "retry_after": 60
    }
  },
  "meta": {
    "request_id": "req_xxxxxxxxxxxx"
  }
}
```
The response also includes a `Retry-After` header:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
```
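When handling a `429`, prefer the `Retry-After` header and fall back to the `retry_after` field in the error body. A minimal sketch, assuming a `requests`-style response object:

```python
import time

def wait_if_rate_limited(response):
    """Sleep for the server-suggested delay if the response is a 429."""
    if response.status_code != 429:
        return False
    header = response.headers.get("Retry-After")
    if header is not None:
        delay = int(header)
    else:
        # Fall back to the retry_after value in the error body
        delay = response.json()["error"]["details"].get("retry_after", 60)
    time.sleep(delay)
    return True
```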
## Best Practices
### 1. Monitor Rate Limit Headers

Check headers on every response and back off before hitting limits:

```python
import time

def check_rate_limit(response):
    remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
    if remaining < 5:
        # Nearly exhausted: wait until the window resets
        reset_time = int(response.headers.get("X-RateLimit-Reset", 0))
        wait_seconds = max(0, reset_time - time.time())
        time.sleep(wait_seconds)
    elif remaining < 10:
        # Getting close: start slowing down requests
        time.sleep(1)
```
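Calling `check_rate_limit` after every response lets a client throttle itself gradually instead of running into the limit and receiving `429`s.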
### 2. Implement Exponential Backoff

When rate limited, use exponential backoff with jitter:
```javascript
async function fetchWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) {
      return response;
    }
    // Honor Retry-After when present; otherwise back off exponentially with jitter
    const retryAfter =
      parseInt(response.headers.get("Retry-After")) ||
      Math.pow(2, attempt) + Math.random();
    await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
  }
  throw new Error("Max retries exceeded");
}
```
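The `parseInt(...) ||` fallback means the exponential delay is only used when the server doesn't send a `Retry-After` header. In production you may also want to cap the delay so worst-case waits stay bounded.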
### 3. Batch Operations

Instead of making many individual requests, use batch-friendly patterns:

```python
# ❌ Bad: One request per item
for content_id in content_ids:
    get_content(content_id)

# ✅ Good: Use the list endpoint with filtering
contents = list_content(project_id, status="completed", limit=100)
```
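If you still need per-item access afterwards, one pattern is to index the listed results locally. A sketch, assuming each returned item is a dict with an `id` field (not confirmed by this page):

```python
# Fetch once, then look items up in memory instead of one request per ID.
contents = list_content(project_id, status="completed", limit=100)
by_id = {item["id"]: item for item in contents}  # "id" key is an assumption

for content_id in content_ids:
    item = by_id.get(content_id)  # no additional API call
```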
### 4. Cache Responses

Cache frequently-accessed data to reduce API calls. A TTL-aware cache (here, the `cachetools` library) expires entries automatically:

```python
from cachetools import TTLCache, cached

# Cache up to 100 projects for 5 minutes
@cached(cache=TTLCache(maxsize=100, ttl=300))
def get_project(project_id):
    return api.get(f"/projects/{project_id}")
```
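With this in place, repeated `get_project` calls for the same ID within the five-minute window are served from memory and don't count against your request quota.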
### 5. Use Webhooks for Updates

Instead of polling, subscribe to webhooks for real-time updates:

```python
# ❌ Bad: Polling every 10 seconds
while True:
    content = get_content(content_id)
    if content["status"] == "completed":
        break
    time.sleep(10)

# ✅ Good: Receive a webhook when ready
# POST to your endpoint: {"event": "transcription.completed", ...}
```
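On the receiving side, a minimal webhook handler might look like the sketch below, assuming Flask; the endpoint path and any payload fields beyond `event` are illustrative:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/scribesight", methods=["POST"])  # path is your choice
def handle_webhook():
    payload = request.get_json(force=True)
    if payload.get("event") == "transcription.completed":
        # Trigger downstream processing here instead of polling
        ...
    return "", 204
```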
## Increasing Limits

If you need higher rate limits for your use case, please contact us at support@scribesight.com with:

- Your organization ID
- Expected request volume
- Use case description