Rate Limits

The API enforces rate limits to ensure fair usage and platform stability.

Current Limits

| Resource           | Limit | Window     |
|--------------------|-------|------------|
| API requests       | 600   | per minute |
| Upload initiations | 100   | per minute |
| Analysis triggers  | 50    | per minute |
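Rather than reacting only after a 429, clients can pace themselves to stay under the per-minute ceiling. A minimal sketch of a client-side token bucket (the `RateLimiter` class is illustrative, not part of the API; the default of 600 per 60 seconds matches the table above):

```python
import time

class RateLimiter:
    """Token-bucket limiter: allows `rate` calls per `per` seconds."""

    def __init__(self, rate=600, per=60.0):
        self.rate = rate           # bucket capacity (max burst)
        self.per = per             # refill window in seconds
        self.tokens = float(rate)  # start with a full bucket
        self.updated = time.monotonic()

    def acquire(self):
        """Block until a token is available, then consume it."""
        while True:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at the bucket size.
            self.tokens = min(
                self.rate,
                self.tokens + (now - self.updated) * self.rate / self.per,
            )
            self.updated = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep just long enough for one token to accrue.
            time.sleep((1 - self.tokens) * self.per / self.rate)
```

Call `limiter.acquire()` immediately before each API request; bursts up to the bucket size pass through untouched, and sustained traffic is smoothed to the configured rate.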

Rate Limit Headers

All API responses include rate limit information:

X-RateLimit-Limit: 600
X-RateLimit-Remaining: 598
X-RateLimit-Reset: 1704280560
| Header                | Description                           |
|-----------------------|---------------------------------------|
| X-RateLimit-Limit     | Maximum requests allowed per window   |
| X-RateLimit-Remaining | Requests remaining in current window  |
| X-RateLimit-Reset     | Unix timestamp when the window resets |

Rate Limit Exceeded Response

When you exceed the rate limit, you'll receive a 429 response:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Too many requests. Please retry after 60 seconds.",
    "details": {
      "retry_after": 60
    }
  },
  "meta": {
    "request_id": "req_xxxxxxxxxxxx"
  }
}

The response also includes a Retry-After header:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
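A client can derive its wait time from either signal: prefer the `Retry-After` header, fall back to the `retry_after` field in the JSON body. A minimal sketch (the `retry_delay` helper and its `default` fallback are illustrative, not part of any SDK):

```python
import json

def retry_delay(status_code, headers, body, default=60):
    """Return seconds to wait before retrying, or 0 if no wait is needed.

    Prefers the Retry-After header, then the JSON
    error.details.retry_after field, then `default`.
    """
    if status_code != 429:
        return 0
    if "Retry-After" in headers:
        return int(headers["Retry-After"])
    try:
        return int(json.loads(body)["error"]["details"]["retry_after"])
    except (ValueError, KeyError, TypeError):
        return default
```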

Best Practices

1. Monitor Rate Limit Headers

Check headers on every response and back off before hitting limits:

import time

def check_rate_limit(response):
    remaining = int(response.headers.get("X-RateLimit-Remaining", 0))

    if remaining < 10:
        # Start slowing down requests
        time.sleep(1)

    if remaining < 5:
        # Wait until the window resets
        reset_time = int(response.headers.get("X-RateLimit-Reset", 0))
        wait_seconds = max(0, reset_time - time.time())
        time.sleep(wait_seconds)

2. Implement Exponential Backoff

When rate limited, use exponential backoff with jitter:

async function fetchWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status !== 429) {
      return response;
    }

    // Prefer the server's Retry-After; otherwise back off exponentially with jitter.
    const retryAfter =
      parseInt(response.headers.get("Retry-After"), 10) ||
      Math.pow(2, attempt) + Math.random();

    await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
  }

  throw new Error("Max retries exceeded");
}

3. Batch Operations

Instead of making many individual requests, use batch-friendly patterns:

# ❌ Bad: One request per item
for content_id in content_ids:
    get_content(content_id)

# ✅ Good: Use the list endpoint with filtering
contents = list_content(project_id, status="completed", limit=100)

4. Cache Responses

Cache frequently-accessed data to reduce API calls:

# functools.lru_cache has no TTL support; the third-party cachetools
# library provides a time-aware equivalent.
from cachetools import TTLCache, cached

@cached(cache=TTLCache(maxsize=100, ttl=300))  # Cache up to 100 entries for 5 minutes
def get_project(project_id):
    return api.get(f"/projects/{project_id}")

5. Use Webhooks for Updates

Instead of polling, subscribe to webhooks for real-time updates:

# ❌ Bad: Polling every 10 seconds
while True:
    content = get_content(content_id)
    if content["status"] == "completed":
        break
    time.sleep(10)

# ✅ Good: Receive a webhook when the work is done
# POST to your endpoint: {"event": "transcription.completed", ...}
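A minimal receiver can dispatch on the `event` field of the webhook payload. A sketch (the `handle_webhook` helper and any payload fields beyond `event`, such as `content_id`, are illustrative assumptions, not a documented payload schema):

```python
def handle_webhook(payload):
    """Route an incoming webhook payload to a handler by event name."""
    handlers = {
        "transcription.completed": lambda p: f"transcription ready: {p.get('content_id')}",
    }
    handler = handlers.get(payload.get("event"))
    if handler is None:
        # Unknown events are acknowledged but ignored, so new event
        # types never break the receiver.
        return "ignored"
    return handler(payload)
```

Returning quickly (and doing heavy work asynchronously) keeps the webhook endpoint responsive, which most webhook senders require to consider a delivery successful.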

Increasing Limits

If you need higher rate limits for your use case, please contact us at support@scribesight.com with:

  • Your organization ID
  • Expected request volume
  • Use case description