# Rate Limits

The API enforces rate limits to ensure fair usage and platform stability.

## Current Limits

| Resource | Limit | Window |
|---|---|---|
| API requests | 600 | per minute |
| Upload initiations | 100 | per minute |
| Analysis triggers | 50 | per minute |
## Rate Limit Headers

All API responses include rate limit information:

```http
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 598
X-RateLimit-Reset: 1704280560
```
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed per window |
| `X-RateLimit-Remaining` | Requests remaining in current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
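For example, a client can read these headers to decide how close it is to the limit. A minimal sketch, assuming the `requests` library (the URL and auth scheme below are illustrative, not confirmed by this page):

```python
import time

import requests

response = requests.get(
    "https://api.scribesight.com/v1/projects",  # illustrative endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)

remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
reset_at = int(response.headers.get("X-RateLimit-Reset", 0))
seconds_until_reset = max(0, reset_at - int(time.time()))
print(f"{remaining} requests left; window resets in {seconds_until_reset}s")
```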
## Rate Limit Exceeded Response

When you exceed the rate limit, you'll receive a `429 Too Many Requests` response:
```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Too many requests. Please retry after 60 seconds.",
    "details": {
      "retry_after": 60
    }
  },
  "meta": {
    "request_id": "req_xxxxxxxxxxxx"
  }
}
```
The response also includes a `Retry-After` header:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
```
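When handling a `429`, prefer the `Retry-After` header and fall back to the `retry_after` field in the error body. A minimal sketch, assuming a `requests`-style response object:

```python
import time

def wait_if_rate_limited(response):
    """Sleep for the server-suggested delay if the response is a 429."""
    if response.status_code != 429:
        return False
    header = response.headers.get("Retry-After")
    if header is not None:
        delay = int(header)
    else:
        # Fall back to the retry_after value in the error body
        delay = response.json()["error"]["details"].get("retry_after", 60)
    time.sleep(delay)
    return True
```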
## Best Practices
### 1. Monitor Rate Limit Headers

Check headers on every response and back off before hitting limits:

```python
import time

def check_rate_limit(response):
    remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
    if remaining < 5:
        # Nearly exhausted: wait until the window resets
        reset_time = int(response.headers.get("X-RateLimit-Reset", 0))
        wait_seconds = max(0, reset_time - time.time())
        time.sleep(wait_seconds)
    elif remaining < 10:
        # Getting close: start slowing down requests
        time.sleep(1)
```
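Calling `check_rate_limit` after every response lets a client throttle itself gradually instead of running into the limit and receiving `429`s.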
### 2. Implement Exponential Backoff

When rate limited, use exponential backoff with jitter:
```javascript
async function fetchWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) {
      return response;
    }
    // Honor Retry-After when present; otherwise back off exponentially with jitter
    const retryAfter =
      parseInt(response.headers.get("Retry-After")) ||
      Math.pow(2, attempt) + Math.random();
    await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
  }
  throw new Error("Max retries exceeded");
}
```
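The `parseInt(...) ||` fallback means the exponential delay is only used when the server doesn't send a `Retry-After` header. In production you may also want to cap the delay so worst-case waits stay bounded.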
### 3. Batch Operations

Instead of making many individual requests, use batch-friendly patterns:

```python
# ❌ Bad: One request per item
for content_id in content_ids:
    get_content(content_id)

# ✅ Good: Use the list endpoint with filtering
contents = list_content(project_id, status="completed", limit=100)
```
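If you still need per-item access afterwards, one pattern is to index the listed results locally. A sketch, assuming each returned item is a dict with an `id` field (not confirmed by this page):

```python
# Fetch once, then look items up in memory instead of one request per ID.
contents = list_content(project_id, status="completed", limit=100)
by_id = {item["id"]: item for item in contents}  # "id" key is an assumption

for content_id in content_ids:
    item = by_id.get(content_id)  # no additional API call
```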
### 4. Cache Responses

Cache frequently-accessed data to reduce API calls. A TTL-aware cache (here, the `cachetools` library) expires entries automatically:

```python
from cachetools import TTLCache, cached

# Cache up to 100 projects for 5 minutes
@cached(cache=TTLCache(maxsize=100, ttl=300))
def get_project(project_id):
    return api.get(f"/projects/{project_id}")
```
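With this in place, repeated `get_project` calls for the same ID within the five-minute window are served from memory and don't count against your request quota.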
### 5. Use Webhooks for Updates

Instead of polling, subscribe to webhooks for real-time updates:

```python
# ❌ Bad: Polling every 10 seconds
while True:
    content = get_content(content_id)
    if content["status"] == "completed":
        break
    time.sleep(10)

# ✅ Good: Receive a webhook when ready
# POST to your endpoint: {"event": "transcription.completed", ...}
```
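On the receiving side, a minimal webhook handler might look like the sketch below, assuming Flask; the endpoint path and any payload fields beyond `event` are illustrative:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/scribesight", methods=["POST"])  # path is your choice
def handle_webhook():
    payload = request.get_json(force=True)
    if payload.get("event") == "transcription.completed":
        # Trigger downstream processing here instead of polling
        ...
    return "", 204
```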
## Increasing Limits

If you need higher rate limits for your use case, please contact us at support@scribesight.com with:

- Your organization ID
- Expected request volume
- Use case description