API Rate Limits

CloudNexus API endpoints enforce rate limits to ensure platform stability, fair resource allocation, and protection against abuse. This document outlines how rate limiting works, default thresholds by service, and best practices for handling throttling gracefully.

How Rate Limiting Works

Rate limits are evaluated per API key, per endpoint, using a sliding window algorithm. When you make a request, the response includes metadata headers indicating your current quota status:

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 997
X-RateLimit-Reset: 1625097600

X-RateLimit-Limit: Maximum requests allowed per window.
X-RateLimit-Remaining: Requests left in the current window.
X-RateLimit-Reset: Unix timestamp when the window resets.
Retry-After: Seconds to wait before retrying (only present on 429 responses).
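
As a rough illustration of reading these headers on the client side, the sketch below parses them into a small structure. The names RateLimitStatus and parse_rate_limit_headers are hypothetical helpers, not part of any CloudNexus SDK, and the code assumes Retry-After carries whole seconds as described above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset (Unix timestamp, seconds)
    retry_after: Optional[int]  # Retry-After, None when the header is absent


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Build a RateLimitStatus from response headers (names matched case-insensitively)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


# Header values taken from the example response above.
status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "997",
    "X-RateLimit-Reset": "1625097600",
})
print(status.remaining, status.retry_after)
```

Keeping Retry-After optional reflects that it is only expected on 429 responses.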

Note: Rate limits are applied per authenticated API key. Anonymous or unauthenticated requests are subject to stricter global limits and may be blocked without warning.
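
To make the sliding window idea concrete, here is a minimal sketch of one common variant, a sliding-window log. It is an illustration only: this document does not describe the server-side implementation, and the class name SlidingWindowLimiter is hypothetical. In practice one limiter instance would be kept per API key and endpoint pair.

```python
from collections import deque
from typing import Deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have slid out of the trailing window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        # Accept the request only if the window still has room.
        if len(self._timestamps) < self.limit:
            self._timestamps.append(now)
            return True
        return False


# Three requests per 60 seconds: the fourth is rejected, and a request at
# t=61 is accepted again because the one at t=0 has left the window.
limiter = SlidingWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow(t) for t in (0, 10, 20, 30, 61)]
print(results)  # [True, True, True, False, True]
```

Unlike a fixed window, capacity frees up gradually as individual requests age out rather than all at once at a window boundary.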

Default Rate Limits by Service

Service / Endpoint	Window	Limit	Burst
Compute API (Servers/VMs)	1 minute	60 req	Up to 100 req
Object Storage (S3-Compatible)	1 minute	1,000 req	Up to 2,500 req
Kubernetes Control Plane	1 minute	120 req	None
Managed Databases	1 minute	50 req	Up to 75 req
CDN Purge API	5 minutes	20 req	None
Authentication / Tokens	15 minutes	10 req	None
Billing & Invoices	1 minute	30 req	Up to 45 req
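
As a worked example of what these numbers mean for a client, the sketch below converts each default limit into the average spacing between requests that stays within the sustained limit. The dictionary and function names are illustrative, and burst allowances are ignored.

```python
# Default limits copied from the table above:
# service -> (window in seconds, requests allowed per window).
DEFAULT_LIMITS = {
    "Compute API": (60, 60),
    "Object Storage": (60, 1000),
    "Kubernetes Control Plane": (60, 120),
    "Managed Databases": (60, 50),
    "CDN Purge API": (300, 20),
    "Authentication / Tokens": (900, 10),
    "Billing & Invoices": (60, 30),
}


def min_interval_seconds(service: str) -> float:
    """Average spacing between requests that stays within the sustained limit."""
    window_seconds, limit = DEFAULT_LIMITS[service]
    return window_seconds / limit


for service in DEFAULT_LIMITS:
    print(f"{service}: one request every {min_interval_seconds(service):.2f} s")
```

For instance, the CDN Purge API works out to one request every 15 seconds on average, and the token endpoint to one every 90 seconds.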

Handling 429 Too Many Requests

When you exceed a rate limit, the API returns a 429 status code with a JSON body containing error details. Retry with exponential backoff, and never retry sooner than the number of seconds given in the Retry-After header.

{
  "error": {
    "code": 429,
    "message": "Rate limit exceeded. Please retry after 12 seconds.",
    "docs": "https://docs.cloudnexus.com/api/rate-limits"
  }
}

Important: Continuing to send requests after receiving a 429 may result in a temporary IP ban or API key suspension. Wait at least the Retry-After interval before each retry.
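
One possible shape for such retry logic is sketched below: exponential backoff with full jitter, never waiting less than Retry-After. The functions backoff_delay and call_with_retries are hypothetical names, and send stands in for whatever HTTP call your client makes. It is assumed to return a (status, headers, body) tuple whose headers mapping uses the exact name Retry-After.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status code, headers, body)


def backoff_delay(attempt: int, retry_after: float = 0.0,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based).

    Exponential backoff with full jitter, never shorter than Retry-After.
    """
    exponential = min(cap, base * (2 ** attempt))
    jittered = random.uniform(0, exponential)
    return max(retry_after, jittered)


def call_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 response or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        retry_after = float(headers.get("Retry-After", 0))
        sleep(backoff_delay(attempt, retry_after))
        response = send()
    return response


# Usage with a fake transport: the first call is throttled, the second succeeds.
responses = iter([(429, {"Retry-After": "12"}, ""), (200, {}, "ok")])
waits = []
result = call_with_retries(lambda: next(responses), sleep=waits.append)
print(result[0], waits)  # 200 [12.0]
```

Jitter spreads retries from many clients over time so they do not all return at the same instant, while the max() keeps the server's Retry-After as a floor.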

Best Practices

Cache Responses: Use HTTP caching headers (ETag, Last-Modified) to reduce redundant calls.
Batch Operations: Where available, use batch endpoints instead of individual requests.
Webhooks over Polling: Use our webhook system for event-driven updates instead of frequent API polling.
Monitor Headers: Log X-RateLimit-Remaining to proactively throttle your client before hitting limits.
Use SDKs: Official CloudNexus SDKs include built-in rate limit handling and automatic retries.
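
The header monitoring practice above can be turned into a simple pacing rule. The following is a sketch under assumptions: pacing_delay and the low_water_mark threshold are illustrative choices, not CloudNexus recommendations.

```python
def pacing_delay(remaining: int, reset_at: int, now: float,
                 low_water_mark: int = 10) -> float:
    """Seconds to pause before the next request, based on the latest headers.

    remaining: value of X-RateLimit-Remaining
    reset_at:  value of X-RateLimit-Reset (Unix timestamp)
    now:       current Unix time, e.g. time.time()
    """
    seconds_to_reset = max(0.0, reset_at - now)
    if remaining <= 0:
        # Quota exhausted: wait for the window to reset.
        return seconds_to_reset
    if remaining > low_water_mark:
        # Plenty of quota left: no need to slow down.
        return 0.0
    # Running low: spread the remaining requests evenly over the time left.
    return seconds_to_reset / remaining


# 30 seconds before the reset time with 5 requests left: pause 6 s between calls.
print(pacing_delay(remaining=5, reset_at=1625097600, now=1625097570))  # 6.0
```

Slowing down before the quota reaches zero avoids 429 responses altogether instead of recovering from them afterwards.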

Requesting Higher Limits

Rate limits are adjusted automatically for higher-tier plans. You can also request custom thresholds for specific endpoints, and Enterprise customers can negotiate dedicated API quotas as part of their SLA.

To request an increase:

Use the POST /api/v1/rate-limits/requests endpoint with your use case details, or open a support ticket via the CloudNexus Console.
In either case, provide your API key prefix, expected QPS, peak traffic patterns, and business justification.
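
A request body for the endpoint above might be assembled roughly as follows. This is only a sketch: the request schema is not given in this document, so every JSON field name and the helper build_limit_increase_request are hypothetical and should be checked against the API reference before use.

```python
import json


def build_limit_increase_request(api_key_prefix: str, endpoint: str,
                                 expected_qps: float, peak_traffic: str,
                                 justification: str) -> str:
    """Serialize a limit-increase request body.

    Every field name here is hypothetical; the real schema may differ.
    """
    if expected_qps <= 0:
        raise ValueError("expected_qps must be positive")
    payload = {
        "api_key_prefix": api_key_prefix,          # hypothetical field
        "endpoint": endpoint,                      # hypothetical field
        "expected_qps": expected_qps,              # hypothetical field
        "peak_traffic_patterns": peak_traffic,     # hypothetical field
        "business_justification": justification,   # hypothetical field
    }
    return json.dumps(payload, indent=2)


# Placeholder values for illustration only.
body = build_limit_increase_request(
    api_key_prefix="example-prefix",
    endpoint="CDN Purge API",
    expected_qps=2.0,
    peak_traffic="Bursts after each deployment",
    justification="Automated cache invalidation for frequent releases",
)
print(body)
```

Validating the inputs before sending keeps an incomplete request from delaying the review.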

Limit adjustments typically take 24–48 hours to provision. You will receive an email confirmation once approved.