API Rate Limits
CloudNexus API endpoints enforce rate limits to ensure platform stability, fair resource allocation, and protection against abuse. This document outlines how rate limiting works, default thresholds by service, and best practices for handling throttling gracefully.
How Rate Limiting Works
Rate limits are evaluated per API key, per endpoint, using a sliding window algorithm. When you make a request, the response includes metadata headers indicating your current quota status:
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 997
X-RateLimit-Reset: 1625097600
- X-RateLimit-Limit: Maximum requests allowed per window.
- X-RateLimit-Remaining: Requests left in the current window.
- X-RateLimit-Reset: Unix timestamp when the window resets.
- Retry-After: Seconds to wait before retrying (only present on 429 responses).
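As an illustration of reading these headers on the client, here is a minimal Python sketch. The `RateLimitStatus` class and `parse_rate_limit_headers` function are hypothetical names chosen for this example, not part of any CloudNexus SDK, and the sketch assumes `Retry-After` carries an integer number of seconds, as described above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_epoch: int            # X-RateLimit-Reset (Unix timestamp)
    retry_after: Optional[int]  # Retry-After, only expected on 429 responses


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Extract quota metadata from response headers (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "997",
    "X-RateLimit-Reset": "1625097600",
})
print(status.remaining, status.retry_after)
```

Returning `None` for a missing `Retry-After` keeps "no wait required" distinct from an explicit value of zero.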
Default Rate Limits by Service
| Service / Endpoint | Window | Limit | Burst |
|---|---|---|---|
| Compute API (Servers/VMs) | 1 minute | 60 req | Up to 100 |
| Object Storage (S3-Compatible) | 1 minute | 1,000 req | Up to 2,500 |
| Kubernetes Control Plane | 1 minute | 120 req | None |
| Managed Databases | 1 minute | 50 req | Up to 75 |
| CDN Purge API | 5 minutes | 20 req | None |
| Authentication / Tokens | 15 minutes | 10 req | None |
| Billing & Invoices | 1 minute | 30 req | Up to 45 |
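To stay under these thresholds, a client can track its own request timestamps. The sketch below is a client-side sliding-window log in Python. It is an approximation, not the server's implementation, which this document does not specify beyond "sliding window". `SlidingWindowLimiter` and `try_acquire` are hypothetical names, and the burst column is ignored because its exact semantics are not defined here.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side sliding-window log (illustrative, not the server's algorithm)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self.timestamps = deque()   # send times of requests still in the window

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Discard requests that have slid out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


# Example using the CDN Purge API row: 20 requests per 5 minutes.
cdn_purge_limiter = SlidingWindowLimiter(limit=20, window_seconds=300)
if cdn_purge_limiter.try_acquire():
    pass  # safe to send the purge request here
```

Because limits are evaluated per API key and per endpoint, one limiter instance per endpoint (and per key) would mirror that scoping.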
Handling 429 Too Many Requests
When you exceed a rate limit, the API returns a 429 status code with a JSON body containing error details. Always implement exponential backoff and respect the Retry-After header.
{
  "error": {
    "code": 429,
    "message": "Rate limit exceeded. Please retry after 12 seconds.",
    "docs": "https://docs.cloudnexus.com/api/rate-limits"
  }
}
Repeatedly ignoring 429 responses may result in temporary IP bans or API key suspension. Always implement proper retry logic.
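One way to combine exponential backoff with the Retry-After header is sketched below in Python. The helper names (`backoff_delay`, `call_with_retries`), the `send` callable returning `(status, headers, body)`, and the base and cap values are all assumptions made for this example. The sketch uses "full jitter" and never waits less than the server-provided Retry-After.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to sleep before retrying after failed attempt number `attempt` (0-based)."""
    exponential = min(cap, base * (2 ** attempt))
    jittered = random.uniform(0, exponential)  # full jitter to avoid synchronised retries
    if retry_after is not None:
        # Never retry sooner than the server asked us to.
        return max(float(retry_after), jittered)
    return jittered


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body),
    where headers is a dict keyed by canonical header names.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break
        retry_after = headers.get("Retry-After")
        sleep(backoff_delay(attempt, int(retry_after) if retry_after else None))
    raise RuntimeError("rate limited: retries exhausted")
```

Capping the delay and the number of attempts keeps a persistently throttled client from retrying indefinitely.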
Best Practices
- Cache Responses: Use HTTP caching headers (ETag, Last-Modified) to reduce redundant calls.
- Batch Operations: Where available, use batch endpoints instead of individual requests.
- Webhooks over Polling: Use our webhook system for event-driven updates instead of frequent API polling.
- Monitor Headers: Log X-RateLimit-Remaining to proactively throttle your client before hitting limits.
- Use SDKs: Official CloudNexus SDKs include built-in rate limit handling and automatic retries.
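As a rough illustration of proactive throttling from the rate limit headers, the Python sketch below spreads the remaining quota evenly over the time left until the reset timestamp. The function name `pacing_delay` and the `reserve` parameter are assumptions for this example. Because the server uses a sliding window, treat the result as a conservative estimate, not an exact schedule.

```python
import time


def pacing_delay(remaining, reset_epoch, now=None, reserve=5):
    """Seconds to wait before the next request (hypothetical helper).

    remaining   -- value of X-RateLimit-Remaining
    reset_epoch -- value of X-RateLimit-Reset (Unix timestamp)
    reserve     -- requests held back as a safety margin (assumed default)
    """
    now = time.time() if now is None else now
    seconds_left = max(0.0, reset_epoch - now)
    usable = remaining - reserve
    if usable <= 0:
        # Budget effectively exhausted: wait for the window to reset.
        return seconds_left
    return seconds_left / usable


# Example: 105 requests left and 100 seconds until reset gives one request per second.
print(pacing_delay(remaining=105, reset_epoch=1100, now=1000))
```

Holding a few requests in reserve leaves room for urgent calls and for other clients sharing the same API key.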
Requesting Higher Limits
Rate limits increase automatically on higher-tier plans. You can also request custom thresholds for specific endpoints, and Enterprise customers can negotiate dedicated API quotas as part of their SLA.
To request an increase:
- Use the POST /api/v1/rate-limits/requests endpoint with your use case details.
- Or open a support ticket via the CloudNexus Console.
- Provide: API key prefix, expected QPS, peak traffic patterns, and business justification.
Limit adjustments typically take 24–48 hours to provision. You will receive an email confirmation once approved.
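A request to the POST /api/v1/rate-limits/requests endpoint might be assembled as in the Python sketch below. The request body schema is not documented in this section, so every JSON field name, the bearer-token authorization scheme, and the base URL are hypothetical placeholders. Check the endpoint reference for the actual contract before using anything like this.

```python
import json
import urllib.request


def build_limit_request(base_url, api_token, key_prefix, expected_qps,
                        peak_pattern, justification):
    """Build (but do not send) a limit-increase request. All field names are assumed."""
    payload = {
        "api_key_prefix": key_prefix,            # hypothetical field name
        "expected_qps": expected_qps,            # hypothetical field name
        "peak_traffic_pattern": peak_pattern,    # hypothetical field name
        "business_justification": justification, # hypothetical field name
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/rate-limits/requests",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",  # auth scheme is an assumption
        },
        method="POST",
    )


request = build_limit_request(
    base_url="https://api.example.com",  # placeholder host, not a real CloudNexus URL
    api_token="YOUR_TOKEN",
    key_prefix="cnx_live_ab12",
    expected_qps=25,
    peak_pattern="weekday batch jobs, 02:00-04:00 UTC",
    justification="nightly inventory sync",
)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```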
Consider deploying the CloudNexus Rate Limiter middleware in your infrastructure to handle retries, jitter, and circuit breaking automatically.