API Overview v2.4.0
The Sitemap.xml API provides programmatic access to sitemap generation, indexing submission, crawl status monitoring, and analytics. All requests are stateless and idempotent where applicable.
Base URL: https://api.sitemap.xml/v2
Authentication
All API requests require authentication via Bearer tokens or API keys. Tokens must be included in the Authorization header, as in the following example request headers:
Authorization: Bearer sm_live_pk_8f7a3b9c2d1e0f5g6h4i3j2k1l0m9n8o
X-Request-Id: req_123456789
Content-Type: application/json
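As a rough illustration, the headers above could be assembled by a small helper like the one below. The function name `buildHeaders` is hypothetical and not part of any official SDK, and treating `X-Request-Id` as optional is an assumption, since this page does not say whether it is required.

```javascript
// Hypothetical helper (not part of an official SDK): builds the headers shown above.
function buildHeaders(apiKey, requestId) {
  if (typeof apiKey !== 'string' || apiKey.length === 0) {
    throw new Error('apiKey is required');
  }
  const headers = {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  };
  // Assumption: X-Request-Id is optional, so it is added only when provided.
  if (requestId) {
    headers['X-Request-Id'] = requestId;
  }
  return headers;
}

// Example with placeholder values.
const headers = buildHeaders('sm_live_pk_example', 'req_123456789');
console.log(headers);
```

The resulting object can be passed as the `headers` option of `fetch` or a similar HTTP client.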
Core Endpoints & Rate Limits
| Method | Endpoint | Description | Rate Limit |
|---|---|---|---|
| POST | /index | Submit URL for immediate indexing | 1,000/min |
| GET | /status/{id} | Check indexing/crawling status | 5,000/min |
| PUT | /sitemaps/{id} | Update sitemap configuration | 500/min |
| GET | /analytics/overview | Retrieve crawl & indexing metrics | 200/min |
| POST | | Batch submit up to 10,000 URLs | 100/min |
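Because a batch request accepts at most 10,000 URLs, larger URL lists have to be split before submission. The sketch below shows one way to do that on the client side; `toBatches` and `MAX_BATCH_SIZE` are illustrative names, and the actual submission call is left out because the batch endpoint path is not given here.

```javascript
// Maximum URLs per batch request, taken from the table above.
const MAX_BATCH_SIZE = 10000;

// Splits a list of URLs into consecutive batches of at most `size` entries.
function toBatches(urls, size = MAX_BATCH_SIZE) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

// 25,000 placeholder URLs are split into batches of 10,000, 10,000 and 5,000.
const urls = Array.from({ length: 25000 }, (_, i) => `https://example.com/page-${i}`);
console.log(toBatches(urls).map((batch) => batch.length)); // [ 10000, 10000, 5000 ]
```

At 100 batch requests per minute, this caps throughput at 1,000,000 URLs per minute per client.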
Official SDKs
Native client libraries for rapid integration. All SDKs support automatic retries, exponential backoff, and connection pooling.
import { SitemapClient } from '@sitemap/sdk';

const client = new SitemapClient({
  apiKey: process.env.SITEMAP_API_KEY,
  timeout: 5000,
  retries: 3
});

const result = await client.index.submit({
  url: 'https://example.com/new-post',
  priority: 0.8,
  notify: ['google', 'bing']
});

console.log(result.statusId);
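The SDKs' retry schedule is not documented on this page. As a point of reference, a common form of exponential backoff doubles the delay on each attempt, caps it, and optionally randomizes it ("full jitter"). The sketch below illustrates that general pattern; `backoffDelay`, `withRetries`, and their default values are assumptions for illustration, not the SDK's internals.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
// With jitter enabled, a random delay in [0, cap) is returned ("full jitter").
function backoffDelay(attempt, { baseMs = 200, maxMs = 5000, jitter = true, random = Math.random } = {}) {
  const capped = Math.min(maxMs, baseMs * 2 ** attempt);
  return jitter ? Math.floor(random() * capped) : capped;
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `operation`, retrying up to `retries` times after the first failure.
// A real client would typically retry only on retryable errors (e.g. HTTP 429 or 5xx).
async function withRetries(operation, { retries = 3, sleep = defaultSleep, ...backoff } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await sleep(backoffDelay(attempt, backoff));
      }
    }
  }
  throw lastError;
}

// Schedule without jitter for the first three retries.
console.log([0, 1, 2].map((n) => backoffDelay(n, { jitter: false }))); // [ 200, 400, 800 ]
```

Jitter spreads retries from many clients over time, which helps avoid synchronized bursts against the rate limits listed earlier.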
System Architecture
Built on a distributed, event-driven architecture optimized for high-throughput sitemap generation and real-time search engine indexing pipelines.
Edge Layer:
└─ Cloudflare Workers + Fastly CDN (Global Anycast)
└─ TLS 1.3 Termination + DDoS Mitigation
API Layer:
└─ Go (Gin) Microservices @ 4096 MHz
└─ gRPC Internal Routing + JSON REST Externally
└─ Rate Limiting: Token Bucket + Sliding Window
Processing:
└─ Distributed Crawlers (Headless Chromium + Custom)
└─ XML Generator: Streaming SAX Parser (Low Memory)
└─ Queue: Apache Kafka (12-Cluster Topology)
Storage:
└─ Primary: PostgreSQL 16 (Read/Write Replicas)
└─ Cache: Redis Cluster (6 Nodes, AOF + RDB)
└─ Object: S3-Compatible (Sitemap Binary Storage)
Observability:
└─ OpenTelemetry + Prometheus + Grafana
└─ Distributed Tracing (Jaeger)
└─ Audit Logs: Immutable WORM Storage
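The API layer lists token bucket rate limiting. To make the idea concrete, here is a minimal in-memory sketch of a token bucket; it is a generic illustration of the algorithm, not the service's implementation, and the class and parameter names are hypothetical.

```javascript
// Token bucket: holds up to `capacity` tokens, refilled continuously at `refillPerSec`.
// Each request consumes tokens; a request is rejected when too few tokens remain.
class TokenBucket {
  constructor(capacity, refillPerSec, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;         // injectable clock in milliseconds, useful for tests
    this.tokens = capacity; // the bucket starts full
    this.last = now();
  }

  tryRemove(count = 1) {
    const t = this.now();
    const elapsedSec = (t - this.last) / 1000;
    this.last = t;
    // Add tokens for the elapsed time, never exceeding capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    if (this.tokens < count) return false;
    this.tokens -= count;
    return true;
  }
}

// Example: a 1,000 requests/min limit corresponds to about 16.67 tokens per second.
const bucket = new TokenBucket(1000, 1000 / 60);
console.log(bucket.tryRemove()); // true
```

The capacity sets the largest burst that is allowed, while the refill rate sets the sustained request rate.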
Data Formats & Standards
All outputs adhere to the relevant published standards (sitemaps.org, Schema.org, W3C, IETF) for maximum search engine compatibility.
| Format | Standard | Compression | Max Size |
|---|---|---|---|
| XML Sitemap | Sitemaps protocol 0.9 (sitemaps.org) | GZIP / Brotli | 50 MB uncompressed / 50,000 URLs |
| Sitemap Index | Sitemaps protocol 0.9 (sitemaps.org) | GZIP | Unlimited (Indexed) |
| JSON-LD | Schema.org | None | Per-page injection |
| RSS 2.0 | RSS 2.0 specification (Atom compat: RFC 4287) | GZIP | 10 MB / 1,000 items |
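The 50,000-URL limit means a large site needs several sitemap files. The following sketch shows how a URL list might be split into sitemap documents that respect that limit. It is an illustration only: `buildSitemaps` and `escapeXml` are hypothetical names, only the `<loc>` element is emitted, and the 50 MB size limit is not checked.

```javascript
const MAX_URLS_PER_SITEMAP = 50000; // limit from the table above

// Escapes the five characters that must be entity-encoded in XML text values.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Lazily yields one sitemap document per group of at most `limit` URLs.
function* buildSitemaps(urls, limit = MAX_URLS_PER_SITEMAP) {
  for (let start = 0; start < urls.length; start += limit) {
    const entries = urls
      .slice(start, start + limit)
      .map((loc) => `  <url><loc>${escapeXml(loc)}</loc></url>`);
    yield [
      '<?xml version="1.0" encoding="UTF-8"?>',
      '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
      ...entries,
      '</urlset>',
    ].join('\n');
  }
}

// Example with a single URL that contains a character needing escaping.
const [first] = buildSitemaps(['https://example.com/?a=1&b=2']);
console.log(first);
```

Using a generator keeps only one document in memory at a time, in the spirit of the low-memory streaming approach described under System Architecture.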
Performance Metrics
Measured across global edge locations under standard production load. Benchmarks reflect p95 and p99 latencies.
Auto-scaling policies trigger at 75% CPU utilization across API nodes. Database read replicas dynamically provision during peak crawl windows. CDN edge caching reduces origin hits by ~94% for static sitemap assets.
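For readers unfamiliar with the p95 and p99 figures mentioned above, the sketch below computes percentiles with the nearest-rank method. The method choice and the sample values are assumptions for illustration; this page does not state how the benchmarks were calculated, and the numbers are not measurements.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError('samples must not be empty');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Made-up latencies in milliseconds, with one slow outlier.
const latenciesMs = [12, 15, 11, 240, 14, 13, 18, 16, 17, 19];
console.log(percentile(latenciesMs, 50)); // 15
console.log(percentile(latenciesMs, 95)); // 240
```

High percentiles expose tail latency that an average would hide, which is why a single slow request dominates p95 in this small sample.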
Security & Compliance
Enterprise-grade security architecture with zero-trust principles, end-to-end encryption, and comprehensive audit trails.
| Feature | Specification |
|---|---|
| Transport Encryption | TLS 1.3 enforced (1.2 fallback disabled) |
| Data at Rest | AES-256-GCM with automatic key rotation (90d) |
| Authentication | OAuth 2.0 + PKCE, API Keys (Scoped), JWT (Short-lived) |
| Access Control | RBAC with attribute-based policies (ABAC) |
| Audit Logging | Immutable logs, 7-year retention, SIEM export |
| Compliance | SOC 2 Type II, GDPR, CCPA, ISO 27001 |
| DDoS Protection | Cloudflare Enterprise + Custom WAF Rules |
Webhook Specifications
Real-time event notifications delivered via HTTP POST. All payloads are signed using HMAC-SHA256 for integrity verification.
{
  "id": "evt_9f8a7b6c5d4e3f2g1h0i",
  "type": "indexing.completed",
  "timestamp": "2025-01-15T14:32:01Z",
  "data": {
    "url": "https://example.com/page",
    "status": "indexed",
    "engine": "google",
    "latency_ms": 84
  }
}
// Verification Header: X-Sitemap-Signature: sha256=...
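A receiver should check the signature before trusting a payload. The sketch below shows one plausible verification routine for Node.js. Several details are assumptions not confirmed on this page: that the signature covers the raw request body, that it is hex-encoded after the `sha256=` prefix, and that a shared webhook secret is used as the HMAC key. The function name `verifySignature` is hypothetical.

```javascript
// Assumptions (not confirmed by this page): the signature is the hex-encoded
// HMAC-SHA256 of the raw request body, keyed with a shared webhook secret,
// and is sent as "sha256=<hex>" in the X-Sitemap-Signature header.
async function verifySignature(rawBody, headerValue, secret) {
  // Dynamic import keeps the sketch usable from both ES modules and CommonJS.
  const { createHmac, timingSafeEqual } = await import('node:crypto');

  const prefix = 'sha256=';
  if (typeof headerValue !== 'string' || !headerValue.startsWith(prefix)) {
    return false;
  }

  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(headerValue.slice(prefix.length), 'hex');

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The comparison uses `timingSafeEqual` rather than `===` so that the check does not leak how many leading bytes matched. Verification must run against the exact bytes received, before any JSON parsing or re-serialization.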