API Reference

The API provides programmatic access to crawl management, rule generation, and bot analytics. It is RESTful, JSON-based, and rate-limited to 1,000 requests per minute.

BASE_URL: https://api.robots.txt/v1

All endpoints return JSON responses. Errors follow RFC 7807 Problem Details. Use the Authorization header for all requests.

Authentication

Robots.txt uses API keys for authentication. Generate keys in your Dashboard under Settings → API Access.

Header

Authorization: Bearer <YOUR_API_KEY>

Never expose your secret key in client-side code or public repositories. Use environment variables.
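
As a minimal sketch of the advice above, the key can be read from an environment variable and attached to each request. The variable name `ROBOTS_API_KEY` and the helper `authHeaders` are illustrative assumptions, not part of the API.

```javascript
// Build the headers sent with every request.
// ROBOTS_API_KEY is an assumed variable name; use whatever your deployment defines.
function authHeaders(apiKey = process.env.ROBOTS_API_KEY) {
  if (!apiKey) {
    throw new Error('API key is not set');
  }
  return {
    Authorization: `Bearer ${apiKey}`,
    Accept: 'application/json',
  };
}
```

The returned object can be passed as the `headers` option of `fetch`. Failing fast on a missing key avoids sending a request that would only come back as 401.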

Crawl Rules

GET /rules

Retrieve a paginated list of all active crawl rules for the authenticated domain.

Parameter	Required	Type	Description
`domain`	Yes	string	Target domain to query rules for
`user_agent`	No	string	Filter by specific bot (e.g., `Googlebot`)
`limit`	No	integer	Max results per page (default: 20, max: 100)
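
The parameters above map onto a query string. The sketch below builds the request URL and checks `limit` locally before any request is sent. The function name `buildRulesUrl` and the lower bound of 1 for `limit` are assumptions made for illustration.

```javascript
const BASE_URL = 'https://api.robots.txt/v1';

// Build the GET /rules URL from the documented query parameters.
function buildRulesUrl({ domain, userAgent, limit = 20 }) {
  if (!domain) {
    throw new Error('domain is required');
  }
  // The documented maximum is 100; the minimum of 1 is an assumption.
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError('limit must be an integer between 1 and 100');
  }
  const url = new URL(`${BASE_URL}/rules`);
  url.searchParams.set('domain', domain);
  if (userAgent) {
    url.searchParams.set('user_agent', userAgent);
  }
  url.searchParams.set('limit', String(limit));
  return url.toString();
}
```

Calling `buildRulesUrl({ domain: 'example.com', userAgent: 'Googlebot', limit: 50 })` yields a URL ending in `/rules?domain=example.com&user_agent=Googlebot&limit=50`.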

Response 200 OK

JSON

{
  "status": "success",
  "data": {
    "total": 14,
    "rules": [
      {
        "id": "rl_8f3a2b1c",
        "user_agent": "*",
        "directive": "Disallow",
        "path": "/api/v2/",
        "priority": 1,
        "created_at": "2024-05-12T09:30:00Z"
      }
    ]
  }
}
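
To illustrate how this response shape might be consumed, the sketch below collects the `Disallow` paths that apply to one bot. The helper name `disallowedPaths` is hypothetical, and matching `user_agent` by exact string is a simplification.

```javascript
// Collect the Disallow paths that apply to a given bot from a GET /rules response.
// A rule applies when its user_agent is "*" or equals the bot name exactly.
function disallowedPaths(response, botName) {
  if (response.status !== 'success') {
    throw new Error(`unexpected status: ${response.status}`);
  }
  return response.data.rules
    .filter((rule) => rule.directive === 'Disallow')
    .filter((rule) => rule.user_agent === '*' || rule.user_agent === botName)
    .map((rule) => rule.path);
}

// The sample response shown above, as a JavaScript object.
const sample = {
  status: 'success',
  data: {
    total: 14,
    rules: [
      {
        id: 'rl_8f3a2b1c',
        user_agent: '*',
        directive: 'Disallow',
        path: '/api/v2/',
        priority: 1,
        created_at: '2024-05-12T09:30:00Z',
      },
    ],
  },
};

console.log(disallowedPaths(sample, 'Googlebot')); // prints [ '/api/v2/' ]
```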

POST /rules

Create a new crawl directive. The engine automatically validates it against existing rules and merges it into the rule set.

Request Body

JSON

{
  "domain": "example.com",
  "user_agent": "Bingbot",
  "directive": "Allow",
  "path": "/public/blog/",
  "crawl_delay": 2
}
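
A request like the one above could be assembled as follows. This is a sketch under assumptions: the helper name `createRuleRequest` is hypothetical, the set of required fields is inferred from the example body, and `Allow` and `Disallow` are simply the two directive values that appear on this page.

```javascript
const BASE_URL = 'https://api.robots.txt/v1';
// Only these two values appear in this reference; the API may accept others.
const DIRECTIVES = ['Allow', 'Disallow'];

// Validate a rule locally and return the URL and fetch options for POST /rules.
function createRuleRequest(apiKey, rule) {
  for (const field of ['domain', 'user_agent', 'directive', 'path']) {
    if (typeof rule[field] !== 'string' || rule[field] === '') {
      throw new TypeError(`${field} must be a non-empty string`);
    }
  }
  if (!DIRECTIVES.includes(rule.directive)) {
    throw new RangeError(`directive must be one of: ${DIRECTIVES.join(', ')}`);
  }
  if (!rule.path.startsWith('/')) {
    throw new RangeError('path must start with "/"');
  }
  return {
    url: `${BASE_URL}/rules`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(rule),
    },
  };
}
```

The result can be handed to `fetch(url, options)`. Checking the body locally catches the most common causes of a 400 response before the request leaves the client.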

Crawlers & Bots

GET /crawlers

List all detected crawlers interacting with your domain in the last 30 days.

cURL

curl -X GET https://api.robots.txt/v1/crawlers \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json"

Sitemaps

POST /sitemaps/generate

Trigger an asynchronous sitemap regeneration. Returns a job ID for tracking.

Response 202 Accepted

JSON

{
  "status": "processing",
  "job_id": "job_9x2m4k1p",
  "eta_seconds": 45,
  "message": "Sitemap generation queued"
}
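
Because generation is asynchronous, a client has to wait before the result is available. The sketch below uses `eta_seconds` to time the first check. Note that this page does not document a job-status endpoint, so `checkStatus` is a hypothetical caller-supplied function, and the helper names, intervals, and retry count are assumptions.

```javascript
// Decide how long to wait before first checking on a job,
// using eta_seconds from the 202 response plus a small safety margin.
function firstCheckDelayMs(job, marginMs = 1000) {
  if (job.status !== 'processing') {
    return 0;
  }
  return Math.max(0, job.eta_seconds * 1000) + marginMs;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wait, then call checkStatus(jobId) until it stops reporting "processing".
// checkStatus is hypothetical: supply your own way of looking up the job.
async function waitForJob(job, checkStatus, { intervalMs = 5000, maxChecks = 10 } = {}) {
  await sleep(firstCheckDelayMs(job));
  for (let i = 0; i < maxChecks; i += 1) {
    const result = await checkStatus(job.job_id);
    if (result.status !== 'processing') {
      return result;
    }
    await sleep(intervalMs);
  }
  throw new Error(`job ${job.job_id} still processing after ${maxChecks} checks`);
}
```

For the sample response above, the first check would happen after 46 seconds. Where polling is undesirable, the sitemap completion webhook event is the alternative.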

Analytics

GET /analytics/crawl

Retrieve crawl activity metrics, including request volumes, blocked paths, and indexing status.

Parameter	Required	Type	Description
`start_date`	Yes	string	ISO 8601 start timestamp
`end_date`	Yes	string	ISO 8601 end timestamp
`granularity`	No	string	`hour`, `day`, or `week` (default: `day`)
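
The sketch below shows one way to produce the ISO 8601 timestamps and validate `granularity` before calling the endpoint. The function name `buildCrawlAnalyticsUrl` is hypothetical, and rejecting a start that is not before the end is an assumption about what the API expects.

```javascript
const BASE_URL = 'https://api.robots.txt/v1';
const GRANULARITIES = ['hour', 'day', 'week'];

// Build the GET /analytics/crawl URL from two Date objects.
function buildCrawlAnalyticsUrl({ start, end, granularity = 'day' }) {
  for (const value of [start, end]) {
    if (!(value instanceof Date) || Number.isNaN(value.getTime())) {
      throw new TypeError('start and end must be valid Date objects');
    }
  }
  if (start >= end) {
    throw new RangeError('start must be earlier than end');
  }
  if (!GRANULARITIES.includes(granularity)) {
    throw new RangeError(`granularity must be one of: ${GRANULARITIES.join(', ')}`);
  }
  const url = new URL(`${BASE_URL}/analytics/crawl`);
  // Date.prototype.toISOString() always emits UTC in ISO 8601 format.
  url.searchParams.set('start_date', start.toISOString());
  url.searchParams.set('end_date', end.toISOString());
  url.searchParams.set('granularity', granularity);
  return url.toString();
}
```

Passing `Date` objects rather than strings keeps timezone handling in one place, since the timestamps are always serialized in UTC.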

Error Codes

The API uses standard HTTP status codes to indicate success or failure.

Code	Meaning	Description
400	Bad Request	Invalid parameters or malformed JSON body
401	Unauthorized	Missing or invalid API key
403	Forbidden	Insufficient permissions for the requested resource
404	Not Found	Domain or resource does not exist in your workspace
429	Too Many Requests	Rate limit exceeded. Retry after the time given in the `X-RateLimit-Reset` header
500	Internal Server Error	Unexpected server-side failure. Check the status page or contact support
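
A client might handle these codes roughly as sketched below. `title`, `detail`, and `status` are standard RFC 7807 members. The format of `X-RateLimit-Reset` is not specified on this page, so the code assumes a Unix timestamp in seconds; the helper names and the one-second fallback are also assumptions.

```javascript
// Turn an RFC 7807 problem document into a one-line message.
// All RFC 7807 members are optional, so fall back to the HTTP status.
function formatProblem(problem, httpStatus) {
  const status = problem.status ?? httpStatus;
  const title = problem.title ?? 'Unknown error';
  return problem.detail ? `${status} ${title}: ${problem.detail}` : `${status} ${title}`;
}

// Work out how long to wait before retrying a 429 response.
// Assumption: X-RateLimit-Reset holds a Unix timestamp in seconds.
function retryDelayMs(httpStatus, resetHeader, nowMs = Date.now()) {
  if (httpStatus !== 429) {
    return null; // other errors are not retried in this sketch
  }
  const resetSeconds = Number(resetHeader);
  if (resetHeader == null || resetHeader === '' || !Number.isFinite(resetSeconds)) {
    return 1000; // fallback when the header is missing or unparsable
  }
  return Math.max(0, resetSeconds * 1000 - nowMs);
}
```

With a fetch `Response`, the header value would come from `response.headers.get('X-RateLimit-Reset')`, which returns `null` when the header is absent.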

Webhooks

Subscribe to real-time events like rule deployments, crawler blocks, or sitemap completions. Configure endpoints in your Dashboard. All payloads include an X-Signature-256 header for verification.
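
The signing scheme is not described on this page. The sketch below assumes the common convention of a hex-encoded HMAC-SHA256 of the raw request body, keyed with a shared webhook secret; confirm the actual scheme before relying on it. The function names are hypothetical.

```javascript
// Compute a hex HMAC-SHA256 of the raw request body.
// Assumption: this is how X-Signature-256 is produced; the page does not say.
// The dynamic import keeps the sketch usable from both ES modules and CommonJS.
async function signPayload(secret, rawBody) {
  const { createHmac } = await import('node:crypto');
  return createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Compare the received header value with the expected signature in constant time.
async function verifySignature(secret, rawBody, receivedSignature) {
  const { timingSafeEqual } = await import('node:crypto');
  const expected = Buffer.from(await signPayload(secret, rawBody), 'hex');
  const received = Buffer.from(String(receivedSignature), 'hex');
  // timingSafeEqual throws on a length mismatch, so check the length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verification should run against the raw bytes of the request body, before any JSON parsing, because re-serializing the payload can change whitespace and key order.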

SDKs & CLI

Official packages are available for Node.js, Python, Go, and Rust. Install them via your package manager, or use our CLI for terminal-based rule management.

Node.js

npm install @robots.txt/sdk

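import { RobotsClient } from '@robots.txt/sdk';

// Read the API key from the environment rather than hard-coding it.
const client = new RobotsClient({ apiKey: process.env.API_KEY });

// List the crawl rules for a domain (top-level await requires an ES module).
const rules = await client.rules.list({ domain: 'example.com' });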