User-agent Standard

Specifies which crawler the following rules apply to. Use * for all bots.

User-agent: Googlebot
User-agent: *
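
As a rough illustration of how a parser picks the group for a given crawler, the sketch below feeds a small file to Python's standard `urllib.robotparser`. The `/private/` path, the `OtherBot` name, and the example.com URLs are made up for the demonstration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for Googlebot, one catch-all group.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot matches its own group, so only /private/ is off limits.
googlebot_page = parser.can_fetch("Googlebot", "https://example.com/page")
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/x")

# Any other crawler falls back to the * group, which blocks everything.
other_page = parser.can_fetch("OtherBot", "https://example.com/page")

print(googlebot_page, googlebot_private, other_page)
```

A crawler with its own group follows only that group; the `*` rules are not merged into it.
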
Disallow Standard

Tells crawlers not to fetch any URL whose path begins with the given value. Paths start with / and are relative to the site root.

Disallow: /admin/
Disallow: /api/v1/
# Blocks entire path & subdirectories
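
The prefix behaviour can be sketched in a few lines of Python. This is a simplified illustration that ignores the `*` and `$` wildcards some crawlers support; the function name and the test paths are assumptions.

```python
from typing import List


def is_disallowed(path: str, disallow_rules: List[str]) -> bool:
    """Return True if `path` starts with any non-empty Disallow value."""
    # An empty Disallow value blocks nothing, so it is skipped.
    return any(bool(rule) and path.startswith(rule) for rule in disallow_rules)


rules = ["/admin/", "/api/v1/"]

in_subdirectory = is_disallowed("/admin/users/42", rules)   # under /admin/
other_version = is_disallowed("/api/v2/items", rules)       # different prefix
similar_name = is_disallowed("/administrator", rules)       # lacks the trailing slash

print(in_subdirectory, other_version, similar_name)
```

Because matching is by prefix, the trailing slash matters: `/admin/` does not cover `/administrator`.
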
Allow Standard

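Explicitly permits crawling of a path that would otherwise be blocked. Under RFC 9309 the longest matching rule wins and Allow wins a tie, so an Allow overrides a Disallow only when its path is at least as specific.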

Disallow: /docs/
Allow: /docs/public/
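
A minimal sketch of longest-match evaluation as described in RFC 9309, assuming plain prefix rules with no wildcards. The `is_allowed` helper and the sample paths are hypothetical.

```python
from typing import List, Tuple


def is_allowed(path: str, rules: List[Tuple[str, str]]) -> bool:
    """Longest-match evaluation; ignores * and $ wildcards for brevity."""
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = kind.lower() == "allow"
        # A longer prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("Disallow", "/docs/"), ("Allow", "/docs/public/")]

public_page = is_allowed("/docs/public/guide.html", rules)
internal_page = is_allowed("/docs/internal/notes.html", rules)
unrelated_page = is_allowed("/blog/", rules)

print(public_page, internal_page, unrelated_page)
```
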
Crawl-delay Non-standard

Asks bots to wait between requests (in seconds). It is not part of RFC 9309: Googlebot ignores it, while Bingbot and some other well-behaved crawlers honor it.

User-agent: *
Crawl-delay: 2
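
For crawlers that choose to honor the value, reading it is straightforward. The sketch below uses `crawl_delay()` from Python's standard `urllib.robotparser`; `ExampleBot` is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: *", "Crawl-delay: 2"])

# crawl_delay() returns None when no delay applies to the agent.
delay = parser.crawl_delay("ExampleBot")  # "ExampleBot" is hypothetical
wait_seconds = delay if delay is not None else 0

# A polite crawler would sleep for wait_seconds between requests.
print(wait_seconds)
```
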
Sitemap Standard

Points to the absolute URL of your XML sitemap so crawlers can discover important pages. The line sits outside User-agent groups, applies to every crawler, and may appear more than once.

Sitemap: https://example.com/sitemap.xml
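
A small sketch of how a client might collect Sitemap lines from a file, assuming the simple `Key: value` layout shown above. The `sitemap_urls` helper and the second sitemap URL are illustrative assumptions.

```python
from typing import List


def sitemap_urls(robots_txt: str) -> List[str]:
    """Collect the values of all Sitemap lines, wherever they appear."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


sample = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
sitemap: https://example.com/news-sitemap.xml
"""

found = sitemap_urls(sample)
print(found)
```
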
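Noimageindex Google

Prevents images on a page from being indexed by Google. It is a robots meta tag / X-Robots-Tag rule, not a robots.txt line; to keep Google's image crawler out of a path from robots.txt, disallow Googlebot-Image instead.

User-agent: Googlebot-Image
Disallow: /gallery/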
Max-image-preview Google

Sets the largest image preview Google may show for the page in search results. It is a robots meta tag / X-Robots-Tag rule rather than a robots.txt line.

X-Robots-Tag: max-image-preview:large
# Options: none, standard, large
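
Google documents max-image-preview as a rule carried in a robots meta tag or an `X-Robots-Tag` response header. The sketch below composes such a header value; the `x_robots_tag` helper is hypothetical and how the header is attached to a response depends on your server.

```python
from typing import Dict, List, Optional

ALLOWED_PREVIEW_SIZES = {"none", "standard", "large"}


def x_robots_tag(rules: List[str], max_image_preview: Optional[str] = None) -> Dict[str, str]:
    """Build an X-Robots-Tag header from rule names and an optional preview size."""
    parts = list(rules)
    if max_image_preview is not None:
        if max_image_preview not in ALLOWED_PREVIEW_SIZES:
            raise ValueError(f"unsupported preview size: {max_image_preview!r}")
        parts.append(f"max-image-preview:{max_image_preview}")
    return {"X-Robots-Tag": ", ".join(parts)}


preview_header = x_robots_tag([], max_image_preview="large")
gallery_header = x_robots_tag(["noimageindex"])

print(preview_header)
print(gallery_header)
```
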
Noindex Google

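Formerly kept pages matching the path out of Google's index. Google never officially supported the rule and stopped honoring it on September 1, 2019; use a noindex robots meta tag or X-Robots-Tag header instead.

User-agent: Googlebot
Noindex: /checkout/
# No longer honored by Google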
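Clean-param Yandex

Tells Yandex to ignore the listed query parameters, such as tracking tags, and treat the resulting URL variants as one page. Parameters are separated by &, optionally followed by a path prefix.

User-agent: Yandex
Clean-param: utm_source&utm_medium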
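
The consolidation this directive asks for can be pictured as dropping the listed parameters before URLs are compared. The sketch below is only an illustration of that idea, not the crawler's actual implementation; `strip_params` and the sample URL are assumptions.

```python
from typing import Set
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_params(url: str, ignored: Set[str]) -> str:
    """Return `url` with the ignored query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


tracking = {"utm_source", "utm_medium"}

tagged = strip_params(
    "https://example.com/shoes?utm_source=news&color=red&utm_medium=email",
    tracking,
)
plain = strip_params("https://example.com/shoes?color=red", tracking)

print(tagged)
print(tagged == plain)
```
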
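AdsBot Bing/Google

Controls access for ad-quality crawlers such as AdsBot-Google, which check the landing pages that ads point to. They are separate from the main web bots, and AdsBot-Google ignores User-agent: * groups, so it must be named explicitly.

User-agent: AdsBot-Google
Disallow:
# An empty Disallow allows full access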
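
Google documents that AdsBot-Google ignores the `*` group unless it is named explicitly. A rough sketch of that group selection follows; the `select_group` helper, the dictionary layout, and the `/checkout/` path are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Crawlers assumed here to skip the * group unless named explicitly.
IGNORES_WILDCARD = {"adsbot-google"}


def select_group(agent: str, groups: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return the rules that apply to `agent`, or None if no group applies."""
    name = agent.lower()
    for group_agent, rules in groups.items():
        if group_agent.lower() == name:
            return rules
    if name in IGNORES_WILDCARD:
        return None  # no explicit group, so the * rules are not applied
    return groups.get("*")


wildcard_only = {"*": ["Disallow: /"]}
with_adsbot = {"*": ["Disallow: /"], "AdsBot-Google": ["Disallow: /checkout/"]}

adsbot_unnamed = select_group("AdsBot-Google", wildcard_only)
adsbot_named = select_group("AdsBot-Google", with_adsbot)
regular_bot = select_group("Googlebot", wildcard_only)

print(adsbot_unnamed, adsbot_named, regular_bot)
```
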
AI-Optimize Platform

Enables dynamic rule generation based on content freshness, traffic patterns, and server load.

AI-Optimize: enabled
Priority: high-value-content
Edge-Sync Platform

Propagates rule updates to all CDN edge nodes instantly with zero-downtime deployment.

Edge-Sync: global
Rollback: auto
Bot-Fingerprint Platform

Advanced challenge-response for unverified crawlers. Routes them through a lightweight verification handshake.

Bot-Fingerprint: challenge
Timeout: 5s

Best Practices & Compliance

📐

Order Matters

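Google and other RFC 9309 parsers apply the longest matching rule regardless of order, but some older parsers stop at the first match, so listing specific rules before broad ones is the safe choice. Keep each User-agent group contiguous so rules are not attributed to the wrong crawler.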
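
To see why ordering can matter, here is a sketch of a first-match evaluator of the kind some older parsers use. It is an assumption-laden illustration (plain prefixes, no wildcards), and `first_match_allowed` is a hypothetical name.

```python
from typing import List, Tuple


def first_match_allowed(path: str, rules: List[Tuple[str, str]]) -> bool:
    """Apply the first rule whose prefix matches; later rules are never consulted."""
    for kind, prefix in rules:
        if prefix and path.startswith(prefix):
            return kind.lower() == "allow"
    return True  # nothing matched, so the path may be crawled


broad_first = [("Disallow", "/docs/"), ("Allow", "/docs/public/")]
specific_first = [("Allow", "/docs/public/"), ("Disallow", "/docs/")]

path = "/docs/public/guide.html"
result_broad_first = first_match_allowed(path, broad_first)
result_specific_first = first_match_allowed(path, specific_first)

print(result_broad_first, result_specific_first)
```

With the broad rule first, the Allow line is never reached; with the specific rule first, both first-match and longest-match parsers reach the same answer.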

🔍

Test Before Pushing

Use Google Search Console or our Validator to verify path matching and bot targeting accuracy.

📄

Keep It Lean

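Google reads only the first 500 KiB of a robots.txt file and ignores the rest, and long rule lists are harder to audit. Combine overlapping paths and use wildcards (* and $) where they shorten the list.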
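
One way to combine overlapping paths is to drop any Disallow path already covered by a shorter one. The sketch below assumes all paths belong to the same group and that no Allow rule carves an exception beneath them; `drop_redundant` and the sample paths are hypothetical.

```python
from typing import List


def drop_redundant(paths: List[str]) -> List[str]:
    """Remove paths that are already covered by a shorter prefix in the list."""
    kept: List[str] = []
    # Visit shorter paths first so they can cover the longer ones.
    for path in sorted({p for p in paths if p}, key=len):
        if not any(path.startswith(shorter) for shorter in kept):
            kept.append(path)
    return sorted(kept)


disallowed = ["/admin/", "/admin/users/", "/api/v1/", "/admin/logs/"]
lean = drop_redundant(disallowed)
print(lean)
```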

🌍

Bot Expectations

Malicious bots ignore robots.txt. Use it for SEO control, not security. Pair with WAF rules.

Directive Syntax Validator

Paste your robots.txt content to instantly verify syntax, detect conflicts, and ensure platform compatibility.

Checks structure, typos & common misconfigurations
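
The kind of check described here can be approximated in a few lines. The sketch below is not the Validator itself; it is a hypothetical, minimal linter that flags missing separators, unrecognized directive names, and rules placed before any User-agent line. The `KNOWN` set is an assumption and deliberately small.

```python
from typing import List

KNOWN = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}
GROUP_RULES = {"allow", "disallow", "crawl-delay"}


def lint_robots(text: str) -> List[str]:
    """Return a list of human-readable problems found in a robots.txt body."""
    problems = []
    seen_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        key, sep, _value = line.partition(":")
        key = key.strip().lower()
        if not sep:
            problems.append(f"line {number}: missing ':' separator")
        elif key not in KNOWN:
            problems.append(f"line {number}: unknown directive '{key}'")
        elif key == "user-agent":
            seen_agent = True
        elif key in GROUP_RULES and not seen_agent:
            problems.append(f"line {number}: '{key}' appears before any User-agent")
    return problems


sample = """\
Disallow: /tmp/
User-agent: *
Disalow: /admin/
Allow /public/
Sitemap: https://example.com/sitemap.xml
"""

report = lint_robots(sample)
for problem in report:
    print(problem)
```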