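# Robots.txt for Sitemap.xml
# https://sitemap.xml.com/robots.txt
# Last Updated: 2025-01-15
# Purpose: Control crawler access & declare sitemap locations

User-agent: *
Disallow: /admin/
Disallow: /api/internal/
Disallow: /cgi-bin/
Disallow: /wp-login.php
Allow: /
Crawl-delay: 1

Sitemap: https://sitemap.xml.com/sitemap.xml
Sitemap: https://sitemap.xml.com/blog-sitemap.xml
Sitemap: https://sitemap.xml.com/images-sitemap.xml

# Search Engine Specific Rules
# A crawler follows only the group that names it, so the restricted
# paths are repeated here; otherwise "Allow: /" alone would open them up.
User-agent: Googlebot
Disallow: /admin/
Disallow: /api/internal/
Disallow: /cgi-bin/
Disallow: /wp-login.php
Allow: /

# Sitemap lines apply to every crawler, wherever they appear in the file.
Sitemap: https://sitemap.xml.com/priority-sitemap.xml

User-agent: Bingbot
Disallow: /admin/
Disallow: /api/internal/
Disallow: /cgi-bin/
Disallow: /wp-login.php
Allow: /
Crawl-delay: 2

# Disallow aggressive/non-standard bots
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: DotBot
Disallow: /
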
This robots.txt file governs automated crawler behavior for Sitemap.xml. Public-facing routes are explicitly allowed, while the administrative panel (/admin/), internal API endpoints (/api/internal/), CGI scripts (/cgi-bin/), and the login page (/wp-login.php) are disallowed. Three primary sitemaps are declared so that crawlers can discover core, blog, and image URLs, and a fourth priority sitemap is listed alongside the Googlebot rules. The SEO crawlers AhrefsBot, SemrushBot, and DotBot are blocked site-wide to preserve server resources.
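
One way to sanity-check rules like these before deploying them is to run them through a parser. The sketch below uses Python's standard `urllib.robotparser` on a trimmed copy of the file (the default group, one blocked bot, and the three primary sitemaps). The names `ROBOTS_TXT`, `BASE`, `is_allowed`, and the user agent `ExampleBot` are illustrative assumptions, not part of the site. Note that `urllib.robotparser` applies rules in file order (first match wins), which may differ from the longest-match behavior of some crawlers; here the two agree because every `Disallow` line precedes `Allow: /`.

```python
from urllib.robotparser import RobotFileParser

# Trimmed, illustrative copy of the rules described above (assumption:
# only the default group, one blocked bot, and the primary sitemaps).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/internal/
Disallow: /cgi-bin/
Disallow: /wp-login.php
Allow: /
Crawl-delay: 1

User-agent: AhrefsBot
Disallow: /

Sitemap: https://sitemap.xml.com/sitemap.xml
Sitemap: https://sitemap.xml.com/blog-sitemap.xml
Sitemap: https://sitemap.xml.com/images-sitemap.xml
"""

BASE = "https://sitemap.xml.com"

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler that falls under "User-agent: *".
    for agent, path in [
        ("ExampleBot", "/blog/post"),
        ("ExampleBot", "/admin/users"),
        ("ExampleBot", "/wp-login.php"),
        ("AhrefsBot", "/blog/post"),
    ]:
        print(agent, path, is_allowed(agent, path))
    print("crawl delay:", parser.crawl_delay("ExampleBot"))
    print("sitemaps:", parser.site_maps())
```

Under this trimmed ruleset, a generic crawler is allowed on public paths and denied on `/admin/` and `/wp-login.php`, while AhrefsBot is denied everywhere. The same check can be repeated against the full file, including the Googlebot and Bingbot groups, to confirm each named crawler sees the restrictions intended for it.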