Robots.txt
A plain-text file at a website's root that gives crawl-permission directives to search engines, including which paths to disallow.
robots.txt is a plain-text file served at a website's root (e.g. /robots.txt) that gives crawl directives to search engines and other automated visitors. It specifies which paths a crawler may fetch and which it should skip, and can point to the site's sitemap via a Sitemap directive. The directives are advisory: well-behaved crawlers honor them, but the file provides no actual access control.
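A minimal sketch of the syntax; the /internal/ path and example.com domain are illustrative assumptions:

    # Applies to any crawler not matched by a more specific group
    User-agent: *
    Disallow: /internal/    # skip this path and everything under it
    Allow: /                # everything else may be crawled

    # Sitemap location must be an absolute URL
    Sitemap: https://example.com/sitemap.xml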
For a multi-tenant wholesale portal, robots.txt typically allows the marketing landing page and editorial pages while disallowing admin, account, and authenticated portal paths, which carry no SEO value and may expose tenant-specific URLs. AI crawlers (OpenAI's GPTBot, Anthropic's ClaudeBot, PerplexityBot, Apple's Applebot-Extended) can be addressed with their own User-agent groups if specific access policies are needed, as in the sketch below.
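A sketch for the portal case described above; the path names (/admin/, /account/, /portal/) and the domain are illustrative assumptions, not a fixed convention:

    # Default policy: crawl public marketing and editorial pages,
    # skip everything behind authentication
    User-agent: *
    Disallow: /admin/
    Disallow: /account/
    Disallow: /portal/

    # Dedicated group for AI crawlers that should be blocked entirely.
    # A crawler obeys only its most specific matching group, so any
    # rules from the * group it should keep must be restated here.
    User-agent: GPTBot
    User-agent: ClaudeBot
    Disallow: /

    Sitemap: https://example.com/sitemap.xml

Note the group-selection rule: once a crawler matches a group naming its own user agent, it ignores the * group altogether, so agent-specific groups must be written as complete policies rather than additions to the default.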