What is llms.txt?
llms.txt is a plain-text file placed at the root of your domain that communicates crawl permissions and site metadata to AI systems built on large language models (LLMs). It is modelled on robots.txt but aimed specifically at LLM training pipelines and retrieval-augmented generation (RAG) systems.
Which AI bots check llms.txt?
- ClaudeBot (Anthropic): respects llms.txt and robots.txt.
- PerplexityBot: respects llms.txt and robots.txt. Real-time retrieval; fast to reflect changes.
- GPTBot (OpenAI): primarily uses robots.txt; llms.txt signals incorporated into training data selection.
- Google-Extended: respects robots.txt. llms.txt support experimental.
- Bingbot / Copilot: primarily robots.txt. No confirmed llms.txt support as of Q2 2026.
The correct file format
# llms.txt
Name: Your Site Name
Description: One sentence describing what your site is and who it serves.
Domain: https://yourdomain.com
Contact: seo@yourdomain.com
Language: en
User-agent: GPTBot
Allow: /
Disallow: /admin
User-agent: PerplexityBot
Allow: /
User-agent: ClaudeBot
Allow: /
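To make the layout above concrete, here is a minimal Python sketch that reads a file in this format. It assumes every meaningful line has the form "Key: value", that lines starting with "#" are comments, and that Allow/Disallow lines belong to the most recent User-agent line. The names LlmsTxt and parse_llms_txt are hypothetical, chosen for illustration; this is not an official parser.

```python
from dataclasses import dataclass, field


@dataclass
class LlmsTxt:
    """Parsed view of the sample format: site metadata plus per-agent rules."""
    metadata: dict[str, str] = field(default_factory=dict)
    rules: dict[str, list[tuple[str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse "Key: value" lines; hypothetical helper, not an official API."""
    result = LlmsTxt()
    current_agent = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line; ignored in this sketch
        key, value = key.strip(), value.strip()
        lowered = key.lower()
        if lowered == "user-agent":
            current_agent = value
            result.rules.setdefault(current_agent, [])
        elif lowered in ("allow", "disallow"):
            # Assumption: a rule before any User-agent line is dropped.
            if current_agent is not None:
                result.rules[current_agent].append((lowered, value))
        else:
            result.metadata[key] = value
    return result
```

Run on the sample file above, this would yield metadata keys Name, Description, Domain, Contact and Language, and rules such as [("allow", "/"), ("disallow", "/admin")] for GPTBot.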
Common mistakes
- Disallowing all bots in robots.txt: robots.txt takes precedence. Fix robots.txt first.
- Wrong file location: must be at domain root, not in a subdirectory.
- Missing Name and Description fields: these are used by AI systems to understand site context.
- Using HTML or JSON format: must be plain text, served as text/plain.
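The last three mistakes in the list above can be checked mechanically. The following Python sketch assumes you have already fetched the file yourself and pass in its URL, the Content-Type response header, and the body; the function name audit_llms_txt and the wording of its messages are hypothetical.

```python
from urllib.parse import urlsplit

REQUIRED_FIELDS = ("Name", "Description")


def audit_llms_txt(url: str, content_type: str, body: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # Wrong file location: the path must be exactly /llms.txt at the domain root.
    if urlsplit(url).path != "/llms.txt":
        problems.append("not at domain root")

    # Wrong format: compare only the media type, ignoring parameters
    # such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/plain":
        problems.append("not served as text/plain")

    # Missing fields: collect the keys of all non-empty "Key: value" lines.
    present = set()
    for line in body.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            present.add(key.strip().lower())
    for name in REQUIRED_FIELDS:
        if name.lower() not in present:
            problems.append(f"missing {name} field")

    return problems
```

The function does no network access on purpose, so it can be run against saved responses; how the file is fetched is left to the caller.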
The most common issue in AEOvision audits is a robots.txt that blocks AI crawlers while llms.txt allows them. In that conflict robots.txt always wins, so fix robots.txt first, then set llms.txt permissions.
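One way to spot this conflict is to test your robots.txt against the AI crawler names before touching llms.txt. The sketch below uses Python's standard urllib.robotparser module; the helper name blocked_ai_crawlers and the agent list (taken from the bots named earlier) are assumptions for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens for the AI crawlers discussed above (illustrative list).
AI_CRAWLERS = ("GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended")


def blocked_ai_crawlers(robots_txt: str, url: str, agents=AI_CRAWLERS) -> list[str]:
    """Return the agents that robots.txt forbids from fetching the given URL."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Example: everything is disallowed by default, with one exception for ClaudeBot.
ROBOTS = """\
User-agent: *
Disallow: /

User-agent: ClaudeBot
Allow: /
"""

blocked = blocked_ai_crawlers(ROBOTS, "https://yourdomain.com/")
print(blocked)  # ['GPTBot', 'PerplexityBot', 'Google-Extended']
```

Any agent this reports as blocked is shut out regardless of what llms.txt says, so those entries are the ones to fix first.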