NotionCue
AI Visibility Platform
Tools

llms.txt Live Validator

Enter any domain to fetch its live llms.txt. The validator checks the file's format, checks the declaration for every major AI bot, and flags conflicts with robots.txt. It works for your own site or any competitor's.

How to use this tool

Three things you can do with this validator.

🔍
Audit your own site
After deploying a new or updated llms.txt, run the validator to confirm the file is live at the right URL, correctly formatted, and that there are no conflicts with your robots.txt that would nullify its bot declarations.
🔎
Check competitors
Enter any competitor domain to see their AI bot configuration. A competitor with a broken llms.txt or blocking robots.txt rule is effectively invisible to the AI engines citing content in your category — that's a citation gap you can fill.
⚠️
Diagnose missing citations
If your domain has good AEO scores but low actual citation rates, a conflict between llms.txt and robots.txt is one of the most common silent causes. The validator surfaces these mismatches line by line with a specific fix for each.
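
The conflict check described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the validator's actual implementation: the function name `find_conflicts` and the `llms_allowed` mapping are hypothetical, and the check only tests the root path `/` using the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def find_conflicts(llms_allowed: dict[str, bool], robots_txt: str) -> list[str]:
    """Return bots that llms.txt allows but robots.txt blocks at "/".

    llms_allowed is a hypothetical mapping of bot name -> allowed flag,
    taken from the per-bot declarations in llms.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    conflicts = []
    for bot, allowed in llms_allowed.items():
        # A bot allowed in llms.txt but disallowed in robots.txt is a mismatch.
        if allowed and not parser.can_fetch(bot, "/"):
            conflicts.append(bot)
    return conflicts


# Example robots.txt that blocks GPTBot and allows everyone else.
robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

conflicts = find_conflicts({"GPTBot": True, "ClaudeBot": True}, robots)
print(conflicts)  # prints ['GPTBot']
```

A fuller check would test individual paths rather than only the site root, since a robots.txt rule can block one directory while leaving the rest open.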
What is llms.txt

The standard that tells AI what it's allowed to read.

llms.txt is an emerging specification, analogous to robots.txt, that gives website owners a structured way to declare their content's availability for AI training, retrieval, and citation. Where robots.txt controls crawl access at the page level, llms.txt operates at the intent level — telling AI systems what the site is, what it covers, and how its content should be used.

The file lives at the root of your domain, is plain text, and follows a simple key-value format. A well-formed llms.txt includes the site name, a description of what the site covers, contact information, and per-bot Allow or Disallow declarations. Crawlers that support llms.txt, including GPTBot and PerplexityBot, check for the file before indexing content.

The specification is still maturing. Different AI engines read it with different levels of strictness, and not all treat it as binding the way well-behaved search crawlers treat robots.txt. But sites with a correctly formatted llms.txt consistently show higher citation rates than comparable sites without one, and the signal it sends about technical AEO maturity compounds over time.

yourdomain.com/llms.txt — correct format
# llms.txt

Name: Your Brand
Description: What your site covers
Contact: email@yourdomain.com

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /
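
A parser for the key-value layout shown above might look like the following. It is a rough sketch that assumes the format exactly as this page presents it (metadata lines first, then `User-agent` blocks). The `LlmsTxt` class and `parse_llms_txt` function are illustrative names, not part of any published library.

```python
from dataclasses import dataclass, field


@dataclass
class LlmsTxt:
    # Top-level keys such as Name, Description, Contact.
    metadata: dict[str, str] = field(default_factory=dict)
    # Bot name -> list of ("allow" | "disallow", path) rules.
    rules: dict[str, list[tuple[str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    result = LlmsTxt()
    current_bot = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key-value line; a stricter validator would flag it
        key, value = key.strip(), value.strip()
        lowered = key.lower()
        if lowered == "user-agent":
            current_bot = value
            result.rules.setdefault(current_bot, [])
        elif lowered in ("allow", "disallow"):
            if current_bot is not None:
                result.rules[current_bot].append((lowered, value))
        else:
            result.metadata[key] = value
    return result


SAMPLE = """\
# llms.txt

Name: Your Brand
Description: What your site covers
Contact: email@yourdomain.com

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed.metadata["Name"], list(parsed.rules))
```

Splitting on the first colon only keeps values that contain colons intact.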
The 8 AI bots we check

Each bot has a different role in the citation pipeline.

GPTBot (ChatGPT)
OpenAI's crawler. OpenAI documents GPTBot as collecting content for model training and lists separate agents (OAI-SearchBot and ChatGPT-User) for search and user-initiated browsing, so blocking GPTBot alone may not remove your content from ChatGPT's live responses. High priority.
PerplexityBot (Perplexity)
Perplexity's crawler. Perplexity fetches live content for its answers, so this bot is comparable to GPTBot in citation importance, especially for retrieval-based answers.
ClaudeBot (Claude)
Anthropic's crawler. Claude is increasingly used for research queries where domain citation matters for authority.
Google-Extended (Gemini)
Google's control for AI use of your content in Gemini. It is a robots.txt token rather than a separate crawler, and it is distinct from Googlebot. Blocking Google-Extended while allowing Googlebot is a common misconfiguration.
Amazonbot (Alexa / Kendra)
Amazon's crawler, used for Alexa voice search and Amazon Kendra enterprise search.
Bytespider (TikTok / Doubao)
ByteDance's crawler. Powers TikTok's search features and the Doubao AI assistant.
FacebookBot (Meta AI)
Meta's crawler for training its AI systems, including Meta AI across Facebook and Instagram.
CCBot (Common Crawl)
Common Crawl's crawler. Its datasets feed many open-weight models, so it is not tied to one consumer engine but supplies training data broadly.
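
One simple way to check coverage of the eight bots listed above is to compare them against the `User-agent` names a file actually declares. The sketch below assumes case-insensitive name matching, and `missing_declarations` is a hypothetical helper name.

```python
# The eight bot names listed on this page.
AI_BOTS = (
    "GPTBot",
    "PerplexityBot",
    "ClaudeBot",
    "Google-Extended",
    "Amazonbot",
    "Bytespider",
    "FacebookBot",
    "CCBot",
)


def missing_declarations(declared: list[str]) -> list[str]:
    """Return the bots from AI_BOTS that have no User-agent entry."""
    seen = {name.strip().lower() for name in declared}
    return [bot for bot in AI_BOTS if bot.lower() not in seen]


# The sample file on this page declares four of the eight bots.
declared = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]
missing = missing_declarations(declared)
print(missing)  # prints ['Amazonbot', 'Bytespider', 'FacebookBot', 'CCBot']
```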
Who this tool helps

Technical SEOs and AEO practitioners who need real data, not estimates.

This tool is for anyone who needs to verify AI bot configuration rather than assume it's correct. The most common users are technical SEOs auditing a client site, content strategists who want to know why their content isn't being cited despite good AEO scores, and competitive intelligence practitioners monitoring how rivals configure their AI access settings.

Before a site launch
Confirm llms.txt is live and correctly formatted before launch day. A 404 on /llms.txt is easy to miss in pre-launch QA and leaves AI crawlers without your declarations from day one.
After a CMS migration
Platform migrations often break root-level files. llms.txt and robots.txt are both easy to lose in a migration — validate both immediately after any significant site infrastructure change.
When citations suddenly drop
If your AI citation rate drops with no obvious content changes, a broken llms.txt or a new robots.txt rule blocking AI bots is one of the first places to check.
Competitive intelligence
Knowing a competitor has blocked GPTBot while you have it allowed is actionable data. Their ChatGPT citations are likely to decline; yours can fill the gap if your content is strong enough.
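
The pre-launch and post-migration checks above come down to confirming that both root files return real text files. A minimal sketch using only the Python standard library follows. The `diagnose` and `fetch` helpers are hypothetical names, the `User-Agent` string is a placeholder, and the `text/plain` expectation is an assumption rather than a requirement stated by any specification.

```python
from typing import Optional
from urllib.error import HTTPError
from urllib.request import Request, urlopen

ROOT_FILES = ("/llms.txt", "/robots.txt")


def diagnose(path: str, status: Optional[int], content_type: str, body: str) -> list[str]:
    """Turn one fetch result into a list of human-readable issues."""
    if status is None:
        return [f"{path}: request failed (DNS, TLS, or timeout)"]
    if status != 200:
        return [f"{path}: HTTP {status}"]
    issues = []
    if "text/plain" not in content_type.lower():
        issues.append(f"{path}: served as {content_type or 'unknown type'}, expected text/plain")
    if not body.strip():
        issues.append(f"{path}: file is empty")
    elif body.lstrip().lower().startswith(("<!doctype", "<html")):
        # Some platforms answer 200 with an HTML error page for missing files.
        issues.append(f"{path}: body looks like an HTML page, not a text file")
    return issues


def fetch(domain: str, path: str, timeout: float = 10.0):
    """Fetch https://<domain><path>; return (status, content_type, body)."""
    req = Request(f"https://{domain}{path}", headers={"User-Agent": "root-file-check/0.1"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return resp.status, resp.headers.get("Content-Type", ""), body
    except HTTPError as err:
        return err.code, "", ""
    except OSError:
        return None, "", ""


# Offline examples of the diagnosis step:
print(diagnose("/llms.txt", 404, "", ""))
print(diagnose("/llms.txt", 200, "text/html", "<!DOCTYPE html><html></html>"))

# Live usage (performs network requests):
# for path in ROOT_FILES:
#     print(diagnose(path, *fetch("yourdomain.com", path)))
```

Keeping the diagnosis separate from the network call makes the rules easy to test without hitting a live site.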
FAQ

Common questions.

What is llms.txt and why does it matter for AEO?
What's the difference between this validator and your llms.txt generator?
Which AI bots does the validator check?
What does a conflict between llms.txt and robots.txt mean?
Can I use this to audit competitor sites?