The AEO Audit Checklist: 27 Checks That Tell You Exactly Why AI Is Not Citing You

Most AEO problems are not content problems. The team writes good content. They add FAQs. They research what competitors are ranking for. Months later, citation rate has not moved.

The reason is almost always upstream. A WAF rule blocks PerplexityBot before it reads a single word. The main product page renders in React with no SSR, so AI crawlers see an empty container. The Organisation schema has a sameAs link pointing to a LinkedIn page that was renamed two years ago. These are not content failures. They are infrastructure failures that content investment cannot fix.

This checklist runs in priority order. Crawl issues come first because no other fix matters until crawlers can reach your pages. Content structure comes second because that is what determines whether a retrieved page earns a citation. Entity signals come third because that is the layer that builds durable citation authority over time. Work through it in sequence, not by picking the items that feel most familiar.

Section 1: Crawler Access (Fix These Before Anything Else)

Failing any check in this section makes every other optimisation irrelevant. AI engines cannot cite pages they cannot reach.

Check 1: PerplexityBot allowed in robots.txt. Open your robots.txt and confirm an explicit Allow rule exists for PerplexityBot. An omission is not a guaranteed allow: a bot with no group of its own falls under the User-agent: * rules, so a wildcard Disallow blocks it silently. WAF rules can block bots regardless of robots.txt permissions, so this check also requires server log verification (Check 5).

Check 2: OAI-SearchBot and ChatGPT-User allowed. These are the two separate OpenAI crawlers that power ChatGPT Search. Blocking GPTBot (the training crawler) is fine and does not affect citations. Blocking OAI-SearchBot or ChatGPT-User removes your pages from ChatGPT Search indexing. Confirm both are explicitly allowed.

Check 3: Claude-SearchBot allowed. Anthropic's retrieval crawler is separate from ClaudeBot (the training crawler). Same pattern as OpenAI: you can block the training crawler and allow the retrieval crawler independently.

Check 4: Google-Extended allowed. This controls Google's access for AI training. Blocking it does not directly affect AI Overview or AI Mode citations, which use the standard Googlebot index, but it affects training data inclusion for future model updates.
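Checks 1 through 4 can be run in one pass. The following is a minimal sketch using Python's standard urllib.robotparser; the SAMPLE robots.txt and the example.com URL are made up for illustration, and the bot list simply mirrors the user-agent tokens named above.

```python
from urllib.robotparser import RobotFileParser

# Retrieval crawlers and tokens named in Checks 1 to 4. GPTBot and ClaudeBot
# (training crawlers) are left out on purpose, since blocking them is optional.
RETRIEVAL_BOTS = ["PerplexityBot", "OAI-SearchBot", "ChatGPT-User",
                  "Claude-SearchBot", "Google-Extended"]


def check_robots(robots_txt, url, bots=RETRIEVAL_BOTS):
    """Return {bot: True/False} for whether each bot may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


# Hypothetical robots.txt, used only to illustrate the check.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    results = check_robots(SAMPLE, "https://example.com/pricing/")
    for bot, allowed in results.items():
        print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```

Bots without their own group are evaluated against the User-agent: * rules, which is why a page under /private/ would come back blocked for every crawler except PerplexityBot in this sample. A pass here only covers robots.txt; it says nothing about WAF or CDN blocks.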

Check 5: Server log evidence of crawler activity. robots.txt permissions do not guarantee crawler access. WAF rules, CDN bot management settings, and hosting provider bot filters can block at a network layer that robots.txt never reaches. Pull server logs for the past 30 days and confirm each AI crawler's user-agent string appears on your most important pages. Zero log entries despite correct robots.txt almost always means a WAF or CDN block. Check Cloudflare Bot Fight Mode specifically — it catches AI crawlers by default.
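The log check can be scripted. This sketch assumes access logs in the common "combined" format (request line, status, bytes, referrer, user-agent); the SAMPLE_LOG lines and the key paths are invented for illustration, and real logs may need a different pattern. User-agent strings can be spoofed, so treat a hit as evidence of access, not proof of identity.

```python
import re
from collections import Counter

AI_BOT_TOKENS = ["PerplexityBot", "OAI-SearchBot", "ChatGPT-User", "Claude-SearchBot"]

# Combined log format: "METHOD path HTTP/x" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def crawler_hits(log_lines, key_paths):
    """Count hits per (bot, path) for the pages that matter most."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or match.group("path") not in key_paths:
            continue
        user_agent = match.group("ua").lower()
        for bot in AI_BOT_TOKENS:
            if bot.lower() in user_agent:
                hits[(bot, match.group("path"))] += 1
    return hits


def missing_bots(hits, key_paths):
    """List, per key page, the crawlers with zero log entries."""
    return {path: [b for b in AI_BOT_TOKENS if hits[(b, path)] == 0]
            for path in key_paths}


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Mar/2026:10:00:00 +0000] "GET /pricing/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.4 - - [10/Mar/2026:10:01:00 +0000] "GET /pricing/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
    '203.0.113.7 - - [10/Mar/2026:10:02:00 +0000] "GET /blog/ HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
]

if __name__ == "__main__":
    hits = crawler_hits(SAMPLE_LOG, ["/pricing/"])
    print(missing_bots(hits, ["/pricing/"]))
```

Any crawler listed by missing_bots for a page that robots.txt allows is a candidate for a WAF or CDN block.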

Check 6: Critical content in initial HTML. Most AI crawlers do not execute JavaScript (Googlebot is the main exception). For any page where the main content loads client-side after JavaScript runs, those crawlers see an empty container. Run curl -A "PerplexityBot" https://yourdomain.com/your-key-page/ and confirm your answer content is present in the plain HTML output. If it is not, server-side rendering or static generation is required for those pages.
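Once the raw HTML is saved (for example from the curl command above), the presence test can be automated. This is a rough sketch with Python's built-in html.parser; it ignores text inside script and style tags, on the assumption that a non-rendering crawler gets no value from it. Both HTML samples are invented.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = ("script", "style")

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def answer_in_initial_html(html, answer_phrase):
    """True if the phrase appears in the text a non-JS crawler would see."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return answer_phrase.lower() in text.lower()


# Hypothetical pages: one server-rendered, one client-rendered.
SSR_PAGE = "<html><body><main><h1>What is AEO?</h1><p>Answer engine optimisation is the practice of earning AI citations.</p></main></body></html>"
CSR_PAGE = '<html><body><div id="root"></div><script>window.__DATA__ = {"text": "Answer engine optimisation is"};</script></body></html>'

if __name__ == "__main__":
    for name, page in (("SSR", SSR_PAGE), ("CSR", CSR_PAGE)):
        print(name, answer_in_initial_html(page, "answer engine optimisation"))
```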

Check 7: Page returns HTTP 200. Pages returning 4xx or 5xx errors are not crawlable, and a 3xx redirect means the URL you audited is not the one that gets indexed. Crawl your most important pages and confirm they return 200 directly. Chains of more than two redirects reduce crawl probability further.
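A redirect chain is easier to judge when each hop is recorded. The sketch below follows redirects manually through an injectable fetch function, so the logic can be tested offline; http_fetch is one possible live implementation using the standard http.client module. The function names, the ROUTES table, and the verdict strings are illustrative choices, and the two-redirect limit comes from the check above.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def http_fetch(url):
    """Live fetcher: HEAD request, returns (status, Location header or None)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        target = (parts.path or "/") + ("?" + parts.query if parts.query else "")
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=http_fetch, max_hops=10):
    """Return [(url, status), ...] for every hop until a non-redirect."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # Location may be relative
        else:
            break
    return chain


def verdict(chain, max_redirects=2):
    """Fail on a non-200 final status, warn on long redirect chains."""
    final_status = chain[-1][1]
    redirects = len(chain) - 1
    if final_status != 200:
        return f"fail: final status {final_status}"
    if redirects > max_redirects:
        return f"warn: {redirects} redirects"
    return "ok"


# Hypothetical routing table standing in for a real site.
ROUTES = {
    "http://example.com/a": (301, "https://example.com/a"),
    "https://example.com/a": (301, "/a/"),
    "https://example.com/a/": (200, None),
}

if __name__ == "__main__":
    chain = trace_redirects("http://example.com/a", fetch=lambda u: ROUTES[u])
    print(chain, verdict(chain))
```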

Check 8: LCP under 2.5 seconds. Page speed affects crawl depth and frequency. Pages with poor Largest Contentful Paint get crawled less completely. AI Overview indexing specifically correlates with Core Web Vitals performance, per NotionCue's internal data across 8,000 tracked domains.

Section 2: Content Structure

These checks determine whether a page that successfully passes the crawl gate earns a citation or gets discarded after retrieval.

Check 9: Answer-first opening paragraph. Does your page's first paragraph answer the main question directly in 40 to 60 words? Not "In this article we will explore..." — an actual answer. SparkToro's 2026 citation analysis found 44.2% of all AI citations come from the first 30% of content. The opening paragraph is the highest-value single element on the page.
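The 40 to 60 word target is easy to test mechanically. A small sketch, assuming the opening paragraph has already been extracted as plain text; the filler phrases beyond "In this article" are guesses at common variants, not an established list.

```python
# Filler openers to flag. Only "in this article" comes from the check above;
# the other two are assumed variants.
FILLER_OPENERS = ("in this article", "in this post", "in this guide")


def check_opening(paragraph, min_words=40, max_words=60):
    """Return a list of issues; an empty list means the opening passes."""
    issues = []
    word_count = len(paragraph.split())
    if not min_words <= word_count <= max_words:
        issues.append(f"{word_count} words, outside {min_words}-{max_words}")
    if paragraph.strip().lower().startswith(FILLER_OPENERS):
        issues.append("opens with filler instead of an answer")
    return issues


if __name__ == "__main__":
    print(check_opening("In this article we will explore AEO."))
```

Word count is only a proxy. Whether the paragraph actually answers the question still needs a human read.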

Check 10: Question-format H2 headings. Are at least 60% of your H2 headings phrased as questions your target audience asks? "What is AEO?" outperforms "AEO Definition" for citation selection because it creates heading-to-query alignment for AI retrieval sub-queries.
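The 60% threshold can be measured directly from page HTML. This sketch uses a simple regular expression to pull H2 text, which is adequate for well-formed pages but is not a full HTML parser; treating "ends with a question mark" as the definition of a question heading is a simplifying assumption.

```python
import re

H2_PATTERN = re.compile(r"<h2[^>]*>(.*?)</h2>", re.IGNORECASE | re.DOTALL)
TAG_PATTERN = re.compile(r"<[^>]+>")


def h2_headings(html):
    """Extract H2 text with any inline tags stripped."""
    return [TAG_PATTERN.sub("", raw).strip() for raw in H2_PATTERN.findall(html)]


def question_ratio(headings):
    """Share of headings that end with a question mark."""
    if not headings:
        return 0.0
    return sum(h.endswith("?") for h in headings) / len(headings)


# Hypothetical page fragment.
SAMPLE_HTML = """
<h2>What is AEO?</h2>
<h2 class="sub">How do <em>AI crawlers</em> read a page?</h2>
<h2>AEO Definition</h2>
"""

if __name__ == "__main__":
    headings = h2_headings(SAMPLE_HTML)
    ratio = question_ratio(headings)
    print(headings, f"{ratio:.0%}", "pass" if ratio >= 0.6 else "fail")
```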

Check 11: Self-contained answer blocks. Does each section open with a complete answer in the first sentence that makes sense without surrounding context? AI systems extract passages at the paragraph level. A paragraph that requires the preceding paragraph to make sense will not be cited as a standalone passage.

Check 12: No content buried below 800 words without an answer block. Pages with strong openings that then fall into generic content after 800 words lose citation potential in the latter half. Each H2 section needs its own answer block regardless of where it falls in the page.

Check 13: Entity naming — no pronouns replacing key terms. AI retrieval systems extract passages individually. If your passage says "it reduces latency" instead of "server-side rendering reduces latency," the citation cannot be attributed to the correct topic. Name the entity in full on first mention in each paragraph.

Check 14: Sourced statistics with named attribution. Unsupported claims are not citable. Every statistic should have a named source and a date. "60% of searches end without a click" should specify "(SparkToro, 2026)" or link to the source. Perplexity cross-references claims against other sources; unverifiable claims reduce citation confidence.

Check 15: FAQ section with question-format items. Pages with at least five specific, buyer-phrased FAQ questions at the end earn citations for a wider range of sub-queries. The FAQ section also provides the Q&A pairs that go directly into FAQPage JSON-LD schema.

Check 16: Visible publication date and last updated date. AI systems use date signals to assess freshness. A page with no visible date, or a dateModified that has not changed in 18 months, signals stale content. The Amsive 2026 benchmark shows 50% of AI citations going to content updated in the past 13 weeks.

Section 3: Schema and Structured Data

Check 17: FAQPage JSON-LD present and valid. FAQPage schema is the single highest-impact structured data type for AEO. Each question-answer pair in the schema is a directly extractable unit for AI retrieval. Validate at search.google.com/test/rich-results. Note: Google removed FAQPage rich results from standard search in May 2026 but continues to parse the schema for AI retrieval. Do not remove it.
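FAQPage markup follows a fixed shape, so it can be generated from the same question-answer pairs shown on the page. A minimal sketch using the schema.org FAQPage, Question, and Answer types; the helper name faq_jsonld is illustrative, and the sample pair is taken from this article's own FAQ.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


PAIRS = [
    ("How often should I run an AEO audit?",
     "Full audit quarterly. Spot-check the crawl section monthly."),
]

if __name__ == "__main__":
    # Paste the output inside <script type="application/ld+json"> ... </script>.
    print(json.dumps(faq_jsonld(PAIRS), indent=2))
```

Generating the markup from the visible FAQ text keeps the two in sync, which matters because the schema should describe content that is actually on the page.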

Check 18: Article schema with datePublished and dateModified. dateModified is the most under-used field in Article schema and one of the highest-impact AEO signals. AI Mode and AI Overviews both weight freshness via dateModified. Update it every time the content changes materially.
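One way to keep dateModified honest is to generate the Article markup from real dates and reject impossible combinations. A sketch under stated assumptions: the helper name, the example dates, and the publisher @id value are all hypothetical, and referencing the publisher by @id is one common JSON-LD pattern among several.

```python
import json
from datetime import date


def article_jsonld(headline, published, modified, publisher_id):
    """Build schema.org Article markup with ISO 8601 dates."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        # Points at the Organization node defined on the homepage.
        "publisher": {"@id": publisher_id},
    }


if __name__ == "__main__":
    markup = article_jsonld(
        "The AEO Audit Checklist",
        published=date(2026, 1, 10),   # illustrative dates
        modified=date(2026, 3, 2),
        publisher_id="https://example.com/#organization",
    )
    print(json.dumps(markup, indent=2))
```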

Check 19: Organisation schema with sameAs array. The schema (schema.org spells the type Organization, and the markup must use that spelling) should be present on your homepage and linked from all content pages via the publisher field in Article schema. The sameAs array links your brand entity to LinkedIn, Crunchbase, Wikidata, and other profiles. Each accurate link increases entity confidence in AI knowledge graphs.
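A sketch of the Organization markup with a basic structural check on the sameAs links. The brand name, the profile URLs, and the @id convention are placeholders. The check only catches malformed or duplicate links; confirming that each profile still exists (a renamed LinkedIn page, for instance) needs a live request or a manual visit.

```python
import json
from urllib.parse import urlsplit


def same_as_problems(links):
    """Return structural problems found in a sameAs list."""
    problems = []
    seen = set()
    for link in links:
        parts = urlsplit(link)
        if parts.scheme != "https" or not parts.netloc:
            problems.append(f"not an absolute https URL: {link}")
        normalised = link.rstrip("/").lower()
        if normalised in seen:
            problems.append(f"duplicate: {link}")
        seen.add(normalised)
    return problems


def organization_jsonld(name, url, same_as):
    """Build schema.org Organization markup (note the US spelling of the type)."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": url.rstrip("/") + "/#organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


# Placeholder profile URLs for a made-up brand.
SAME_AS = [
    "https://www.linkedin.com/company/example-brand",
    "https://www.crunchbase.com/organization/example-brand",
]

if __name__ == "__main__":
    print(same_as_problems(SAME_AS))
    print(json.dumps(organization_jsonld("Example Brand", "https://example.com", SAME_AS), indent=2))
```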

Check 20: Person schema on author pages. Every named author needs a Person schema with jobTitle, sameAs linking to LinkedIn, and knowsAbout covering your topic areas. Anonymous content has lower E-E-A-T scores across all AI engines. See the E-E-A-T and AI citation guide for full implementation.

Check 21: BreadcrumbList schema on content pages. Topical hierarchy signals affect how AI engines assess page authority within a cluster. A page sitting inside a structured content cluster signals more topical authority than an isolated page with the same content.

Section 4: Entity Signals

Check 22: Brand name consistent across all platforms. Run your brand name through LinkedIn, Crunchbase, G2, Capterra, and any other platforms where you have a profile. Inconsistencies in company name, product names, or descriptions create entity disambiguation failures in AI knowledge graphs. AI systems encountering inconsistent signals hedge their descriptions of your brand or avoid citing you for high-confidence claims. The entity-based AEO guide covers the full consistency audit.

Check 23: Wikidata entry exists and is accurate. Wikidata is the highest-value single entity signal for most brands. It requires lower notability thresholds than Wikipedia. A Wikidata entry with founding date, headquarters, industry, and a link to your official website provides a machine-readable entity anchor that AI knowledge graphs treat as authoritative.

Check 24: Review platform profiles complete and recent. G2, Capterra, and Clutch for SaaS and B2B. Trustpilot for consumer products. Complete profiles with recent reviews provide external corroboration that on-site entity signals cannot replicate. For commercial and comparison queries, AI engines weight these platforms heavily. SE Ranking research found brands with strong review platform presence earn 4x higher AI citation rates than equivalent brands without it.

Check 25: AI brand description accuracy test. Run "What is [your brand]?" through ChatGPT, Perplexity, and Claude. Record exactly what each says. Compare against your current product description, pricing, and feature set. Any discrepancy is an active hallucination that may cost you consideration. The brand hallucination guide covers the correction process.

Section 5: Tracking

Check 26: Prompt tracking set up for target queries. You cannot improve what you are not measuring. A set of 15 tracked prompts across ChatGPT, Perplexity, and Google AI Mode, run weekly, gives you the citation rate data needed to see whether changes are working. Without this, you are optimising blind. See the AEO measurement guide.
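Citation rate is simple arithmetic once each prompt run is logged. The sketch below assumes a definition the text does not spell out: citation rate per engine equals the share of tracked prompt runs in which your domain appears among the cited sources. The record fields and sample data are hypothetical.

```python
from collections import defaultdict


def citation_rates(runs, domain):
    """Return {engine: cited_runs / total_runs} for one domain."""
    totals = defaultdict(int)
    cited = defaultdict(int)
    for run in runs:
        totals[run["engine"]] += 1
        if domain in run["cited_domains"]:
            cited[run["engine"]] += 1
    return {engine: cited[engine] / totals[engine] for engine in totals}


# Hypothetical results from one weekly run of three tracked prompts.
WEEKLY_RUNS = [
    {"engine": "ChatGPT", "prompt": "best AEO audit tool", "cited_domains": ["example.com", "other.io"]},
    {"engine": "ChatGPT", "prompt": "what is AEO", "cited_domains": ["other.io"]},
    {"engine": "Perplexity", "prompt": "what is AEO", "cited_domains": ["example.com"]},
]

if __name__ == "__main__":
    for engine, rate in citation_rates(WEEKLY_RUNS, "example.com").items():
        print(f"{engine}: {rate:.0%}")
```

Storing one such result set per week gives the trend line needed to tell whether a fix from earlier sections moved anything.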

Check 27: AI referral traffic segment in GA4. Create a GA4 segment filtering sessions from chatgpt.com, perplexity.ai, and claude.ai. Track weekly. Track conversion rate for this segment separately from organic. The 3 to 4x conversion premium for AI-referred traffic makes this the highest-quality segment in your acquisition mix. If it is flat or absent, citation is not converting to traffic — which means your cited pages may not have clear next-step pathways for visitors.
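The same segmentation can be reproduced outside GA4, for example on exported session data. This sketch classifies full referrer URLs against the three domains named above and compares conversion rates; the session tuples are invented, and AI_SOURCE_REGEX is offered on the assumption that it will be pasted into a regex-match condition on session source, which should be verified in your own property.

```python
import re
from urllib.parse import urlsplit

AI_REFERRER_DOMAINS = ("chatgpt.com", "perplexity.ai", "claude.ai")

# Equivalent pattern for a regex-match condition on the source hostname.
AI_SOURCE_REGEX = r"(^|\.)(chatgpt\.com|perplexity\.ai|claude\.ai)$"


def is_ai_referral(referrer):
    """True if a full referrer URL belongs to one of the AI domains."""
    host = (urlsplit(referrer).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in AI_REFERRER_DOMAINS)


def conversion_rates(sessions):
    """sessions: iterable of (referrer_url, converted). Returns rates per segment."""
    counts = {"ai": [0, 0], "other": [0, 0]}  # [conversions, sessions]
    for referrer, converted in sessions:
        segment = "ai" if is_ai_referral(referrer) else "other"
        counts[segment][0] += int(converted)
        counts[segment][1] += 1
    return {seg: (conv / total if total else None)
            for seg, (conv, total) in counts.items()}


# Hypothetical sessions for illustration.
SESSIONS = [
    ("https://chatgpt.com/", True),
    ("https://www.perplexity.ai/search?q=aeo", False),
    ("https://www.google.com/", False),
    ("https://www.google.com/", True),
    ("", False),
]

if __name__ == "__main__":
    print(conversion_rates(SESSIONS))
```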

The NotionCue AI Crawler Audit automates Checks 1 through 8 and surfaces which pages are being fetched by which crawlers, which pages return empty content due to JavaScript rendering, and which crawler user-agents are absent from your logs. Run it before touching content or schema — fixing access issues first means every subsequent change actually reaches AI engines.

Frequently Asked Questions

How often should I run an AEO audit?
Full audit quarterly. Spot-check the crawl section monthly — WAF rules and CDN updates can silently break crawler access without any notification. Check the brand description accuracy section after any product update, rebrand, or major content change.

Which section produces the fastest citation improvements when fixed?
Section 1 (crawler access) produces the fastest improvement when a block is present — you can go from zero citations to meaningful citation volume within days of fixing a WAF rule. Section 3 (schema) typically produces changes within one to two weeks. Sections 4 and 5 (entity signals and tracking) take longer to propagate but compound over time.

Can I pass all 27 checks and still have low citation rates?
Yes, if the content itself is thin, generic, or covers topics your competitors address with more depth or more specific data. The checklist removes barriers to citation. It does not guarantee citation if the content is not genuinely the best source for the query. Topical authority and content depth, covered in the topical authority guide, are what drive citation rates beyond baseline.
