What is llms.txt and do I need one?
llms.txt is a markdown-formatted text file served at your domain root, like robots.txt, that gives AI assistants a structured summary of your website: what your business does, your key content areas, and how to navigate your site. It helps AI systems understand your business without crawling every page. If you want AI visibility, you need one.
What should I include in my llms.txt file?
A one-line business description, your main services or products, key content areas with URLs, site structure and important pages, contact info, and any specialized terminology. Use markdown formatting with clear headings. Keep it factual and concise — think of it as a quick briefing document for an AI assistant.
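To make this concrete, here is a minimal sketch of an llms.txt file. The business name, URLs, and section labels are placeholders, not a required template:

```markdown
# Acme Web Studio

> Acme Web Studio is a web design agency specializing in e-commerce sites for small retailers.

## Services
- [Custom Web Design](https://example.com/services/design): Bespoke storefront design and builds
- [SEO & AEO Audits](https://example.com/services/audits): Search and answer-engine optimization

## Key Pages
- [Pricing](https://example.com/pricing)
- [Case Studies](https://example.com/case-studies)

## Contact
- Email: hello@example.com
```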
What is the difference between llms.txt and llms-full.txt?
llms.txt is the summary — your business and key pages in brief. llms-full.txt is the extended version with full service descriptions, complete URL inventory, detailed content summaries. AI systems check llms.txt first for a quick overview and llms-full.txt when they need deeper context.
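As a rough illustration (entries and URLs invented), the same service might appear in each file like this:

```markdown
<!-- llms.txt: one line per service -->
- [SEO & AEO Audits](https://example.com/services/audits): Search and answer-engine optimization

<!-- llms-full.txt: the expanded entry -->
## SEO & AEO Audits
Full technical audit covering crawlability, structured data, llms.txt, and
robots.txt configuration. Deliverables: a scored report and a prioritized fix list.
- [Audit overview](https://example.com/services/audits)
- [Sample report](https://example.com/services/audits/sample)
```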
What Schema.org markup helps with AI visibility?
The most impactful types: Organization (business identity), FAQPage (Q&A pairs), Article (author and publication info), BreadcrumbList (site hierarchy), and WebSite (site-level info). JSON-LD is the way to go. These help AI systems understand the structured meaning behind your text — not just the text itself.
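For example, a minimal Organization block looks like this (names and URLs are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Web Studio",
  "url": "https://example.com",
  "logo": "https://example.com/logo.png",
  "sameAs": ["https://www.linkedin.com/company/acme-web-studio"]
}
</script>
```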
How do I add JSON-LD schema to my website?
Add a <script type="application/ld+json"> tag to your page HTML containing the schema data as JSON. Place it in the <head> or at the end of the <body>. For Next.js, create a reusable JsonLd component. For WordPress, use Yoast SEO or Schema Pro. JSON-LD is preferred over microdata or RDFa because it's cleanly separated from the HTML.
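One sketch of such a Next.js component (the component name and props are our own convention, not a framework API):

```tsx
// JsonLd.tsx: renders a schema object as an application/ld+json script tag.
// Plain JSX, so it works in both the App Router and the Pages Router.
type JsonLdProps = {
  data: Record<string, unknown>;
};

export function JsonLd({ data }: JsonLdProps) {
  return (
    <script
      type="application/ld+json"
      // JSON-LD must be raw JSON text, so serialize and inject it directly.
      dangerouslySetInnerHTML={{ __html: JSON.stringify(data) }}
    />
  );
}
```

Usage: `<JsonLd data={{ "@context": "https://schema.org", "@type": "Organization", "name": "Acme Web Studio" }} />` anywhere in a page or layout.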
How do I structure content for AI extraction?
Use question-format headings (H2/H3) that mirror how people ask AI assistants. Lead with a direct answer in the first 1–2 sentences, then add supporting detail. Include FAQ sections with proper schema. Make sure everything is server-side rendered, since most AI crawlers don't execute JavaScript.
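A skeleton of that pattern (headings and copy are illustrative):

```html
<h1>Answer Engine Optimization Guide</h1>

<h2>What is AEO?</h2>
<!-- Direct answer first, supporting detail after -->
<p>AEO (Answer Engine Optimization) is the practice of structuring content so
AI assistants can extract and cite it. It builds on SEO fundamentals but
optimizes for answer quality rather than rankings.</p>

<h2>How do I get started with AEO?</h2>
<p>Audit your structured data first, then rewrite key pages with
question-format headings and direct lead answers.</p>
```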
Does my robots.txt affect AI visibility?
Yes. GPTBot, ClaudeBot, and PerplexityBot all respect robots.txt directives. Block these crawlers and your content becomes invisible to their AI systems. Explicitly allowing them signals your content is available for citation. Check your robots.txt right now — we've seen sites unknowingly blocking AI bots.
Which AI crawler user-agents should I allow in robots.txt?
The main ones: GPTBot (OpenAI/ChatGPT), ChatGPT-User (ChatGPT browsing), ClaudeBot (Anthropic/Claude), anthropic-ai (Anthropic general), PerplexityBot (Perplexity AI), and Google-Extended (which controls whether Google's Gemini models can use your content; note that AI Overviews is governed by the regular Googlebot). Add explicit Allow rules for content pages and Disallow for admin and API routes.
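Put together, a robots.txt along these lines would work; adjust the Disallow paths to your own site:

```text
# AI crawlers: allow content, block admin and API routes.
# Each group stands alone; a crawler that matches a named group
# ignores the wildcard group below.
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /
Disallow: /admin/
Disallow: /api/

# Everyone else
User-agent: *
Disallow: /admin/
Disallow: /api/
```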
What is server-side rendering and why does it matter for AEO?
SSR means your HTML content is generated on the server before being sent to the browser. Most AI crawlers don't execute JavaScript — they read raw HTML. If your content is rendered client-side, AI crawlers see empty pages. Use SSR, SSG, or pre-rendering to make sure your content is in the HTML source.
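A quick way to verify: fetch the raw HTML without executing any JavaScript, the same way most AI crawlers read your site, and check that key copy is present. A minimal sketch in TypeScript (URL and search string are placeholders):

```ts
// Fetch the raw HTML source; no JavaScript runs, so whatever is missing
// here is likely invisible to most AI crawlers.
const res = await fetch("https://example.com/services");
const html = await res.text();

console.log(
  html.includes("Answer Engine Optimization")
    ? "Content is in the HTML source"
    : "Content is probably rendered client-side",
);
```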
How important is heading hierarchy for AI visibility?
Critical. AI systems use heading hierarchy (H1 → H2 → H3) to understand content structure and topic relationships. One H1 per page — the main topic. H2s for major sections, H3s for subsections. Question-format headings like "What is AEO?" are especially effective because they match how users query AI assistants.
What role do internal links play in AEO?
Internal links help AI systems map relationships between your content. Topic hubs, Related Articles sections, breadcrumbs, cross-references — they create a navigable content graph that AI crawlers follow. Strong internal linking distributes authority across pages and ensures important content is discoverable no matter where a crawler enters your site.
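Breadcrumbs in particular can be reinforced with BreadcrumbList markup; a sketch with placeholder URLs:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/" },
    { "@type": "ListItem", "position": 2, "name": "Guides", "item": "https://example.com/guides/" },
    { "@type": "ListItem", "position": 3, "name": "AEO Basics", "item": "https://example.com/guides/aeo-basics" }
  ]
}
</script>
```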
Should I use microdata, RDFa, or JSON-LD for structured data?
JSON-LD. It's cleanly separated from HTML — no inline attributes to maintain. It's easier to implement and debug. Google and most AI systems prefer it. Microdata and RDFa mix schema with HTML attributes, making them harder to maintain and more error-prone. JSON-LD goes in a single script tag and you're done.
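The maintenance difference is easy to see side by side (same Organization, invented values):

```html
<!-- Microdata: schema attributes woven into your markup -->
<div itemscope itemtype="https://schema.org/Organization">
  <span itemprop="name">Acme Web Studio</span>
  <a itemprop="url" href="https://example.com">example.com</a>
</div>

<!-- JSON-LD: one self-contained block, markup untouched -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Web Studio",
  "url": "https://example.com"
}
</script>
```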
How do I test if my structured data is valid?
Use Google's Rich Results Test or Schema Markup Validator to check syntax and required fields. Our audit also validates structured data as part of the Schema.org criterion. Common errors include missing required properties, wrong @type values, and orphaned schema blocks that reference entities not defined on the page.
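You can also catch pure syntax errors in CI with a short script. A minimal sketch (it only checks that each ld+json block parses and declares a @type; it does not validate properties against the Schema.org vocabulary, so keep using the validators above):

```ts
// Extract every application/ld+json block from a page and try to parse it.
// Catches malformed JSON and missing @type values; full property validation
// still needs the Schema Markup Validator or Rich Results Test.
const res = await fetch("https://example.com/about");
const html = await res.text();

const blocks = html.matchAll(
  /<script[^>]*type="application\/ld\+json"[^>]*>([\s\S]*?)<\/script>/g,
);

for (const [, body] of blocks) {
  try {
    const data = JSON.parse(body);
    if (!data["@type"]) console.warn("Block is missing @type:", data);
  } catch (err) {
    console.error("Invalid JSON-LD block:", err);
  }
}
```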