What an llms.txt looks like
# Yokaify
> Yokaify is the Onsite Conversion Agent: an animated AI character
> that watches visitor behavior in real time and steps in at the right
> moment to help them convert.
## Core pages
- [Onsite Conversion Agent: 2026 field guide](https://yokaify.com/guides/onsite-conversion-agent.md)
- [Proactive chat in 2026](https://yokaify.com/guides/proactive-chat.md)
- [Cart abandonment recovery playbook](https://yokaify.com/guides/cart-abandonment.md)
## Tools
- [Cart abandonment calculator](https://yokaify.com/tools/cart-abandonment-calculator.md)
- [Mascot ROI calculator](https://yokaify.com/tools/mascot-roi-calculator.md)
- ...
A few conventions to note:
- Hash-prefixed sections group related links by purpose.
- Markdown links point to the canonical URL, often with a .md suffix (a proposed convention; some sites just use the regular URL).
- A blockquote summary at the top gives a crawler a quick sense of what the site is.
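To make these conventions concrete, here is a minimal Python sketch of how a reader might pull the title, the blockquote summary, and the sectioned links out of a file shaped like the example above. The names `LlmsTxt` and `parse_llms_txt` are hypothetical, and the parser assumes only the layout shown here (one `#` title, `>` summary lines before the first section, `##` sections, `- [title](url)` links). It is not a full implementation of the llmstxt.org proposal.

```python
import re
from dataclasses import dataclass, field

# Matches a list item of the form "- [Title](https://example.com/page.md)".
LINK_RE = re.compile(r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)")


@dataclass
class LlmsTxt:
    """Hypothetical container for the parts of an llms.txt file."""
    title: str = ""
    summary: str = ""
    sections: dict[str, list[tuple[str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    doc = LlmsTxt()
    summary_lines: list[str] = []
    current = None  # name of the "## " section currently being filled

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # A hash-prefixed section groups the links that follow it.
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first section form the summary.
            summary_lines.append(line.lstrip(">").strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:  # placeholder items such as "- ..." are skipped
                doc.sections[current].append((match["title"], match["url"]))

    doc.summary = " ".join(summary_lines)
    return doc
```

Calling `parse_llms_txt` on the example file would yield a title of `Yokaify`, a one-line summary, and two sections (`Core pages` and `Tools`), each holding `(title, url)` pairs.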
Why ship an llms.txt
- It is cheap. Generate it from your sitemap, then pick out the pages that matter most. An afternoon for most sites.
- It is a hedge. AI crawlers will likely start reading it eventually, and early adopters benefit when they do.
- Lighthouse may notice. Google's experimental llms.txt audit could become a signal, and the spec authors are paying attention.
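The "generate it from your sitemap" step can be sketched roughly as follows. This is an illustration under stated assumptions, not a prescribed workflow: `candidate_urls` is a hypothetical helper, the sitemap is assumed to be a plain `urlset` (not a sitemap index), and the path prefixes passed in stand for whatever parts of the site you decide are worth curating.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def candidate_urls(sitemap_xml: str, keep_prefixes: tuple[str, ...]) -> list[str]:
    """Return sitemap URLs whose path starts with one of keep_prefixes.

    The result is only a shortlist; a person still picks the pages
    that matter most before anything goes into llms.txt.
    """
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if url and urlparse(url).path.startswith(keep_prefixes):
            urls.append(url)
    return urls
```

For example, `candidate_urls(xml_text, ("/guides/", "/tools/"))` keeps guide and tool pages and drops everything else, which leaves a much shorter list to curate by hand.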
What to put in it
Lead with a two- or three-sentence summary that names the brand, the category, and the main value. Then link the pages worth reading first: your pillar guides, your tools, your research and data-rich articles, your strongest comparison pages, and your best glossary entries. A curated file usually lands somewhere around 80-120 entries. Bigger is not better here; the curation is the point.
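One way to turn a curated list into the file itself is sketched below. The function names `render_llms_txt` and `entry_count` are hypothetical, the output mirrors the compact layout of the example earlier in this entry (title, blockquote summary, then sections of links), and the 70-character wrap width is an arbitrary choice for readability.

```python
import textwrap


def render_llms_txt(
    brand: str,
    summary: str,
    sections: dict[str, list[tuple[str, str]]],
) -> str:
    """Render a title, a blockquote summary, and grouped links as llms.txt text."""
    lines = [f"# {brand}"]
    # Wrap the summary so each blockquote line stays short.
    for row in textwrap.wrap(summary, width=70):
        lines.append(f"> {row}")
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for title, url in links:
            lines.append(f"- [{title}]({url})")
    return "\n".join(lines) + "\n"


def entry_count(sections: dict[str, list[tuple[str, str]]]) -> int:
    """Count links across all sections, as a quick check on curation size."""
    return sum(len(links) for links in sections.values())
```

Checking `entry_count` before publishing is a cheap way to notice when a file has drifted far past the size a curated list usually lands at.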
Where llms.txt sits in the AI-crawler stack
| Surface | What it does | Adoption |
|---|---|---|
| robots.txt | Allow / disallow crawler access | Universal |
| sitemap.xml | Comprehensive URL index for crawlers | ~85% of top-10k sites |
| Schema.org markup | Per-page structured data | ~50-60% of top-10k sites |
| llms.txt | Curated AI-crawler index | 5-10% of top-10k sites |
llms.txt is the newest and least adopted of the four, so the other three still matter more right now.
How it differs from related standards
- robots.txt. Sets allow and disallow rules. It is not a curated content list.
- sitemap.xml. An exhaustive URL index. llms.txt is curated by comparison.
- Schema.org / JSON-LD. Per-page structured data. llms.txt works at the site level.
Related terms
- GEO — the broader discipline llms.txt supports
- Concept density — a neighboring content-quality signal
- Citation grounding — what AI engines do with crawled content
See also
- The Yokaify SEO/GEO strategy field guide — the broader AEO playbook
- Webflow AEO vs Shopify AEO — platform AEO context
First defined: June 1, 2026. Adoption rate from 2026 GEO research aggregators; standard reference: llmstxt.org.