The GEO Playbook

The previous guide covered how generative engines work. This one is the checklist: what to actually do, organized by layer.

Layer 1: Be retrievable

AI answers start with retrieval from search indexes. Verify the basics:

  • Rank in the underlying indexes. Maintain your classic SEO; register with Bing Webmaster Tools (Bing powers ChatGPT search and Copilot).
  • Allow AI crawlers - deliberately. Each engine has distinct bots; decide policy per purpose:
robots.txt - common AI user agents
# Search/answer crawlers (citations & traffic)
User-agent: OAI-SearchBot      # ChatGPT search
Allow: /
User-agent: PerplexityBot
Allow: /
 
# Training-data crawlers (model training, no direct citation value)
User-agent: GPTBot
Allow: /                       # or Disallow - a business decision
 
# Google AI features use ordinary Googlebot; blocking it blocks search.
# Google-Extended controls Gemini *training* only.
  • Serve content without JavaScript. Several AI crawlers don't execute JS. Confirm with curl that your content is in the initial HTML.
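
The crawler-policy and no-JavaScript checks above can be scripted. The following is a minimal sketch using only the Python standard library; the function names (allowed_agents, phrase_in_initial_html) are hypothetical, and it works on strings you have already fetched (for example with curl) rather than making network requests itself.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Report, per user agent, whether this robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


class _TextOnly(HTMLParser):
    """Collects text nodes and skips the bodies of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def phrase_in_initial_html(html: str, phrase: str) -> bool:
    """True if phrase appears in the markup's text without running any JS."""
    extractor = _TextOnly()
    extractor.feed(html)
    extractor.close()
    # Collapse whitespace so line breaks in the markup don't hide a match.
    text = " ".join(" ".join(extractor.chunks).split())
    return phrase.lower() in text.lower()
```

Feed allowed_agents the body of your own /robots.txt and the user agents you care about; feed phrase_in_initial_html the raw response body and a sentence that should be on the page. A False from the second function on a page that renders fine in a browser suggests the content is injected client-side.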

The llms.txt convention

llms.txt is an emerging proposal (from llmstxt.org) for a single markdown file at your site root, /llms.txt, that hands an LLM a clean, curated map of your most important pages - without making it crawl and parse your whole site. Think of it as a curated sitemap for language models, written in markdown they read natively. Unlike robots.txt, it grants or blocks nothing; it only points.

The format is plain markdown: an # H1 with your site name, a > blockquote summary, then ## sections of annotated links. In the proposal, a section titled Optional marks links a consumer can skip when it needs a shorter context.

llms.txt
# seo101
 
> A free, in-depth curriculum covering SEO, AEO and GEO for web developers.
 
## Docs
- [What is SEO?](https://seo101.dev/docs/foundations/what-is-seo): search optimization from zero
- [What is GEO?](https://seo101.dev/docs/geo/generative-engine-optimization): how AI engines pick and cite sources
- [The GEO Playbook](https://seo101.dev/docs/geo/geo-playbook): concrete tactics to earn AI citations
 
## Optional
- [Full curriculum index](https://seo101.dev/docs): every guide, grouped by module

Two practical notes:

  • A companion /llms-full.txt is sometimes published with the entire content inlined as one markdown document, so a model can ingest everything in a single fetch.
  • Adoption by the major engines is not established - treat this as cheap insurance, not a ranking lever. It costs little to maintain, signals nothing negative, and positions you for tooling that does consume it.
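
Since the file is just structured markdown, it is easy to generate from the page list you already maintain. A minimal sketch, assuming a hypothetical render_llms_txt helper and a sections mapping of heading to (label, URL, note) tuples; neither is part of the llms.txt proposal itself:

```python
def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a title, summary and link sections in the llms.txt layout."""
    # H1 with the site name, then the blockquote summary.
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for label, url, note in links:
            # One annotated markdown link per page.
            lines.append(f"- [{label}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "seo101",
        "A free, in-depth curriculum covering SEO, AEO and GEO for web developers.",
        {
            "Docs": [
                ("What is SEO?",
                 "https://seo101.dev/docs/foundations/what-is-seo",
                 "search optimization from zero"),
            ],
        },
    ))
```

Generating the file at build time keeps it in sync with your routes, which matters more than the exact wording of the annotations.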

Layer 2: Be quotable

Write so a model can lift a passage confidently:

  • Lead every section with the claim. Answer-first structure (the AEO pattern) is exactly what selection favors.
  • Put numbers in sentences. "Migrating to SSG cut our median LCP from 4.1s to 1.8s" is a citation magnet; "performance improved significantly" is filler.
  • Attribute claims - "according to CrUX data" - and link the source. Passages that look verifiable are preferred.
  • Define your terms in single clean sentences. Definitions are among the most-quoted content shapes on the web.
  • Use real lists and tables - structure survives extraction; styled-div soup doesn't.
  • Publish original data. Surveys, benchmarks, teardown measurements - content that only you have forces engines to cite you as the primary source. This is the single strongest GEO tactic.
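
The "numbers in sentences" rule above lends itself to a rough lint pass over drafts. This is only a heuristic sketch: the VAGUE_WORDS list and the vague_sentences name are assumptions made for illustration, and the sentence splitting is deliberately naive.

```python
import re

# Hypothetical word list; tune it for your own writing.
VAGUE_WORDS = ("significantly", "dramatically", "greatly", "substantially")


def vague_sentences(text: str) -> list[str]:
    """Return sentences that use a vague intensifier and contain no digit."""
    # Split after ., ! or ? followed by whitespace; "4.1s" stays intact.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        has_vague_word = any(word in lowered for word in VAGUE_WORDS)
        has_number = any(ch.isdigit() for ch in sentence)
        if has_vague_word and not has_number:
            flagged.append(sentence)
    return flagged
```

Run on the two example sentences from the list, it flags "performance improved significantly" and leaves the LCP sentence alone. Treat the hits as prompts to go find the number, not as errors.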

Layer 3: Be a clear entity

Models assemble what they know about you from everywhere at once. Remove ambiguity:

  • Consistent naming of your brand, products and authors across your site and the web
  • Organization/Person JSON-LD with sameAs links to your official profiles (see the structured data guide), so engines can connect them to one entity
  • An "about" surface that answers who/what/since-when/for-whom in plain extractable language - models routinely retrieve about pages for grounding facts
  • Author expertise made explicit - bios, credentials, publication history; the machine-readable counterpart to E-E-A-T
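
The JSON-LD item in the list above can be sketched as a small generator. The organization_jsonld name and the example.com values are placeholders; @context, @type, name, url and sameAs are standard schema.org vocabulary.

```python
import json


def organization_jsonld(name: str, url: str, profiles: list[str]) -> str:
    """Render a schema.org Organization as a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs lists the profiles that refer to the same entity.
        "sameAs": profiles,
    }
    # Escape "</" so a stray "</script>" in a value cannot close the tag early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    print(organization_jsonld(
        "Example Co",                      # placeholder brand name
        "https://example.com",             # placeholder site URL
        ["https://github.com/example"],    # placeholder profile URL
    ))
```

Use the same name string here that you use everywhere else; the point of the markup is to confirm the consistency from the first bullet, not to replace it.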

Layer 4: Be present off-site

Generative engines triangulate. Brands mentioned consistently across independent sources get recommended; ghosts don't:

  • Reddit and forums are disproportionately retrieved by AI search. Genuine participation where your topic is discussed (helpful answers, AMAs - not spam) shows up in answers.
  • Comparison and review coverage: "best X" listicles, review platforms (G2, Capterra for software), industry roundups - these pages are what engines retrieve when users ask "what's the best…". Get included in them.
  • Digital PR and data stories (see the link building tactics guide) now earn citations twice: once in classic rankings, again in model knowledge.
  • Wikipedia/Wikidata presence, where legitimately warranted, anchors entity identity.

The compounding loop

GEO tactics reinforce each other:

original data → cited by publications → retrieved by engines
      ↑                                        ↓
 brand searches  ←  users see citations  ←  AI answers name you

Next: Measuring AI Visibility - proving any of this works.