GEO Optimization Guide — Full Series

  1. What Is GEO - AI Citation Strategy Beyond SEO
  2. Each AI Cites Different Sources
  3. On-Site GEO Technical Architecture - From Product DB to JSON-LD
  4. Off-Site GEO - How to Win Over AI That Ignores Your Official Site
  5. AEO - Why Coding Agents Read Documentation Differently ← current post

AI Does Not Read Just One Kind of Document

Through Part 4, we covered On-Site and Off-Site GEO — JSON-LD on your official site, external directories, community channels. The goal was to make your brand appear as a cited source when consumers ask something on ChatGPT or Perplexity.

But that is not the only kind of document AI reads.

When a developer tells Claude Code or Cursor to “integrate this API,” the agent crawls the API docs on its own. The way an agent processes those documents is fundamentally different from how a person browses a page.

This is called AEO (Agentic Engine Optimization). The concept was formalized recently, so there is almost no discussion of it in Korea yet — but for anyone doing GEO, it is worth understanding.

How AEO Differs From GEO

The same AI calls for different optimization depending on who is using it.

| Item | GEO | AEO |
|---|---|---|
| Target | ChatGPT, Perplexity, Gemini | Claude Code, Cursor, Cline, Aider |
| Consumer | Person asking questions (indirect) | Code-writing agent (direct) |
| Content type | Product pages, brand information | API docs, developer portals |
| Key format | JSON-LD, Schema.org | Markdown, llms.txt, skill.md |
| Metric | Citation rate | Token efficiency, parse success rate |
| Failure mode | Not appearing in answers | Agent making wrong API calls |

There are overlapping principles. SSR-based serving, robots.txt review, structured content — all three need to be in place for both. Organizations that have properly set up GEO face a lower barrier to entering AEO.

GEO asks “is the content cited in answers?” AEO asks “is the agent using the API correctly?” When the latter fails, broken code ships without anyone noticing.

Agents Do Not Read Documentation Like Humans

When a person lands on a developer portal, they scan the menu, click Getting Started, try running the sample code, and follow a few related links over 4–8 minutes. All of this behavior gets recorded in analytics.

An agent fetches the page in one or two HTTP GET requests, parses it, and moves on. No scrolling, no clicking. In GA, it shows up as a single request with a 400ms session duration.

You can identify agents from server logs via User-Agent.

| Agent | User-Agent |
|---|---|
| Claude Code | axios/1.8.4 |
| Cursor | got (sindresorhus/got) |
| Cline, Junie | curl/8.4.0 |
| Windsurf | colly |
| Aider, OpenCode | Headless Chromium (Playwright) |

A significant share of what previously showed up as “unknown crawlers” may well be these agents.
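Segmenting this traffic can be scripted. Below is a minimal sketch that classifies User-Agent strings using the patterns from the table above — note that these strings are snapshots from specific tool versions and change between releases, so the patterns are assumptions to verify against your own logs:

```python
import re

# Substring patterns for coding-agent traffic (illustrative; these
# User-Agent strings change between tool releases, so treat them as
# assumptions and re-check against your own access logs).
AGENT_PATTERNS = {
    "claude-code": re.compile(r"axios/"),
    "cursor": re.compile(r"\bgot\b"),
    "cline-junie": re.compile(r"curl/"),
    "windsurf": re.compile(r"colly"),
    "headless-chromium": re.compile(r"HeadlessChrome|Playwright"),
}

def classify_user_agent(ua: str) -> str:
    """Return a coding-agent label for a User-Agent string, or 'other'."""
    for label, pattern in AGENT_PATTERNS.items():
        if pattern.search(ua):
            return label
    return "other"

# Example: tally User-Agent fields already parsed out of access logs
log_uas = [
    "axios/1.8.4",
    "curl/8.4.0",
    "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0",
]
counts = {}
for ua in log_uas:
    label = classify_user_agent(ua)
    counts[label] = counts.get(label, 0) + 1
print(counts)  # → {'claude-code': 1, 'cline-junie': 1, 'other': 1}
```

Running a classifier like this over a few weeks of logs gives you the baseline mentioned later in the checklist: how much of your documentation traffic is already agents.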

Agents Don’t Read Long Documents to the End

Agents have context limits. Claude and GPT-class models typically sit between 100K and 200K tokens. When a single document approaches or exceeds that window, the agent quietly does one of the following.

  • Truncates the tail. If the important content is near the end, the answer will be wrong
  • Moves to a shorter alternative document
  • Spends time chunking, adding latency, and producing errors
  • Gives up and answers from its trained knowledge — which is hallucination

Some API reference documents exceed 100K tokens. At that scale, a single document can consume the agent’s entire context window by itself.

Document length therefore becomes a metric. Recommended guidelines:

| Content type | Token target |
|---|---|
| Quick Start / Getting Started | Under 15K |
| Individual API reference | Under 25K |
| Concept guide | Under 20K (link out for details) |
| Full API reference | Split by resource or endpoint |

GEO has no such constraint. Consumer-facing AI just extracts the snippets it needs. AEO requires the entire document to fit into context, so length needs to be designed intentionally.

Four Files to Start With

No complex technology required — four files are enough to get started.

First, robots.txt. It is the same file covered in Part 4, but now coding-agent User-Agents also need to be considered. An overly broad block will prevent agents from reading your docs at all.
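A hedged sketch of what an agent-friendly robots.txt might look like (the crawler names below are the commonly documented ones, but verify each vendor's current token before relying on it):

```
# Allow AI answer-engine crawlers (GEO)
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Keep documentation open by default; many coding agents fetch with
# generic HTTP clients (axios, curl) that robots.txt rules cannot
# target by name, so a broad Disallow would block them too.
User-agent: *
Allow: /docs/
```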

Next, llms.txt. An agent-facing sitemap in Markdown, served at /llms.txt. Rather than page titles, it describes what a reader will learn at each URL. Including token counts per page lets agents decide upfront whether to read a given document.

```
# MyService Documentation

## Getting Started
- [Quick Start](/docs/quickstart): First API call in 5 minutes (8K tokens)
- [Authentication](/docs/auth): OAuth 2.0 and API Key auth (12K tokens)

## API Reference
- [Users API](/docs/api/users): User CRUD operations (12K tokens)
- [Events API](/docs/api/events): Event streaming and webhooks (8K tokens)
```

skill.md is a file that declaratively states what a service “can do.” It lets an agent understand your capabilities without reading through long documentation. A basic structure has four sections: What I can accomplish, Required inputs, Constraints, and Key documentation.
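There is no single fixed schema yet; a minimal skill.md following the four-section structure above might look like this (the service name, limits, and paths are hypothetical):

```
# MyService Payments Skill

## What I can accomplish
- Create, capture, and refund payments via the REST API

## Required inputs
- API key (Authorization: Bearer <key>)
- Amount in minor currency units and an ISO 4217 currency code

## Constraints
- Rate limit: 10 requests/second per key
- Refunds only within 90 days of capture

## Key documentation
- /docs/quickstart (8K tokens)
- /docs/api/payments (12K tokens)
```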

Finally, AGENTS.md. A README variant for coding agents, placed in the project root. It is typically the first file a coding agent looks for when opening a project. Many open-source projects are already adopting it as a standard starting point.
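The format is intentionally loose; a minimal AGENTS.md usually covers setup, build/test commands, and project conventions. A sketch, with all commands hypothetical:

```
# AGENTS.md

## Setup
- Install dependencies: `npm install`

## Build and test
- Build: `npm run build`
- Run tests before committing: `npm test`

## Conventions
- TypeScript strict mode; do not disable lint rules
- API docs live in /docs — update them with any endpoint change
```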

Who in Korea Should Actually Care About AEO

Honestly, most Korean companies are not the primary audience for AEO right now. Developer portals are not that common. For e-commerce, distribution, finance, and service businesses, well-executed consumer GEO will deliver far greater returns.

AEO is worth taking seriously in these cases:

  • Companies with public APIs: Payment gateways, logistics, maps, authentication, and data API providers. It is already the norm for developers to write integration code with Cursor
  • Large corporations running internal dev platforms: Shared group APIs, auth gateways, and internal data platform docs are natural targets
  • Open source and developer tools: If you have a public GitHub project, AGENTS.md is close to mandatory
  • MCP server providers: skill.md maps directly to the expected convention

When we tried connecting an internal data platform to agents, they frequently misread internal wikis and metadata, producing garbled answers. Applying a few AEO principles (serving docs as Markdown, adding token counts to pages, and stripping navigation noise) was enough to produce a noticeable improvement in parse quality.

Four files will not solve everything. Organization-specific terminology and department-level tacit context remain. Those are a different class of problem that file design alone cannot fix.

The Incremental Cost Is Low If You Have GEO in Place

If GEO is already built, the implementation cost is smaller than it looks. A lot of the work overlaps.

  • Review robots.txt through an AI crawler lens (GPTBot, ClaudeBot, PerplexityBot, plus coding agent User-Agents)
  • Calculate token counts per documentation page (approximate as character count ÷ 4)
  • Draft /llms.txt with your key document list and token counts
  • Write skill.md for your top 3–5 APIs
  • Add AGENTS.md to internal GitHub repositories
  • Serve developer docs as Markdown — e.g., returning raw Markdown when .md is appended to the URL
  • Segment coding agent traffic in server logs to establish a baseline
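The chars ÷ 4 heuristic from the checklist is easy to script. A sketch that estimates token counts for a docs directory — note the ratio of roughly 4 characters per token holds for English text and differs for Korean, and the directory layout is an assumption:

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic for English text (assumption)

def estimate_tokens(text: str) -> int:
    """Approximate token count as character count / 4."""
    return len(text) // CHARS_PER_TOKEN

def report(docs_dir: str) -> list[tuple[str, int]]:
    """Return (path, estimated tokens) for each Markdown file under docs_dir."""
    rows = []
    for path in sorted(Path(docs_dir).rglob("*.md")):
        rows.append((str(path), estimate_tokens(path.read_text(encoding="utf-8"))))
    return rows

# Example with an in-memory string instead of files:
sample = "word " * 8000  # 40,000 characters
print(estimate_tokens(sample))  # → 10000
```

The per-page estimates feed directly into llms.txt annotations and flag any document drifting past the targets in the table above.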

Reviewing robots.txt and drafting llms.txt can be done in half a day. skill.md and AGENTS.md, starting with just a handful of top APIs, are not a heavy lift either.