GEO Optimization Guide: Full Series

  1. What Is GEO - AI Citation Strategy Beyond SEO
  2. Each AI Cites Different Sources
  3. On-Site GEO Technical Architecture - From Product DB to JSON-LD
  4. Off-Site GEO - How to Win Over AI That Ignores Your Official Site ← current post
  5. AEO - Why Coding Agents Read Documentation Differently

You Added JSON-LD, So Why Is a Blog Getting Cited Instead?

We covered On-Site GEO through Part 3: we pulled JSON-LD from the product database, injected it into the HTML <head> via SSR, and validated it with the Rich Results Test. Technically, nothing was missing.

Then we asked ChatGPT to “recommend products from brand ○○.” It cited a Naver blog and TripAdvisor instead of the official site. On Perplexity, a Reddit thread showed up as the source.

No matter how well you build your own site, if AI primarily looks at external channels, the impact is cut in half.

Here is the platform-by-platform citation source data from Part 2:

| Platform | Top Citation Source | Share |
|---|---|---|
| ChatGPT | Directories/Listings (Yelp, G2, etc.) | 49% |
| Perplexity | Reddit/Communities | 31% |
| Gemini | Official websites | 52% |
| Google AIO | YouTube | #1 domain |

[Figure: Off-Site GEO Channel Map]

Except for Gemini, official websites do not dominate. Roughly half of ChatGPT’s citations (49%) come from external directories. If you are not managing those channels, you are leaving about half of your potential ChatGPT citation share on the table.

That is Off-Site GEO.

How Off-Site GEO Differs

Part 1 briefly distinguished On-Site from Off-Site. Let’s dig deeper.

On-Site GEO is about making your own site easy for AI to read. JSON-LD, Schema.org, SSR. The engineering team fixes code and ships it.

Off-Site GEO is about managing your brand across the external channels that AI actually references. Directory profiles, community mentions, YouTube videos. Marketing and PR have to drive this.

| Dimension | On-Site GEO | Off-Site GEO |
|---|---|---|
| Target | Your own domain | External platforms |
| Core techniques | JSON-LD, SSR, FAQ Schema | Directory management, communities, YouTube |
| Owner | Engineering | Marketing / PR / Brand |
| Control level | High (direct edits) | Low (indirect influence) |
| Effective on | Gemini (52%) | ChatGPT, Perplexity, AIO |

You cannot do just one. Raise the quality of official data with On-Site, and align brand consistency across external channels with Off-Site. They work as a set.

Platform-Specific Off-Site Strategies

ChatGPT: Directories and Listings Make Up Half

49% of ChatGPT citations come from third-party directories like Yelp, TripAdvisor, G2, and Capterra (Yext). Directory profiles get cited before your own website.

Why? ChatGPT has a weak native search index. It relies on Bing’s search layer, and Bing tends to rank directory sites highly. Information listed on directories therefore reaches ChatGPT’s answers first.

What you can do right away:

  • Check whether your profiles exist on key directories for your industry (Yelp, Google Business, G2, Capterra, TripAdvisor). Create them if missing, update them if stale
  • Ensure NAP consistency. Name, Address, Phone must be identical across all directories. If “Company Inc.” and “Company Corp.” are mixed, AI may treat them as separate entities
  • Manage reviews. AI uses review count and rating as trust signals. A profile with zero reviews is unlikely to be cited
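The NAP check above can be scripted once you have collected your profile data by hand or by export. The following is a minimal sketch, not a definitive tool: the `profiles` dictionary, its directory keys, and the sample values are hypothetical, and the comparison is deliberately strict (only whitespace is collapsed), on the assumption that any visible difference should be reviewed.

```python
from collections import defaultdict

NAP_FIELDS = ("name", "address", "phone")


def find_nap_mismatches(profiles):
    """Return {field: {value: [directories]}} for fields whose values differ."""
    mismatches = {}
    for field in NAP_FIELDS:
        variants = defaultdict(list)
        for directory, profile in profiles.items():
            # Collapse whitespace only; any other difference counts as a mismatch.
            value = " ".join(profile.get(field, "").split())
            variants[value].append(directory)
        if len(variants) > 1:
            mismatches[field] = dict(variants)
    return mismatches


# Hypothetical profile data, typed in by hand for illustration.
profiles = {
    "yelp": {"name": "Company Inc.", "address": "1 Main St, Springfield", "phone": "+1-555-0100"},
    "g2": {"name": "Company Corp.", "address": "1 Main St, Springfield", "phone": "+1-555-0100"},
    "google_business": {"name": "Company Inc.", "address": "1 Main St, Springfield", "phone": "+1-555-0100"},
}

mismatches = find_nap_mismatches(profiles)
for field, variants in mismatches.items():
    print(field, variants)
```

Here only the `name` field is reported, because the G2 profile uses “Company Corp.” while the other two use “Company Inc.”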

Perplexity: Reddit and Communities Are the Source

31% of Perplexity citations come from community threads, including Reddit. It trusts real user discussions over official announcements.

This does not mean you should just post on Reddit. The reason Perplexity favors Reddit is that its question-and-answer structure is optimized for AI parsing. “What do you think of this product?” → “Used it for 6 months, ○○ is great but ○○ not so much.” This kind of dialogue is the easiest format for AI to cite.

What to focus on:

  • Identify subreddits where your brand or category is discussed. Monitor them regularly
  • Contribute genuinely useful answers to product-related questions. Promotional posts get downvoted immediately on Reddit
  • For the Korean market, the dynamics differ. Instead of Reddit, communities like DCInside, Clien, and Ppomppu play a similar role. Data on how much Perplexity cites these sites for Korean-language queries is still scarce. This area needs hands-on testing

Google AI Overview: YouTube Is Surging

YouTube is the #1 cited domain in Google AI Overview (Ahrefs Brand Radar). Its share grew 34% in just six months.

As we noted in Part 2, the characteristics of cited videos are surprising. Videos with under 1,000 views get cited. Plenty have just a few dozen likes. What AI looks at is not popularity but how well the information is organized.

Common elements in videos that get cited:

| Element | Description | Citation Impact |
|---|---|---|
| Timestamps/Chapters | Topic segments within the video | High |
| Structured description | Table of contents, links, key takeaways | High |
| Clear title | Question-based or “How to” format | Medium |
| Subtitles/Transcript | Even auto-generated ones enable parsing | Medium |
| View count/Likes | Popularity metrics | Low |

Even for videos you have already uploaded, adding timestamps to the description bumps up AI citation potential. Lay out a table of contents like “What this video covers: 1. ○○ 2. ○○” and place relevant links. Titles with clear search intent like “How to ○○” or “○○ vs ○○ Comparison” tend to do better.
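If you have many videos to update, the description layout described above can be generated rather than typed. This is a rough sketch under stated assumptions: `build_description`, its parameters, and the sample chapter titles and URL are all made up for illustration, and the only platform rule it enforces is that the first chapter starts at 0:00, which YouTube’s chapter feature expects. Check current YouTube Help for any other requirements.

```python
def format_timestamp(seconds):
    """Format seconds as M:SS, or H:MM:SS for videos over an hour."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def build_description(summary, chapters, links=()):
    """Build a description with a summary, a timestamped outline, and links.

    chapters: list of (start_seconds, title) tuples.
    links: list of (label, url) tuples.
    """
    chapters = sorted(chapters)
    if not chapters or chapters[0][0] != 0:
        raise ValueError("the first chapter should start at 0:00")
    lines = [summary, "", "What this video covers:"]
    lines += [f"{format_timestamp(start)} {title}" for start, title in chapters]
    if links:
        lines += ["", "Links:"]
        lines += [f"{label}: {url}" for label, url in links]
    return "\n".join(lines)


# Hypothetical video metadata.
description = build_description(
    "How to set up the product, step by step.",
    [(0, "Intro"), (75, "Installation"), (3725, "Troubleshooting")],
    [("Docs", "https://example.com/docs")],
)
print(description)
```

The output can be pasted into the description field of an existing video; nothing in the sketch talks to YouTube itself.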

But First: Check Your robots.txt

Before diving into Off-Site, there is one thing to verify. Make sure your own site is not blocking AI crawlers.

If you block GPTBot or PerplexityBot in robots.txt, those AI engines cannot crawl your site. Your On-Site GEO could be flawless, but if AI cannot read it, none of it matters.

We built a tool that lets you run the competitor robots.txt analysis discussed in Part 2 hands-on. Feed it a list of domains and it shows allow/block status for 10 AI crawlers as a heatmap.

Try it hands-on in Google Colab

It runs on Python’s standard library alone, no API keys needed. Swap in competitor domains to map out your entire industry.
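To give a sense of how such an analysis can work with the standard library alone, here is a minimal sketch built on `urllib.robotparser`. It is not the Colab tool itself. The crawler list is limited to the user agents named in this article, and the domains and robots.txt contents in `SAMPLES` are invented so that the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Only the AI-related user agents named in this article; extend as needed.
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "PerplexityBot", "Google-Extended"]


def crawler_status(robots_txt, path="/"):
    """Return {crawler: True if allowed to fetch `path`} for one robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in AI_CRAWLERS}


def build_matrix(robots_by_domain):
    """Return {domain: {crawler: allowed}} for several robots.txt files."""
    return {domain: crawler_status(text) for domain, text in robots_by_domain.items()}


# Invented robots.txt contents for two hypothetical domains.
SAMPLES = {
    "open.example": "User-agent: *\nAllow: /\n",
    "blocks-training.example": (
        "User-agent: GPTBot\nDisallow: /\n\n"
        "User-agent: Google-Extended\nDisallow: /\n"
    ),
}

matrix = build_matrix(SAMPLES)
print(f"{'domain':26}" + "".join(f"{agent:17}" for agent in AI_CRAWLERS))
for domain, status in matrix.items():
    cells = "".join(f"{'allow' if status[a] else 'block':17}" for a in AI_CRAWLERS)
    print(f"{domain:26}{cells}")
```

For live sites, `RobotFileParser.set_url()` followed by `read()` fetches the file instead of parsing a string. Note that this reports how Python’s parser reads the rules; individual crawlers may interpret edge cases differently.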

What You Can Read from robots.txt

If a competitor is blocking GPTBot, your chances of getting cited on that AI platform go up relatively. It is a gap you can fill.

Conversely, if competitors have fully opened up and you are the only one blocking, only competitors show up in AI search results while you are invisible.
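Both readings can be pulled out of an allow/block matrix mechanically. The sketch below assumes you already have such a matrix as a nested dictionary; the function name `citation_gaps`, the domains, and the values are hypothetical.

```python
def citation_gaps(matrix, own_domain):
    """Compare your crawler policy with competitors'.

    matrix: {domain: {crawler: True if allowed}}.
    Returns (opportunities, risks):
      opportunities: crawlers you allow that some competitors block.
      risks: crawlers you block that some competitors allow.
    """
    own = matrix[own_domain]
    competitors = {d: s for d, s in matrix.items() if d != own_domain}
    opportunities, risks = {}, {}
    for crawler, own_allowed in own.items():
        # A crawler missing from a competitor's row is treated as allowed.
        blocking = sorted(d for d, s in competitors.items() if not s.get(crawler, True))
        allowing = sorted(d for d, s in competitors.items() if s.get(crawler, True))
        if own_allowed and blocking:
            opportunities[crawler] = blocking
        if not own_allowed and allowing:
            risks[crawler] = allowing
    return opportunities, risks


# Hypothetical matrix for your site and two competitors.
matrix = {
    "us.example": {"GPTBot": True, "PerplexityBot": False},
    "rival-a.example": {"GPTBot": False, "PerplexityBot": True},
    "rival-b.example": {"GPTBot": True, "PerplexityBot": True},
}

opportunities, risks = citation_gaps(matrix, "us.example")
print("Gaps you can fill:", opportunities)
print("Where only competitors are visible:", risks)
```

In this made-up data, GPTBot is an opportunity (one rival blocks it) and PerplexityBot is a risk (you block it while both rivals allow it).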

One thing worth knowing: blocking GPTBot does not block ChatGPT-User (browsing mode), which is a separate User-Agent. Browsing mode may still access your site. Blocking Google-Extended does not affect the base Googlebot. You can keep search visibility while blocking AI training specifically.

# Allow search, block AI training only
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /

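With this setup, your site still appears normally in Google Search, while Google-Extended and GPTBot are told not to use it for Gemini and ChatGPT training. ChatGPT-User (browsing mode) stays allowed, so real-time citation remains possible. Keep in mind that robots.txt is a request that crawlers choose to honor, not an access control.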
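Before deploying a policy like this, it is worth confirming that it parses the way you intend. A quick sanity check with Python’s `urllib.robotparser` might look like the following. It reflects only how the standard-library parser reads the rules, not a guarantee of how each crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# The policy from the example above, embedded as a string for an offline check.
POLICY = """\
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /
"""

parser = RobotFileParser()
parser.parse(POLICY.splitlines())

agents = ("Googlebot", "Google-Extended", "GPTBot", "ChatGPT-User")
results = {agent: parser.can_fetch(agent, "/") for agent in agents}
for agent, allowed in results.items():
    print(f"{agent:16} {'allow' if allowed else 'block'}")
```

Googlebot and ChatGPT-User come back as allowed, and Google-Extended and GPTBot as blocked, which matches the intent stated above.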

Off-Site Channel Priorities by Industry

The same strategy does not work for every industry. The external channels AI references most vary by sector.

| Industry | #1 Off-Site Channel | #2 Off-Site Channel | Notes |
|---|---|---|---|
| E-commerce/Retail | Google Business + Directories | YouTube reviews | Balancing catalog protection vs AI exposure |
| SaaS/B2B | G2, Capterra reviews | Reddit (r/SaaS, etc.) | Review count directly drives citation likelihood |
| Hotels/Travel | TripAdvisor, Booking | YouTube tours | Freshness of pricing/availability data is key |
| Food/Consumer goods | Community reviews | YouTube food/review content | In Korea, Naver blogs still carry significant weight |
| Finance/Fintech | News/Media | Specialized forums | Many block AI crawlers due to regulatory concerns |

E-commerce is especially tough. Expose product prices and inventory to AI, and competitors can scrape it in real time. Block it, and you vanish from AI search. Where to draw that line is the critical GEO decision for retail.

Off-Site GEO Checklist

Starting with what you can execute immediately:

This week

  • Check whether your robots.txt blocks AI crawlers → Diagnose with the Colab analyzer
  • Compare robots.txt across 3 competitors
  • Verify profile existence on major directories (Google Business, industry-specific directories)

This month

  • Update directory profile information (verify NAP consistency)
  • Add timestamps/chapters/structured descriptions to existing YouTube videos
  • Build a list of communities/subreddits where your brand is mentioned

This quarter

  • Establish per-platform AI citation monitoring
  • Audit brand consistency across Off-Site channels
  • Redesign robots.txt policy to align with GEO strategy
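For the per-platform citation monitoring item, one simple starting point is a log of which URLs each AI platform cited for your test prompts, tallied by domain. The sketch below assumes that log is collected manually. The `records` list and its URLs are invented, and `citation_share` is a hypothetical helper, not part of any platform’s API. It needs Python 3.9 or later for `str.removeprefix`.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def citation_share(records):
    """Return {platform: {domain: share of that platform's citations}}.

    records: iterable of (platform, cited_url) pairs.
    """
    counts = defaultdict(Counter)
    for platform, url in records:
        # Group by host name, ignoring a leading "www.".
        domain = urlparse(url).netloc.lower().removeprefix("www.")
        counts[platform][domain] += 1
    return {
        platform: {domain: n / sum(c.values()) for domain, n in c.most_common()}
        for platform, c in counts.items()
    }


# Invented log entries: (platform, URL cited in the answer).
records = [
    ("ChatGPT", "https://www.yelp.com/biz/example"),
    ("ChatGPT", "https://www.yelp.com/biz/example-2"),
    ("ChatGPT", "https://example.com/products"),
    ("ChatGPT", "https://www.g2.com/products/example"),
    ("Perplexity", "https://www.reddit.com/r/example/comments/1"),
]

shares = citation_share(records)
for platform, by_domain in shares.items():
    for domain, share in by_domain.items():
        print(f"{platform:12} {domain:16} {share:.0%}")
```

Re-running the same prompts on a schedule and comparing the shares over time shows whether directory, community, or official-site citations are shifting for your brand.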