# GEO Optimization Guide — Full Series

1. What Is GEO - AI Citation Strategy Beyond SEO
2. Each AI Cites Different Sources
3. On-Site GEO Technical Architecture - From Product DB to JSON-LD
4. Off-Site GEO - How to Win Over AI That Ignores Your Official Site ← current post
5. AEO - Why Coding Agents Read Documentation Differently

## You Added JSON-LD, So Why Is a Blog Getting Cited Instead?

We covered On-Site GEO through Part 3: we pulled JSON-LD from the product database, injected it into the HTML via SSR, and validated it with the Rich Results Test. Technically, nothing was missing.

Then we asked ChatGPT to "recommend products from brand ○○." It cited a Naver blog and TripAdvisor instead of the official site. On Perplexity, a Reddit thread showed up as the source.

No matter how well you build your own site, if AI primarily looks at external channels, the impact is cut in half. Here is the platform-by-platform citation source data from Part 2:

| Platform | Top Citation Source | Share |
| --- | --- | --- |
| ChatGPT | Directories/Listings (Yelp, G2, etc.) | 49% |
| Perplexity | Reddit/Communities | 31% |
| Gemini | Official websites | 52% |
| Google AIO | YouTube | #1 domain |

Except for Gemini, official websites do not dominate. Half of ChatGPT's citations come from external directories. If you are not managing those channels, you are leaving half of your citation share on the table. That is Off-Site GEO.

## How Off-Site GEO Differs

Part 1 briefly distinguished On-Site from Off-Site. Let's dig deeper.

**On-Site GEO** is about making your own site easy for AI to read: JSON-LD, Schema.org, SSR. The engineering team fixes code and ships it.

**Off-Site GEO** is about managing your brand across the external channels that AI actually references: directory profiles, community mentions, YouTube videos. Marketing and PR have to drive this.
| Dimension | On-Site GEO | Off-Site GEO |
| --- | --- | --- |
| Target | Your own domain | External platforms |
| Core techniques | JSON-LD, SSR, FAQ Schema | Directory management, communities, YouTube |
| Owner | Engineering | Marketing / PR / Brand |
| Control level | High (direct edits) | Low (indirect influence) |
| Effective on | Gemini (52%) | ChatGPT, Perplexity, AIO |

You cannot do just one. Raise the quality of official data with On-Site, and align brand consistency across external channels with Off-Site. They work as a set.

## Platform-Specific Off-Site Strategies

### ChatGPT: Directories and Listings Make Up Half

49% of ChatGPT citations come from third-party directories like Yelp, TripAdvisor, G2, and Capterra (Yext). Directory profiles get cited before your own website.

Why? ChatGPT has a weak native search index. It relies on Bing's search layer, and Bing assigns high domain authority to directory sites. Information listed on directories reaches ChatGPT's answers first.

What you can do right away:

- Check whether your profiles exist on key directories for your industry (Yelp, Google Business, G2, Capterra, TripAdvisor). Create them if missing, update them if stale.
- Ensure NAP consistency. Name, Address, Phone must be identical across all directories. If "Company Inc." and "Company Corp." are mixed, AI may treat them as separate entities.
- Manage reviews. AI uses review count and rating as trust signals. A profile with zero reviews is unlikely to be cited.

### Perplexity: Reddit and Communities Are the Source

31% of Perplexity citations come from community threads, including Reddit. It trusts real user discussions over official announcements.

This does not mean you should just post on Reddit. The reason Perplexity favors Reddit is that its question-and-answer structure is optimized for AI parsing. "What do you think of this product?" → "Used it for 6 months, ○○ is great but ○○ not so much." This kind of dialogue is the easiest format for AI to cite.

What to focus on:

- Identify subreddits where your brand or category is discussed.
Monitor them regularly.
- Contribute genuinely useful answers to product-related questions. Promotional posts get downvoted immediately on Reddit.
- For the Korean market, the dynamics differ. Instead of Reddit, communities like DCInside, Clien, and Ppomppu play a similar role. Data on how much Perplexity cites these sites for Korean-language queries is still scarce. This area needs hands-on testing.

### Google AI Overview: YouTube Is Surging

YouTube is the #1 cited domain in Google AI Overview (Ahrefs Brand Radar). Its share grew 34% in just six months.

As we noted in Part 2, the characteristics of cited videos are surprising. Videos with under 1,000 views get cited. Plenty have just a few dozen likes. What AI looks at is not popularity but how well the information is organized.

Common elements in videos that get cited:

| Element | Description | Citation Impact |
| --- | --- | --- |
| Timestamps/Chapters | Topic segments within the video | High |
| Structured description | Table of contents, links, key takeaways | High |
| Clear title | Question-based or "How to" format | Medium |
| Subtitles/Transcript | Even auto-generated ones enable parsing | Medium |
| View count/Likes | Popularity metrics | Low |

Even for videos you have already uploaded, adding timestamps to the description bumps up AI citation potential. Lay out a table of contents like "What this video covers: 1. ○○ 2. ○○" and place relevant links. Titles with clear search intent like "How to ○○" or "○○ vs ○○ Comparison" tend to do better.

## But First: Check Your robots.txt

Before diving into Off-Site, there is one thing to verify: make sure your own site is not blocking AI crawlers.

If you block GPTBot or PerplexityBot in robots.txt, those AI engines cannot crawl your site. Your On-Site GEO could be flawless, but if AI cannot read it, none of it matters.

We built a tool that lets you run the competitor robots.txt analysis discussed in Part 2 hands-on. Feed it a list of domains and it shows allow/block status for 10 AI crawlers as a heatmap.
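The core of that analysis is simple enough to sketch with the standard library's `urllib.robotparser`. This is a minimal illustration, not the notebook's exact code, and the crawler list below is a subset of the 10 the tool covers:

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# Illustrative subset of AI crawler User-Agents (the full tool checks 10)
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "PerplexityBot", "Google-Extended", "ClaudeBot"]

def parse_ai_access(robots_txt: str, site: str = "https://example.com/") -> dict:
    """Given robots.txt content, return True (allowed) / False (blocked) per AI crawler."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {bot: rp.can_fetch(bot, site) for bot in AI_CRAWLERS}

def check_domain(domain: str) -> dict:
    """Fetch a live robots.txt and run the same check against the site root."""
    with urlopen(f"https://{domain}/robots.txt", timeout=10) as resp:
        body = resp.read().decode("utf-8", "replace")
    return parse_ai_access(body, f"https://{domain}/")
```

Loop `check_domain` over your own domain plus competitors and you have the raw data behind the heatmap: each True/False cell tells you which AI engines can or cannot see which site.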
[Try it hands-on in Google Colab]

It runs on Python's standard library alone, no API keys needed. Swap in competitor domains to map out your entire industry.

### What You Can Read from robots.txt

If a competitor is blocking GPTBot, your chances of getting cited on that AI platform go up relatively. It is a gap you can fill. Conversely, if competitors have fully opened up and you are the only one blocking, only competitors show up in AI search results while you are invisible.

One thing worth knowing: blocking GPTBot does not block ChatGPT-User (browsing mode), which is a separate User-Agent. Browsing mode may still access your site. Likewise, blocking Google-Extended does not affect the base Googlebot, so you can keep search visibility while blocking AI training specifically.

```
# Allow search, block AI training only
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /
```

With this setup, your site appears normally in Google search but is excluded from Gemini AI training and ChatGPT training data. ChatGPT browsing mode is still allowed, keeping real-time citation possible.

## Off-Site Channel Priorities by Industry

The same strategy does not work for every industry. The external channels AI references most vary by sector.

| Industry | #1 Off-Site Channel | #2 | Notes |
| --- | --- | --- | --- |
| E-commerce/Retail | Google Business + Directories | YouTube reviews | Balancing catalog protection vs AI exposure |
| SaaS/B2B | G2, Capterra reviews | Reddit (r/SaaS, etc.) | Review count directly drives citation likelihood |
| Hotels/Travel | TripAdvisor, Booking | YouTube tours | Freshness of pricing/availability data is key |
| Food/Consumer goods | Community reviews | YouTube food/review content | In Korea, Naver blogs still carry significant weight |
| Finance/Fintech | News/Media | Specialized forums | Many block AI crawlers due to regulatory concerns |

E-commerce is especially tough. Expose product prices and inventory to AI, and competitors can scrape it in real time.
Block it, and you vanish from AI search. Where to draw that line is the critical GEO decision for retail.

## Off-Site GEO Checklist

Starting with what you can execute immediately:

**This week**

- Check whether your robots.txt blocks AI crawlers → diagnose with the Colab analyzer
- Compare robots.txt across 3 competitors
- Verify profile existence on major directories (Google Business, industry-specific directories)

**This month**

- Update directory profile information (verify NAP consistency)
- Add timestamps/chapters/structured descriptions to existing YouTube videos
- Build a list of communities/subreddits where your brand is mentioned

**This quarter**

- Establish per-platform AI citation monitoring
- Audit brand consistency across Off-Site channels
- Redesign robots.txt policy to align with GEO strategy
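The NAP-consistency item is easy to spot-check in code once you have exported your directory listings. A minimal sketch, assuming profiles are collected as dicts; the field names (`directory`, `name`, `address`, `phone`) and the light normalization rules are illustrative assumptions, not a fixed schema:

```python
import re

def normalize_nap(profile: dict) -> tuple:
    """Lightly normalize Name, Address, Phone: casefold, collapse whitespace,
    keep only digits in the phone. Substantive differences still surface."""
    name = re.sub(r"\s+", " ", profile["name"].casefold()).strip()
    addr = re.sub(r"\s+", " ", profile["address"].casefold().replace(",", " ")).strip()
    phone = re.sub(r"\D", "", profile["phone"])  # digits only: formatting-agnostic
    return (name, addr, phone)

def nap_mismatches(profiles: list[dict]) -> list[str]:
    """Return the directories whose NAP differs from the first (reference) profile."""
    baseline = normalize_nap(profiles[0])
    return [p["directory"] for p in profiles[1:] if normalize_nap(p) != baseline]
```

With this, the article's own example is caught: a listing named "Company Corp." is flagged as a mismatch against a "Company Inc." reference profile, which is exactly the kind of inconsistency that can make AI treat one brand as two entities.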