
Notebook 3 — Content

Top cited sources across 9 engines for the GEO/AI-visibility category · This notebook answers: "Which publishers and aggregators do the LLMs trust in our space, and where should we pitch for inclusion?"

Data Provenance — Each cited source was extracted from the raw JSON response of an (engine, prompt) pair. The complete cited-source tree is at github.com/agentgeek-geo/audit-logs/2026-06-11/clarivy-self-audit-01/{engine}/NNNN.json (see the citations field of each JSON). License: CC-BY-4.0.
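The extraction described above could be reproduced against a local clone of the audit logs roughly as follows. This is a minimal sketch, not the audit's own tooling: it assumes each NNNN.json file holds one JSON object whose citations field is a list of URL strings, and the function name count_cited_domains is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path
from urllib.parse import urlparse


def count_cited_domains(audit_root):
    """Count structured citations per (domain, engine) pair.

    Assumption: files are laid out as {engine}/NNNN.json under audit_root,
    and each file holds one JSON object with an optional "citations" list
    of URL strings. Adjust the field access if the real schema differs.
    """
    counts = Counter()
    for path in sorted(Path(audit_root).glob("*/*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        for url in record.get("citations") or []:
            domain = urlparse(url).netloc.lower()
            # Treat www.example.com and example.com as the same domain.
            if domain.startswith("www."):
                domain = domain[4:]
            if domain:
                counts[(domain, path.parent.name)] += 1
    return counts
```

Keying the counter by (domain, engine) keeps enough detail to fill both the "Times cited" and "Engines that cited it" columns of the ranking table.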

Top cited domains — full ranking

This is the authoritative list of every domain the 9 LLMs cited in their answers to our 30 prompts. On Day 1 the list is short, but it tells us which sources to target first, because they are the ones the LLMs are already reading.

Rank	Cited domain	Type	Times cited	Engines that cited it	Best prompt category to pitch
1	tryprofound.com	Western GEO vendor (SaaS)	1	Perplexity (Competitive B)	"alternatives to Profound for跨境品牌"

The 1-citation finding and what it means

On Day 1, only one (engine, prompt) pair surfaced a citation at all: Perplexity Sonar cited tryprofound.com on the prompt "alternatives to Profound for跨境品牌" (跨境品牌 means "cross-border brands"). This is the only citation in the audit. Here is the logged excerpt from audit-logs/2026-06-11/clarivy-self-audit-01/perplexity/0014.json:

"response_citation_url": "https://www.tryprofound.com/"
"raw_response_excerpt": "[prompt masked; no brand mention; top citation Profound]"

A single citation is still a useful finding. It tells us three things:

Perplexity is the only engine that cites external sources on this prompt type (the other 8 either don't cite, or cite inside the response text but not as a structured URL). This is consistent with Perplexity's product positioning as "answer engine + citations".
The "alternatives to Profound for跨境品牌" prompt is the highest-leverage prompt we have: it reflects a buyer who is actively comparing vendors, a named vendor is already being cited, and that vendor is Western, which leaves a bilingual alternative as the obvious gap. If we get cited on this prompt, we capture the buyer's mindshare at the moment of vendor comparison.
The total citation-source pool for this category is much smaller than I expected — across 270 datapoints, we saw exactly 1 structured citation. This means the "publishers to pitch" list is much shorter than the v3 plan implied; in reality, the LLMs are generating most of their answers from parametric memory (their training data), not from live web retrieval. The Princeton GEO paper, the llms.txt proposal, and Baidu Zhanzhang are the only "sources" that appear consistently in the methodology category — but those are mentioned in the answer text, not as structured citations.
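The arithmetic behind the 270 datapoints and the single structured citation can be checked in a few lines. The variable names are illustrative; the figures (9 engines, 30 prompts, 1 citation) come from the text above.

```python
ENGINES = 9               # engines audited
PROMPTS = 30              # prompts sent to each engine
STRUCTURED_CITATIONS = 1  # Perplexity citing tryprofound.com

datapoints = ENGINES * PROMPTS
citation_rate = STRUCTURED_CITATIONS / datapoints

print(f"{datapoints} datapoints, {citation_rate:.2%} with a structured citation")
# prints "270 datapoints, 0.37% with a structured citation"
```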

What the LLMs cited in answer text (not as structured URLs)

Beyond the 1 structured citation, the LLMs referenced several concepts, papers, and brands in the body of their answers. These are the entities the LLMs are "thinking with" when they answer GEO questions:

Entity	Type	Engines that mentioned it	Category
Princeton GEO paper (Aggarwal et al.)	Academic paper	ChatGPT, Claude, Perplexity, DeepSeek, Kimi, ERNIE	Methodology (C)
llms.txt proposal	Technical proposal	ChatGPT, Claude, Perplexity, Kimi, ERNIE	Methodology (C)
Baidu Zhanzhang (百度站长平台)	Search-engine platform	ERNIE, Kimi, Doubao	Methodology (C)
ByteDance Juliang (巨量引擎)	Ad / SEO platform	Doubao, Kimi	Methodology (C)
Profound	Western GEO vendor	ChatGPT, Perplexity, Gemini	Competitive (B), Purchase (E)
Otterly.AI	Western GEO vendor	ChatGPT, Perplexity, Gemini	Competitive (B), Purchase (E)
Peec.AI	Western GEO vendor	ChatGPT, Perplexity	Competitive (B)
LLMrefs	Western GEO vendor	ChatGPT	Competitive (B)
蝉妈妈 AI	CN GEO vendor (TikTok analytics)	Doubao, Kimi, ERNIE	Competitive (B), Purchase (E)
悠伞	CN GEO vendor	Doubao, Kimi	Buying intent (A), Competitive (B)
新榜 (NewRank)	CN content analytics	Doubao, Kimi	Competitive (B)
36kr	CN tech media	MetaSo (search results)	Methodology (C)
知乎 (Zhihu)	CN Q&A platform	MetaSo (search results), Doubao	Methodology (C)
Search Engine Land	Western SEO/GEO media	Perplexity, ChatGPT	Methodology (C)
ahrefs blog	Western SEO tool blog	Perplexity	Methodology (C)
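One way to read the table is by breadth: how many engines mention each entity, and how many entities each engine mentions. The sketch below transcribes the table by hand (shortening "MetaSo (search results)" to "MetaSo" and dropping the Chinese glosses on two names); the variable names are illustrative and nothing here comes from the audit's own code.

```python
from collections import Counter

# Transcribed from the table above.
MENTIONS = {
    "Princeton GEO paper": ["ChatGPT", "Claude", "Perplexity", "DeepSeek", "Kimi", "ERNIE"],
    "llms.txt proposal": ["ChatGPT", "Claude", "Perplexity", "Kimi", "ERNIE"],
    "Baidu Zhanzhang": ["ERNIE", "Kimi", "Doubao"],
    "ByteDance Juliang": ["Doubao", "Kimi"],
    "Profound": ["ChatGPT", "Perplexity", "Gemini"],
    "Otterly.AI": ["ChatGPT", "Perplexity", "Gemini"],
    "Peec.AI": ["ChatGPT", "Perplexity"],
    "LLMrefs": ["ChatGPT"],
    "蝉妈妈 AI": ["Doubao", "Kimi", "ERNIE"],
    "悠伞": ["Doubao", "Kimi"],
    "新榜 (NewRank)": ["Doubao", "Kimi"],
    "36kr": ["MetaSo"],
    "知乎 (Zhihu)": ["MetaSo", "Doubao"],
    "Search Engine Land": ["Perplexity", "ChatGPT"],
    "ahrefs blog": ["Perplexity"],
}

# Entities ordered by how many engines mention them (widest first).
by_breadth = sorted(MENTIONS.items(), key=lambda kv: len(kv[1]), reverse=True)

# Number of listed entities each engine mentioned at least once.
per_engine = Counter(engine for engines in MENTIONS.values() for engine in engines)

for entity, engines in by_breadth[:3]:
    print(f"{entity}: {len(engines)} engines")
```

Sorting by breadth makes the split visible: the two methodology sources at the top are mentioned by engines on both sides, while vendor names stay within either the Western or the Chinese engines.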

Pitching priority (which publishers to get cited on first)

Based on the above, here is the Day-1 pitching list. Each entry names a publisher or source, the engines that reference it, and the prompt category where a citation would have the highest leverage.

Perplexity's citation list (live web): Perplexity is the only engine that returned a structured external citation in this audit. We should pitch Profound/Otterly/Peec "alternative" articles to the publications that Perplexity already retrieves. We don't yet know which publications those are; the next self-audit will use a follow-up prompt set designed to surface the underlying source list.
36kr + 知乎 (for MetaSo + Chinese engines): These are the two CN publishers that appear in 秘塔 MetaSo's search results for GEO methodology. A long-form post in 36kr's enterprise column, or a structured Zhihu answer with named authorship, would be the highest-leverage CN-side publishing action.
Search Engine Land + ahrefs (for Western engines): These are the two Western publishers that Perplexity mentions in its answer text for GEO methodology (ChatGPT also mentions Search Engine Land). A guest post or a contributed data point (e.g. "the 0/270 baseline") would be a high-leverage Western publishing action.
Princeton GEO paper citation graph: Six of the nine engines referenced the Aggarwal et al. paper in their answer text, more than any other entity. We can't get into the paper, but we may be able to earn a citation in the authors' next publication if we publish 30+ unique datapoints in their domain.

What this notebook does NOT measure

Citation rank within a response. A citation in position 1 of a 5-citation list is worth more than a citation in position 5; this notebook treats all citations equally. The next snapshot will add position-weighted scoring.
Citation freshness. A 2026 citation is more valuable than a 2024 citation. We will add a freshness column in the next snapshot.
Citation sentiment. Some citations are positive ("a leading tool"), some are negative ("limited compared to"), some are neutral ("an example of"). The next snapshot will add a sentiment classifier.
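To illustrate the first gap, position-weighted scoring could look something like the sketch below. The notebook does not say which weighting the next snapshot will use; the reciprocal-rank weight (1 / position) and the function name position_weighted_scores are assumptions chosen for illustration.

```python
from collections import defaultdict


def position_weighted_scores(responses):
    """Sum a reciprocal-rank weight (1 / position) per cited domain.

    `responses` is a list of citation lists, each in the order the engine
    returned them. The 1 / position weight is an illustrative assumption,
    not the scheme the audit has committed to.
    """
    scores = defaultdict(float)
    for citations in responses:
        for position, domain in enumerate(citations, start=1):
            scores[domain] += 1.0 / position
    return dict(scores)


# Hypothetical input: two responses citing placeholder domains.
example = [["a.example", "b.example"], ["b.example"]]
print(position_weighted_scores(example))
# prints {'a.example': 1.0, 'b.example': 1.5}
```

Under this weighting a domain cited first in one response and second in another scores 1.5, while the current notebook would count it as 2 equal citations.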
