Methodology

v1.0 · 11 June 2026 · This page is updated whenever the method changes. See the Day-1 self-audit for a worked example.

1. The 9 engines we cover (fixed list)

We commit to a fixed list. We will not swap engines without notice, and we will not add a 10th engine without versioning this page.

#	Engine	Lang	Method	ZDR (zero data retention)?
1	豆包 Doubao (ByteDance)	zh	Volcengine Ark API	Yes
2	Kimi (Moonshot AI)	zh	Moonshot API	Yes
3	DeepSeek	zh	DeepSeek API	Yes
4	文心一言 ERNIE (Baidu)	zh	Qianfan API (enterprise)	Yes
5	秘塔 MetaSo (Shanghai Xiyu)	zh	Playwright UI capture (no public API)	n/a (UI scrape, no prompt sent to vendor)
6	ChatGPT (OpenAI)	en	OpenAI API · gpt-4o	Yes
7	Perplexity Sonar	en	Sonar API · online mode	30-day default (commercial DPA + ZDR available)
8	Claude (Anthropic)	en	Anthropic API · claude-3-5-sonnet	Yes
9	Gemini + Google AI Overviews	en	Vertex AI · gemini-2.0-flash	Per Google Cloud DPA

2. The 4-column methodology matrix

Every datapoint in every audit carries these 4 columns. If any cell is empty, the datapoint is rejected before the report is drafted.

Column	What it is	Example
Engine	The exact engine called, including model version (no "gpt-4" — must be "gpt-4o-2024-08-06")	"OpenAI gpt-4o-2024-08-06"
Method	API call vs. UI capture vs. search-engine scrape; ZDR status; proxy used (if any)	"API · ZDR"
Sampled at	ISO 8601 timestamp with timezone (UTC+8 by default)	"2026-06-11T09:00:00+08:00"
Reproduce	Direct URL to the raw JSON in the audit-logs GitHub repo	"github.com/.../2026-06-11/chatgpt-001.json"
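The rejection rule above can be sketched as a small validator. This is a minimal illustration, not the production pipeline: the dictionary keys (`engine`, `method`, `sampled_at`, `reproduce`) and the function name `validate_datapoint` are assumptions made for this example.

```python
from datetime import datetime

# Assumed key names for the 4 columns; the real schema may differ.
REQUIRED_COLUMNS = ("engine", "method", "sampled_at", "reproduce")


def validate_datapoint(row: dict) -> list[str]:
    """Return a list of problems; an empty list means the datapoint is accepted."""
    problems = []
    for column in REQUIRED_COLUMNS:
        value = row.get(column)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"empty column: {column}")
    if problems:
        return problems  # any empty cell rejects the datapoint outright

    # "Sampled at" must be ISO 8601 and carry an explicit timezone offset.
    try:
        sampled_at = datetime.fromisoformat(row["sampled_at"])
    except ValueError:
        problems.append("sampled_at is not ISO 8601")
    else:
        if sampled_at.utcoffset() is None:
            problems.append("sampled_at has no timezone offset")
    return problems


example = {
    "engine": "OpenAI gpt-4o-2024-08-06",
    "method": "API · ZDR",
    "sampled_at": "2026-06-11T09:00:00+08:00",
    "reproduce": "github.com/.../2026-06-11/chatgpt-001.json",
}
print(validate_datapoint(example))                       # []
print(validate_datapoint({**example, "reproduce": ""}))  # ['empty column: reproduce']
```

The check returns every problem it finds instead of raising on the first one, so a rejected datapoint can be logged with the full reason.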

3. The 30-prompt matrix (5 categories × 5–7 prompts)

The full prompt set is at /audit/self-audit-01.html. By category:

Buying intent (5–7 prompts): "best GEO audit service for [vertical] in [year]" variants. Surfaces vendor-recognition patterns.
Competitive (5–7): "[Vendor A] vs [Vendor B] vs [Vendor C]" variants. Surfaces comparative-recognition and opportunity gaps.
Methodology (5–6): "what is [concept]" / "how to do [task]" variants. Surfaces citation-source patterns for educational queries.
Brand-specific (5): "[your brand] [predicate]" variants. Surfaces entity-recognition status.
Purchase intent (5): "GEO audit pricing", "is GEO audit worth it" variants. Surfaces funnel-bottom visibility.
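The bracketed placeholders in the templates above can be filled in mechanically. The sketch below is one possible way to do it; the function name `fill_template` and the sample values ("B2B SaaS", "2026") are hypothetical and not taken from the actual prompt set.

```python
import re

# Matches one [placeholder]; names may contain spaces, e.g. [Vendor A].
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each [name] with values[name]; raise KeyError if a value is missing."""
    def substitute(match: re.Match) -> str:
        return values[match.group(1)]
    return PLACEHOLDER.sub(substitute, template)


prompt = fill_template(
    "best GEO audit service for [vertical] in [year]",
    {"vertical": "B2B SaaS", "year": "2026"},  # hypothetical sample values
)
print(prompt)  # best GEO audit service for B2B SaaS in 2026
```

Raising on a missing value is a deliberate choice in this sketch: a prompt with an unfilled placeholder never reaches an engine.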

For Enterprise ($1,499), we extend each category to 24 prompts (120 total) and add 3 more languages: Traditional Chinese (繁中), Japanese (日本語), and Korean (한국어).

4. The 5-notebook deliverable structure

Each Standard Audit ($299) ships as 5 independent PDF notebooks rather than one long PDF. Each can be read in 5 minutes, updated independently, and shared with a different stakeholder (CMO, content lead, dev lead, legal, etc.).

Notebook 1 — Index. Score table, mention counts, top citation sources, competitor deltas. The one-page exec summary.
Notebook 2 — Intent. Per-category breakdown of which prompt types the brand wins, ties, or loses on. Tells you which queries to optimize for first.
Notebook 3 — Content. Top 20 cited sources across the 9 engines for your category. Tells you which publishers and aggregators the LLMs trust for your space.
Notebook 4 — Quotables. Verbatim phrases from LLM answers where your brand is (or is not) mentioned. Actionable for content writers — they can pattern-match the language.
Notebook 5 — Strategy. 5–10 prioritized actions: schema.org additions, llms.txt entries, content gaps, link targets, FAQ candidates. Each action has an effort estimate and an expected citation-rate lift.
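The ordering of the Notebook 5 action list can be illustrated with a short sketch. The `Action` fields, their units, and the sample actions are hypothetical; the only behaviour taken from this page is that actions are ranked by expected citation-rate lift and that the effort estimate is reported without affecting the order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    title: str
    effort_days: float    # hypothetical unit for the effort estimate
    expected_lift: float  # hypothetical unit for expected citation-rate lift


def prioritize(actions: list[Action]) -> list[Action]:
    """Order by expected lift, highest first; effort is shown but not used for ranking."""
    return sorted(actions, key=lambda action: action.expected_lift, reverse=True)


# Hypothetical sample data, for illustration only.
actions = [
    Action("Add FAQ schema", effort_days=1.0, expected_lift=0.5),
    Action("Publish comparison page", effort_days=5.0, expected_lift=2.0),
    Action("Add llms.txt entry", effort_days=0.5, expected_lift=1.0),
]
for action in prioritize(actions):
    print(action.title)
```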

5. The 6 anti-hallucination rules

These hard rules gate every report before it goes to the customer. A draft that violates any of them is rewritten before delivery.

Absolute counts, not percentages, in headlines. Write "5 of 30 prompts mention you", not "You are mentioned 16.7% of the time." Counts are exact; percentages imply false precision.
Every claim cites a raw JSON line. "Perplexity cited tryprofound.com on prompt #14" must link to snapshots/2026-06-11/perplexity-014.json.
No "best" / "only" / "guaranteed." These are superlatives and absolute claims. We do not assert them about our own product, and we do not assert them about competitors in customer reports.
Disclose engine stochasticity. Every report includes a "this is a snapshot" disclaimer with the ±10% expected re-run variance.
Distinguish "no result" from "no data." "Perplexity returned no result for prompt #18" is different from "We did not call Perplexity for prompt #18." We never silently conflate them.
No automated decisions. We score; humans (us) interpret. We do not automatically decline to audit a brand because of low mention counts; we report the data and let the customer decide.
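Some of these rules lend themselves to a mechanical pre-check before human review. The sketch below is an illustration under assumptions, not the actual gate: `lint_headline` flags percentages and the three banned words in a headline, and `call_status` keeps "no result" separate from "no data". All names here are hypothetical.

```python
import re
from enum import Enum
from typing import Optional

PERCENTAGE = re.compile(r"\d+(?:\.\d+)?\s*%")
BANNED_WORDS = re.compile(r"\b(best|only|guaranteed)\b", re.IGNORECASE)


def lint_headline(headline: str) -> list[str]:
    """Return rule violations found in a headline; an empty list means none found."""
    violations = []
    if PERCENTAGE.search(headline):
        violations.append("percentage in headline; use an absolute count")
    for match in BANNED_WORDS.finditer(headline):
        violations.append(f"banned word: {match.group(1).lower()}")
    return violations


class CallStatus(Enum):
    RESULT = "result"
    NO_RESULT = "no result"  # the engine was called and returned nothing
    NO_DATA = "no data"      # the engine was never called for this prompt


def call_status(called: bool, answer: Optional[str]) -> CallStatus:
    """Classify one engine/prompt pair without conflating the two empty cases."""
    if not called:
        return CallStatus.NO_DATA
    if answer and answer.strip():
        return CallStatus.RESULT
    return CallStatus.NO_RESULT


print(lint_headline("5 of 30 prompts mention you"))          # []
print(lint_headline("You are mentioned 16.7% of the time"))  # one violation
print(call_status(called=True, answer=""))                   # CallStatus.NO_RESULT
```

A word match is only a prompt for review: "only" also appears in harmless phrasing, so a flagged headline still needs a human decision.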

6. What the audit does not claim

It does not predict future rankings. AI search engines change behaviour monthly.
It does not claim that acting on the recommendations will produce a specific lift. The Princeton 2023 paper reports visibility gains of up to 40% in a controlled setting; individual results vary widely.
It does not cover every AI engine that exists. We commit to 9; we will not advertise "12+" or "all major engines."
It does not score on ranking position within a single response (yet). v2 will introduce position-weighted scoring.

7. How to read a Clarivy report

Open Notebook 1 (Index) first. The score table tells you where you stand in one minute.
Read Notebook 4 (Quotables) next. The verbatim LLM answers are the most actionable artifact for content teams.
Read Notebook 5 (Strategy) last. The action list is prioritized by expected citation-rate lift, not by implementation ease.
Notebooks 2 (Intent) and 3 (Content) are reference material — read them when you have a specific question.

8. Versioning & change log

v1.0 (2026-06-11) — Initial 9-engine matrix, 30-prompt set, 5-notebook structure, 6 anti-hallucination rules.

Substantive changes will be announced on the blog and reflected here.