This is the live page for our 2026 citation-share study. We track a fixed set of 1,200 prompts across the five generative engines that matter for the brands we work with, and report on which domains are absorbing the citation share.
What we track
- Five engines. ChatGPT, Perplexity, Claude, Gemini, Microsoft Copilot. We sample each twice daily; we do not publish results from any engine for which we have fewer than 30 days of data.
- Twelve verticals. SaaS, e-commerce, fintech, healthtech, edtech, B2B services, media, travel, automotive, real estate, legal, government. 100 prompts per vertical.
- Three intent layers per prompt. Informational (“what is…”), comparative (“X vs Y”), transactional (“best X for Y under $Z”).
Every prompt is human-curated and re-reviewed for relevance every six weeks. The set is deliberately sized: small enough that every prompt can be defended individually, large enough to be statistically meaningful.
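To make the shape of the set concrete, here is a minimal sketch of how a prompt set with this taxonomy could be represented and validated. It is illustrative only; the class names, field names, and vertical slugs are ours, not the study's internal schema.

```python
from dataclasses import dataclass
from enum import Enum

class Intent(Enum):
    INFORMATIONAL = "informational"  # "what is X"
    COMPARATIVE = "comparative"      # "X vs Y"
    TRANSACTIONAL = "transactional"  # "best X for Y under $Z"

@dataclass(frozen=True)
class Prompt:
    vertical: str  # one of the 12 verticals, e.g. "fintech"
    intent: Intent
    text: str

VERTICALS = [
    "saas", "ecommerce", "fintech", "healthtech", "edtech", "b2b_services",
    "media", "travel", "automotive", "real_estate", "legal", "government",
]

def validate_prompt_set(prompts: list[Prompt]) -> None:
    """Enforce the study's shape: 12 verticals x 100 prompts = 1,200 total."""
    assert len(prompts) == 1200, "expected exactly 1,200 prompts"
    for vertical in VERTICALS:
        count = sum(p.vertical == vertical for p in prompts)
        assert count == 100, f"{vertical}: expected 100 prompts, got {count}"
```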
Methodology, in one paragraph
Each prompt is run cold, with no system prompt and no persona, against each engine’s default model on the public web tier. Citations are extracted from the response either via the engine’s structured citation output (where one exists) or via URL detection inside the answer body. Deduplication is done at the eTLD+1 (registrable-domain) level, so blog.example.com and example.com are counted as the same domain. The full methodology, including how we handle citation order and weighting, is documented at /methodology.
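For readers who want to see what the fallback path looks like in practice, here is a hedged sketch of URL detection plus eTLD+1 collapsing, assuming the open-source tldextract library; the regex and function name are illustrative, not the study's production pipeline.

```python
import re

import tldextract  # resolves a hostname to its registrable domain (eTLD+1)

# Loose URL matcher for answer bodies; stops at whitespace and common delimiters.
URL_RE = re.compile(r"https?://[^\s)\"'<>\]]+")

def extract_citation_domains(answer_text: str) -> set[str]:
    """Detect URLs in an answer body and collapse each to eTLD+1,
    so blog.example.com and example.com count as the same domain."""
    domains = set()
    for url in URL_RE.findall(answer_text):
        ext = tldextract.extract(url)
        if ext.registered_domain:  # e.g. "example.com", "example.co.uk"
            domains.add(ext.registered_domain)
    return domains
```

On the text "See https://blog.example.com/post and https://example.com/faq", this returns {"example.com"}: two citations, one domain after deduplication.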
Headline findings, current quarter
The numbers below are placeholders for the live study. We refresh them on the first business day of each month.
- Wikipedia is the most-cited source in 8 of 12 verticals, but its share has declined in 5 of those 8 over the past year as engines diversify.
- Reddit continues to climb. It is now among the top 5 most-cited sources in 9 of 12 verticals, up from 6 verticals last year.
- First-party brand domains capture roughly 12% of citations in transactional prompts, but under 4% in informational prompts. The gap is the GEO opportunity.
- Small specialist publishers (under 50,000 monthly visitors) capture about 18% of all citations in long-tail prompts. The “long-tail GEO” thesis holds.
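Numbers like the roughly-12% versus under-4% gap above reduce to simple share arithmetic over the deduplicated citation records. A minimal sketch, assuming each record is a (domain, intent) pair; the record shape is ours, not the study's storage format.

```python
from collections import Counter

def citation_share(citations: list[tuple[str, str]], intent: str) -> dict[str, float]:
    """Fraction of citations each eTLD+1 domain captures within one intent layer.

    `citations` holds one (domain, intent) pair per extracted citation.
    """
    in_layer = [domain for domain, layer in citations if layer == intent]
    total = len(in_layer)
    return {domain: n / total for domain, n in Counter(in_layer).items()} if total else {}
```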
How this fits the GEO model
The study is the empirical backbone of the GEO ranking: the data we use to validate which tools’ citation-discovery capabilities actually map to citation reality. Tools that show you what is being cited but not who is being cited are not solving the GEO problem; they are solving the AEO problem with extra steps.
How to cite us
If you cite the study, please link to this page directly. We update the headline numbers monthly; historical snapshots are available on request.
Adjacent reading
- For the practical playbook, see the publisher playbook.
- For the benchmarks programs should hold themselves to, see the GEO benchmarks.