Field note · Citation patterns · 25 May 2026 · 4 min read

Topical fit and list position decide which source LLMs cite first

When two pages compete inside a RAG pipeline, the one cited first wins the framing and the click. A new study isolates why.

252,000

Paired RAG citation trials run

arXiv study across six LLMs, 2025

Key takeaways

Topical relevance to the exact prompt is the single biggest driver of winning the first citation.
List position inside the retrieved candidate set creates measurable bias, even after the retriever ranks pages.
Explicit pricing and a recent timestamp give pages a further boost in citation odds.
Brand names were anonymized in the test, and better-matched content still won. Topical fit earns the first citation without any help from brand equity.
Long flagship PDFs lose to single-question web pages engineered around one query.

What happened

A new study posted to arXiv on competitive Generative Engine Optimization ran 252,000 paired trials across six large language models to answer a narrow but valuable question: when two retrieved pages compete inside a RAG pipeline, what makes one of them earn the first citation marker in the generated answer?

The researchers built a controlled two-document testbed, anonymized brand names, counterbalanced source order, and varied exactly one of 18 content factors per trial. Mixed-effects models then isolated which factors actually shift the odds of being cited first. The two biggest drivers were topical relevance and list position. Secondary boosts came from including explicit pricing and a recent timestamp.
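To make the design concrete, here is a minimal sketch of a counterbalanced paired-trial setup. It is not the study's code. The factor names are placeholders (this note does not list the 18 factors), the "model" is a random toy function, and both effect sizes are invented purely for illustration. The summary uses raw win rates rather than the mixed-effects models the researchers fitted.

```python
import random
from itertools import product

# Hypothetical placeholders: the note does not name the study's 18 factors.
FACTORS = [f"factor_{i:02d}" for i in range(1, 19)]


def build_trials(factors, repeats):
    """Counterbalanced design: each factor's treated page appears equally
    often in slot 0 (listed first) and slot 1 (listed second)."""
    return [
        {"factor": factor, "treated_slot": slot, "rep": rep}
        for factor, slot, rep in product(factors, (0, 1), range(repeats))
    ]


def simulate_first_citation(trial, rng, position_bias=0.10, factor_lift=0.05):
    """Toy stand-in for an LLM call; returns the slot cited first.
    Both effect sizes are invented for illustration, not taken from the study."""
    p_slot0 = 0.5 + position_bias
    p_slot0 += factor_lift if trial["treated_slot"] == 0 else -factor_lift
    return 0 if rng.random() < p_slot0 else 1


def summarize(trials, outcomes):
    """Raw win rates; the study itself fitted mixed-effects models instead."""
    n = len(trials)
    first_slot_wins = sum(1 for o in outcomes if o == 0)
    treated_wins = sum(1 for t, o in zip(trials, outcomes) if o == t["treated_slot"])
    return {
        "first_slot_win_rate": first_slot_wins / n,
        "treated_win_rate": treated_wins / n,
    }


rng = random.Random(0)
trials = build_trials(FACTORS, repeats=500)
outcomes = [simulate_first_citation(t, rng) for t in trials]
summary = summarize(trials, outcomes)
print(summary)
```

Because the treated page sits in each slot exactly half the time, the first-slot win rate reflects position bias alone and the treated-page win rate reflects the content change alone. That separation is what counterbalancing buys.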

Two findings deserve emphasis. First, position bias is real even after the retriever has done its job; being the first candidate handed to the model meaningfully increases the odds of being the first citation. Second, "relevance" here is not a vague concept. It is the degree to which a single page matches the specific intent of the prompt, not the breadth of a domain's authority.

Why it matters for your brand

For B2B brands, the first citation is not a vanity metric. In ChatGPT, Perplexity, Copilot, and Google's AI Mode, the first source named in an answer is the one that shapes the framing of everything that follows. It is also the source most likely to receive the click, the screenshot, and the secondary share inside Slack threads and procurement decks. If your page is cited third, you are wallpaper.

For financial services brands, this reframes the SEO playbook. A global bank's "Insights" hub built around evergreen thought leadership is likely to lose to a narrow, intent-matched explainer from a smaller competitor or even a regulator. If the prompt is "what is the capital requirement under Basel III endgame for regional US banks," the page that wins is the one whose title, opening paragraph, and structure mirror that exact question. Brand equity is no safety net here: the study anonymized brand names, and the better-matched page still won on topical fit alone.

For multilaterals and the UN system, the implication is uncomfortable. Reports from UNDRR, the World Bank, or the IMF are written as comprehensive documents covering many sub-questions in a single PDF. RAG retrievers chunk those PDFs, but the chunk that surfaces is rarely the one that maps cleanly to a user's narrow prompt. The fix is not to stop writing flagship reports. It is to publish a parallel layer of single-question web pages, each engineered around one query a policymaker or journalist would actually type. CGAP's microfinance research, for instance, stands a better chance of being cited when each finding has its own URL with a question-shaped title than when it sits inside a 60-page PDF.

For major industrial groups, the pricing and timestamp findings matter most. When Holcim, Siemens, or Schneider Electric publishes product or sustainability content without explicit figures and visible publication dates, it risks handing the first citation to a competitor or, worse, to a trade publication summarizing its own work. The model rewards specificity it can quote. A page that says "our low-carbon cement reduces embodied CO2 by 30% versus OPC, as of October 2025" is better placed than a page that says "significantly reduces emissions."
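A content team could screen drafts for this kind of quotable specificity before publishing. The sketch below is a rough heuristic, not anything taken from the study. The function name and the regular expressions are assumptions, and they only catch percentages, a few currency symbols, and two common date formats.

```python
import re

# Assumed patterns for illustration; extend them to fit your own content.
PERCENT = re.compile(r"\d+(?:\.\d+)?\s?%")
CURRENCY = re.compile(r"[$€£]\s?\d")
DATE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+\d{4}\b"
    r"|\b\d{4}-\d{2}-\d{2}\b"
)


def specificity_signals(text):
    """Report whether a passage carries a quotable figure and a visible date."""
    return {
        "has_figure": bool(PERCENT.search(text) or CURRENCY.search(text)),
        "has_date": bool(DATE.search(text)),
    }


specific = specificity_signals(
    "our low-carbon cement reduces embodied CO2 by 30% versus OPC, as of October 2025"
)
vague = specificity_signals("significantly reduces emissions")
print(specific)
print(vague)
```

A check like this only flags missing signals. It says nothing about whether the figures are accurate or whether a given model will actually cite the page.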

For philanthropic and policy institutions, list position is the sleeper finding. If your foundation's research consistently appears lower in the retriever's candidate list, you can lose first citations even when your content is better. That means the unglamorous work of structured data, canonical URLs, sitemap hygiene, and being crawled and indexed by the search systems that feed Perplexity and ChatGPT search now matters more than another op-ed in a legacy outlet. Distribution into the retrieval layer is the new media buy.
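As one illustration of that hygiene work, the sketch below builds a canonical link tag and a schema.org Article block with explicit publication dates for a single-question page. The helper names, the example.org URL, the headline, and the dates are all hypothetical. Nothing in the study shows that this markup changes retrieval rank, so treat it as standard housekeeping rather than a proven lever.

```python
import json


def article_jsonld(headline, url, date_published, date_modified):
    """Build a schema.org Article description with explicit dates."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "mainEntityOfPage": url,
        "datePublished": date_published,
        "dateModified": date_modified,
    }


def head_tags(headline, url, date_published, date_modified):
    """Return the canonical link and JSON-LD script for a page's <head>.
    Inputs are assumed to be trusted; this sketch does no HTML escaping."""
    payload = json.dumps(
        article_jsonld(headline, url, date_published, date_modified), indent=2
    )
    return (
        f'<link rel="canonical" href="{url}">\n'
        f'<script type="application/ld+json">\n{payload}\n</script>'
    )


# Hypothetical page details for illustration only.
tags = head_tags(
    headline="What share of adults use mobile money?",
    url="https://example.org/findings/mobile-money-adoption",
    date_published="2025-10-01",
    date_modified="2025-10-15",
)
print(tags)
```

Generating the tags from one function keeps the canonical URL and the structured data in sync, which is the main point of doing this in code rather than by hand.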

The signal in context

Most GEO research to date has measured outcomes: which domains get cited, how often Reddit appears, how Wikipedia dominates. This study is different because it measures mechanism. It tells you what to change on a page to win a citation against a specific competing page. That is the difference between an audit and an operating manual.

It also confirms what the better SEO teams already suspected: the LLM citation layer rewards a different content shape than Google's blue links did. Ten years of B2B content strategy optimized for topical authority, internal linking, and domain-level trust signals. Those still matter for retrieval. But once two candidates are in the model's context window, the page that wins is the one that reads like a direct answer to the prompt, carries concrete numbers, and shows a recent date. Comms teams that keep producing 2,500-word thought leadership without breaking it into intent-matched, dated, numerically specific child pages will keep losing first citations to competitors who do.

Source: arXiv: generative search engines

AI-authored, editor reviewed