100K prompts reveal a three-tier brand ladder in AI search
LLM search has not democratised discovery. A new large-scale study shows it has entrenched existing brand hierarchies, with precise and uncomfortable numbers.
Key takeaways
- Global household brands appear in 73% of relevant AI-generated answers; most other brands appear at materially lower rates.
- The gap is structural: LLMs weight cross-source citation frequency, not content volume or quality on owned channels.
- For B2B brands outside the top tier, the path to LLM visibility runs through third-party earned coverage in model-trusted outlets.
- Multilateral institutions risk losing citation share to simpler, better-indexed sources even when their authority is recognised.
- Visibility, accuracy, and framing are three separate problems; appearing frequently in LLM answers does not guarantee correct representation.
Global brands appear in 73% of AI-generated answers to relevant prompts. Everyone else is fighting over what remains.
That figure, drawn from a study published on arXiv covering 100,000-plus prompt responses across more than 100 brands tracked on the Ranqo platform between March and May 2026, is the sharpest evidence yet that LLM search has not democratised discovery. It has reinforced the existing hierarchy, then sharpened it.
The research maps what it calls a three-tier brand-stature ladder. Household names such as Stripe and Nike occupy the top rung, surfacing in nearly three-quarters of prompts for which they are genuinely relevant. The middle tier, established but not globally dominant brands, appears at materially lower rates. SMEs, direct-to-consumer brands, and early-stage startups form the base. The paper treats this tier as the hard case, precisely because no amount of content volume compensates for the stature gap.
The mechanism is no mystery
LLMs do not run a fresh web crawl each time a user asks a question. They retrieve and weight content based on signals baked in during training and retrieval augmentation: source authority, citation frequency across the open web, and corroboration across multiple trusted outlets. A brand that features regularly in Reuters, specialist trade press, and institutional reports arrives in the model's context already validated. A brand that exists primarily on its own domain, however well written that content is, does not.
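To make the corroboration idea concrete, here is a toy sketch in Python. It is not how any model actually scores sources, and it is not taken from the study. It only counts how many distinct trusted outlets mention each brand. The `TRUSTED_SOURCES` set, the `corroboration_count` function, and the sample brand names are all hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical set of outlets a model is assumed to trust (illustrative only).
TRUSTED_SOURCES = {"reuters", "trade_press", "institutional_report"}


def corroboration_count(mentions, trusted=TRUSTED_SOURCES):
    """Count the distinct trusted sources that mention each brand.

    mentions: iterable of (brand, source) pairs.
    Returns a dict mapping brand -> number of distinct trusted sources.
    """
    sources_by_brand = defaultdict(set)
    for brand, source in mentions:
        sources = sources_by_brand[brand]  # every brand gets an entry, even with zero trusted sources
        if source in trusted:
            sources.add(source)
    return {brand: len(sources) for brand, sources in sources_by_brand.items()}


# Made-up sample data: BrandA is covered by two trusted outlets,
# BrandB exists only on its own domain.
sample_mentions = [
    ("BrandA", "reuters"),
    ("BrandA", "trade_press"),
    ("BrandA", "reuters"),      # repeat coverage in the same outlet counts once
    ("BrandB", "own_domain"),
    ("BrandB", "own_domain"),
]
counts = corroboration_count(sample_mentions)
```

In this simplified picture, publishing more on an owned domain never moves the count. Only a mention in an additional trusted outlet does.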
This is why the 73% figure is a structural finding, not a campaign-cycle fluctuation. Stripe and Nike do not appear in three-quarters of relevant answers because they publish more blog posts. They appear because third-party sources consistently reference them, and the models have learned to treat that cross-source consensus as a proxy for reliability.
For a B2B buyer at a major industrial group or a policy officer at a multilateral institution, this has a direct procurement implication. When a staff member asks an AI assistant to list vendors for a given service category, or to summarise which organisations lead on a given policy issue, the answer is pre-filtered by stature. A mid-tier industrial supplier or a well-regarded but narrowly known NGO may produce better work than its globally recognised competitors. The model will not say so, because it has little to say about them at all.
Where the three tiers break down
The tier structure is not immutable. The arXiv paper's framing of SMEs and early-stage startups as the "hard case" implies a solvable problem, not a permanent exclusion. The mechanism that blocks them is the same mechanism that would admit them: consistent citation by sources the models already trust.
For financial services firms outside the global top tier, the implication is specific. A regional bank or a specialist asset manager is unlikely to reach 73% citation rates through content optimisation alone. The path runs through earned presence: coverage in Bloomberg, the Financial Times, or sector-specific outlets with demonstrable training-data weight; white papers cited by think tanks; regulatory filings referenced in academic literature. These are not new communications activities. They are, however, activities that must now be evaluated against a different metric: not traffic or share of voice in human search, but citation frequency in AI-generated answers.
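As a rough illustration of that metric, the sketch below computes the share of relevant AI-generated answers in which a brand is mentioned. It is a minimal example under stated assumptions, not the study's or the Ranqo platform's methodology: the `appearance_rate` function, the record fields `relevant_brands` and `answer`, and the naive case-insensitive substring match are all hypothetical simplifications.

```python
def appearance_rate(responses, brand):
    """Share of relevant responses whose answer text mentions the brand.

    responses: list of dicts with keys
        'relevant_brands' (set of brands the prompt is relevant to) and
        'answer' (the AI-generated answer text).
    Returns a float between 0 and 1, or None if no response is relevant.
    """
    relevant = [r for r in responses if brand in r["relevant_brands"]]
    if not relevant:
        return None  # the rate is undefined without relevant prompts
    # Assumption: a plain case-insensitive substring match counts as a mention.
    hits = sum(1 for r in relevant if brand.lower() in r["answer"].lower())
    return hits / len(relevant)


# Made-up sample data for a hypothetical brand, "Acme Bank".
sample_responses = [
    {"relevant_brands": {"Acme Bank"}, "answer": "Options include Acme Bank and others."},
    {"relevant_brands": {"Acme Bank"}, "answer": "acme bank is one regional lender."},
    {"relevant_brands": {"Acme Bank"}, "answer": "Consider the large global banks."},
    {"relevant_brands": {"Acme Bank"}, "answer": "Acme Bank offers this service."},
    {"relevant_brands": {"Other Co"}, "answer": "Other Co leads this category."},
]
rate = appearance_rate(sample_responses, "Acme Bank")
```

Tracked over time and against a fixed prompt set, a figure like this is what would sit alongside traffic and share of voice in a communications report. A production version would need more careful mention detection than substring matching, for example to handle brand aliases and ambiguous names.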