LLMs are citing synthetic sources in answers
When the citation pool includes AI-generated pages, getting cited is no longer the KPI. The company you keep in the source list is.
Key takeaways
- ChatGPT, Copilot, Gemini, and Perplexity all cite AI-generated sources alongside authoritative ones.
- The arXiv audit covered 712 real-world queries across domains of public importance, including politics.
- Brand risk now includes citation adjacency: who appears next to you in the source list.
- Machine-readable provenance (structured data, clear authorship, canonical URLs) is becoming a competitive asset.
- Expect engines to add provenance filters; brands that prepare now will survive the cut.
What happened
Per the arXiv paper "Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources," four major generative search engines (ChatGPT, Copilot, Gemini, and Perplexity) are citing AI-generated web pages as authoritative sources in their answers. The audit ran 712 real-world human-generated queries across domains of public importance, including politics, and found that none of the four engines reliably filters synthetic content out of its citation set.
The implication is direct. When a CFO asks Copilot about counterparty risk, or a UN policy officer asks Perplexity about a multilateral funding mechanism, the source chain underneath the answer may include pages written by another LLM. In that case, the engine is treating machine output as evidence on par with primary reporting, regulator filings, or peer-reviewed work.
The arXiv authors frame this as a user-harm problem. We read it as a brand-authority problem too. The pages winning citations are not the pages with the strongest provenance. They are the pages that are easiest to retrieve.
Why it matters for your brand
If generative engines cannot distinguish synthetic sources from authoritative ones, the floor of the citation pool has dropped. Your competition for an LLM citation is no longer just the FT, Reuters, the World Bank blog, and a handful of trade publications. It is also a long tail of AI-spun pages optimized for retrieval, with no editor, no byline, and no institutional accountability. That changes the economics of earned visibility.
For financial services, this is a compliance-adjacent problem. When ChatGPT answers a question about a structured product or a sanctions regime by blending a Bloomberg citation with a synthetic SEO page, the bank named in the answer inherits the credibility risk. Communications teams at JPMorgan, HSBC, and the major asset managers should be auditing which sources appear alongside their own brand in LLM answers, not just whether they are cited. Adjacency is reputation now.
For multilaterals and the UN system, the stakes are higher because the topics (climate finance, disaster risk, development aid) attract exactly the kind of low-cost synthetic content the audit flagged. If a query on Sendai Framework progress returns a Gemini answer that cites UNDRR alongside an AI-generated explainer with subtly wrong figures, the institution loses control of its own evidence base. The defensive move is to flood the retrieval layer with structured, machine-readable primary data so that when the model picks sources, the official one is the easiest to ingest.
For industrial groups, the risk is product and standards misinformation. A purchasing manager researching cement specifications, ISO conformance, or supply chain emissions methodology is now one prompt away from an answer built partly on synthetic content. Holcim, Siemens, and the standards bodies (ISO, IEEE) need to treat technical documentation as a distribution channel, not an archive. If the canonical specification is locked in a PDF behind a login, an AI-generated summary of it is likely to win the citation instead.
For philanthropic and policy institutions, the brand-building model shifts. Thought leadership built on long-form PDFs and gated reports does not compete well against synthetic content optimized for crawlability. The Gates Foundation, the Rockefeller Foundation, and the Open Society Foundations have always relied on the assumption that quality wins attention. In the generative search layer, retrievability wins first, and quality is evaluated only if the page makes it into the candidate set.
The strategic conclusion: provenance is now a content asset. Brands that publish with clear authorship, structured metadata, verifiable data, and a consistent institutional URL pattern will be easier for the next generation of engines to prefer once they start filtering synthetic sources. Brands that publish in formats indistinguishable from AI output will be filtered alongside the spam.
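To make "structured metadata" concrete, here is a minimal Python sketch that emits schema.org Article markup as JSON-LD, stating the author, publisher, dates, and canonical URL in one machine-readable block. The function names, the example values, and the example.org URL are illustrative assumptions, not anything the audit prescribes. The audit does not show that any engine currently weights this markup, and markup alone does not prove authorship. It only makes the claim easy to read.

```python
import json
from datetime import date


def build_article_jsonld(headline, author_name, publisher_name,
                         canonical_url, published, modified=None):
    """Return a schema.org Article description as a JSON-LD string.

    All argument values are supplied by the publisher; nothing here
    verifies that they are true.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "publisher": {"@type": "Organization", "name": publisher_name},
        "datePublished": published.isoformat(),
        # Fall back to the publication date if the page was never revised.
        "dateModified": (modified or published).isoformat(),
        # Point both fields at the one canonical URL for the page.
        "mainEntityOfPage": canonical_url,
        "url": canonical_url,
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script tag used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical example values, for illustration only.
example_jsonld = build_article_jsonld(
    headline="Annual disaster risk data release",
    author_name="Jane Doe",
    publisher_name="Example Institution",
    canonical_url="https://example.org/reports/annual-data-release",
    published=date(2025, 1, 15),
)
example_tag = as_script_tag(example_jsonld)
```

The resulting tag would sit in the page head next to a matching canonical link, so that the byline, the institution, and the URL all agree wherever a crawler looks.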
The signal in context
This audit lands in the middle of a broader argument about whether generative search engines are converging on a quality floor or sinking below one. Earlier work on citation accuracy in Perplexity and Bing Chat found high rates of unsupported claims and misattributed sources. The new finding extends that concern one layer deeper: even when a citation is technically correct (the page exists and contains the claim), the page itself may have been written by another model, creating a closed loop of AI citing AI.
For senior marketers, the practical reading is that "getting cited" is no longer a sufficient KPI. The composition of the citation set matters. A brand cited next to three synthetic pages is not winning the same battle as a brand cited next to Reuters and the OECD. Expect the engines to respond, eventually, with provenance signals (C2PA, source-quality scoring, publisher allowlists). The brands that prepare for that filter now, by making their own provenance machine-legible, will be the ones still cited after it ships.
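One way to turn citation-set composition into a number a team can track is sketched below in Python. It assumes you have already logged the URLs each engine cited per answer, and that you maintain your own list of domains judged to be synthetic. That judgment is the hard part and is not solved here. The function names, the report fields, and the .example domains are all hypothetical.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def adjacency_report(answers, brand_domain, flagged_domains):
    """Summarize which domains are cited alongside the brand.

    answers: list of answers, each a list of cited URLs.
    flagged_domains: set of domains you have judged to be synthetic.
    """
    answers_citing_brand = 0
    neighbours = 0
    flagged_neighbours = 0
    for urls in answers:
        domains = {domain_of(u) for u in urls}
        if brand_domain not in domains:
            continue  # only answers that cite the brand count
        answers_citing_brand += 1
        others = domains - {brand_domain}
        neighbours += len(others)
        flagged_neighbours += len(others & flagged_domains)
    share = flagged_neighbours / neighbours if neighbours else 0.0
    return {
        "answers_citing_brand": answers_citing_brand,
        "neighbour_domains": neighbours,
        "flagged_neighbours": flagged_neighbours,
        "flagged_share": share,
    }


# Hypothetical logged citations for three answers.
answers = [
    ["https://www.bank.example/report", "https://newswire.example/x",
     "https://spun-pages.example/a"],
    ["https://bank.example/faq", "https://spun-pages.example/b",
     "https://another-spun.example/c"],
    ["https://newswire.example/y"],
]
report = adjacency_report(
    answers,
    brand_domain="bank.example",
    flagged_domains={"spun-pages.example", "another-spun.example"},
)
print(report["flagged_share"])  # 0.75
```

In this toy data the brand is cited in two answers, and three of its four neighbouring domains are flagged, giving a share of 0.75. Tracked over time and per engine, that share says more about citation quality than a raw citation count does.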