Google unveils Gemini Omni and a 24/7 cloud agent at I/O
Omni is positioned to power AI Overviews, and Spark runs continuously in the cloud. Video, audio, and structured data become primary citation assets.
Key takeaways
- Gemini Omni is positioned as the default model behind Google's AI Overviews and AI Mode.
- Multimodal retrieval means video, audio, and image assets become primary citation sources, not secondary SEO assets.
- Gemini Spark introduces a persistent cloud agent that evaluates brands continuously, not per-query.
- Industrial and financial brands need structured, machine-readable disclosures or they lose to competitors who have them.
- First-mover advantage in agent memory is real and underpriced for philanthropic and policy institutions.
What happened
Per The Decoder, Google used its I/O 2026 developer conference to ship three things that matter for anyone tracking LLM visibility: Gemini 3.5 Flash (a faster mid-tier model), Gemini Omni (a fully multimodal model handling text, image, audio, and video in one pass), and Gemini Spark, a personal agent that runs continuously in the cloud rather than only when a user opens a chat window.
The Gemini app also got a redesign built around Spark. The agent persists. It works while the user sleeps, monitors inboxes, drafts replies, runs research tasks, and surfaces results when the user returns. The Decoder frames Spark as Google's answer to OpenAI's operator-style agents and Anthropic's Claude computer-use mode, but with deeper integration into Workspace, Search, and Android.
Google emphasised two points at the keynote: Gemini 3.5 Flash runs roughly 40% faster than the 2.5 generation at lower inference cost, and Omni is positioned as the default model behind AI Overviews and AI Mode in Search. The second point is the one to underline.
Why it matters for your brand
If Omni becomes the model powering AI Overviews and AI Mode, every B2B brand competing for visibility in Google's answer surfaces is now being read by a multimodal system. That changes what counts as a citable asset. A chart embedded as an image, a webinar recording, a podcast transcript, a product demo video: these stop being secondary SEO assets and start becoming primary inputs the model can quote from directly. Brands that have treated video and audio as awareness plays, with no structured transcripts or schema, are about to discover those assets are invisible to the layer that matters most.
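As an illustration of what a "structured transcript" can look like in practice, here is a minimal sketch that attaches a transcript to a video using schema.org's VideoObject type, serialised as JSON-LD. The helper name video_jsonld, the example.com URLs, and the sample text are hypothetical placeholders. Whether Omni reads this markup at all is an assumption; nothing in Google's announcement, as reported, confirms it.

```python
import json


def video_jsonld(name, description, upload_date, content_url,
                 thumbnail_url, transcript):
    """Build a schema.org VideoObject as a JSON-LD dictionary.

    video_jsonld is a hypothetical helper for this sketch, not part
    of any library. The property names come from schema.org.
    """
    return {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "uploadDate": upload_date,        # ISO 8601 date
        "contentUrl": content_url,
        "thumbnailUrl": thumbnail_url,
        "transcript": transcript,         # full spoken text of the video
    }


# Placeholder values for illustration only.
markup = video_jsonld(
    name="Product demo: example walkthrough",
    description="A short walkthrough of an example product.",
    upload_date="2026-01-15",
    content_url="https://example.com/videos/demo.mp4",
    thumbnail_url="https://example.com/videos/demo.jpg",
    transcript="Welcome to the demo. In this video we cover ...",
)

# The JSON string is what would go inside a
# <script type="application/ld+json"> tag on the video's page.
print(json.dumps(markup, indent=2))
```

The point of the sketch is that the transcript travels with the video as plain text, so a retrieval system does not have to transcribe the audio itself to quote it.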
For financial services brands, the Spark agent is the bigger shift. A persistent cloud agent that monitors news, filings, and research on behalf of a user means the buyer journey for a corporate banking or asset management decision now includes an always-on intermediary. The agent will summarise your latest report whether or not the prospect ever visits your site. If your IR page, your thought leadership PDFs, and your analyst commentary are not structured for machine retrieval, a competitor's may well be, and Spark will surface theirs instead. JPMorgan, BlackRock, and the Big Four already publish with this in mind. Most mid-tier financial brands do not.
For multilaterals and UN-system communicators, Omni's multimodal default rewires how policy content gets surfaced. A UNDRR briefing video, a CGAP webinar, a World Bank infographic: all of these become directly quotable in a Gemini answer about climate finance or financial inclusion, provided the model can ingest them cleanly. That means captions, transcripts, alt text, and clear on-screen attribution stop being accessibility checkboxes and become citation infrastructure. The institutions that invest in this win the policy framing in AI answers. The ones that do not will watch think tanks and consultancies take the citation slot.
For major industrial groups (Holcim, Siemens, ArcelorMittal, and the cement, steel, and energy majors), Spark introduces a new procurement risk. A buyer's agent runs in the background, evaluates supplier sustainability reports, ESG disclosures, and product specs over weeks, and produces a shortlist. If your sustainability data is locked in a PDF the agent has to parse, while a competitor offers the same figures as a structured data feed, you lose before the human ever looks. The content strategy implication: shift from annual report drops to continuous structured publishing.
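To make "structured data feed" concrete, the sketch below publishes a handful of sustainability metrics as a schema.org Dataset in JSON-LD, with each figure expressed as a PropertyValue. This is one possible shape, not a format Spark is known to require. The Metric class, the disclosure_jsonld helper, the company name, and every number are made-up placeholders.

```python
import json
from dataclasses import dataclass


@dataclass
class Metric:
    """One disclosed figure. Hypothetical structure for this sketch."""
    name: str
    value: float
    unit: str


def disclosure_jsonld(publisher, period, modified, metrics):
    """Build a schema.org Dataset describing a set of disclosed metrics."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": f"{publisher} sustainability metrics {period}",
        "creator": {"@type": "Organization", "name": publisher},
        "temporalCoverage": period,   # reporting period
        "dateModified": modified,     # bump on every republish
        "variableMeasured": [
            {
                "@type": "PropertyValue",
                "name": m.name,
                "value": m.value,
                "unitText": m.unit,
            }
            for m in metrics
        ],
    }


# Placeholder company and placeholder numbers, for illustration only.
feed = disclosure_jsonld(
    publisher="Example Cement AG",
    period="2025",
    modified="2026-01-15",
    metrics=[
        Metric("Scope 1 CO2 emissions", 123.0, "kt CO2e"),
        Metric("Share of renewable electricity", 45.0, "percent"),
    ],
)

print(json.dumps(feed, indent=2))
```

Regenerating this feed whenever a figure changes, and updating dateModified each time, is one way to move from an annual report drop to continuous structured publishing.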
For philanthropic and policy institutions, the redesigned Gemini app is the channel to watch. When a foundation programme officer or a policy researcher uses Spark to track a field over months, the institutions whose content the agent learns to trust early become the default sources. First-mover advantage in agent memory is real and underpriced.
The signal in context
Google has now joined OpenAI and Anthropic in shipping a persistent agent rather than a session-based chatbot. That is the meaningful trend, not the model version numbers. The shift from "user opens chat, asks question, closes tab" to "agent runs continuously, returns with synthesis" changes the unit of competition for brands. You are no longer optimising for a single query. You are optimising for repeated, longitudinal evaluation by a system that builds a model of which sources it trusts.
Omni's multimodal default in Search is the second signal. Through 2025, AI Overviews drew primarily from text. A multimodal retrieval layer means the citation pool widens to include video, audio, and image-native publishers. Brands with strong YouTube, podcast, or visual-explainer libraries gain ground. Brands that publish only blog posts and PDFs lose relative share. The publishing mix that worked for SEO in 2022 is not the publishing mix that wins citations in Gemini in 2026.