Industry report · Model changes · 2 June 2026 · 3 min read

Microsoft's MAI models challenge OpenAI dependency

Two in-house models, a sparsity bet, and the start of Microsoft's renegotiation with OpenAI. Brand visibility just got more fragmented.

35B: MAI-Thinking-1 active parameters (Microsoft announcement, June 2026)

Key takeaways

Microsoft launched two in-house frontier models, MAI-Thinking-1 and MAI-Code-1-Flash, reducing its reliance on OpenAI.
Both use mixture-of-experts designs with low active-parameter counts, optimising for inference cost at Copilot scale.
Copilot and VS Code queries will increasingly route to Microsoft's own weights, not GPT-4.
Brands optimised for ChatGPT visibility may not surface in MAI-backed Microsoft surfaces; training corpora differ.
AI visibility is now a portfolio problem across at least four major model families.

Microsoft has spent roughly $14bn building out OpenAI as its house intelligence. On Monday it quietly began replacing it. The company announced two in-house large language models, MAI-Thinking-1 and MAI-Code-1-Flash, the first serious signal that Redmond intends to run on its own silicon and its own weights.

Simon Willison's Weblog flagged the launch, noting MAI-Thinking-1 is a 1-trillion-parameter reasoning model with just 35bn active parameters, available to "select early partners". MAI-Code-1-Flash, at 137bn parameters with 5bn active, is being pushed straight into GitHub Copilot and Visual Studio Code. Microsoft claims MAI-Thinking-1 beats Anthropic's Claude Sonnet 4.6 in blind human side-by-sides. Treat that claim with the scepticism vendor benchmarks deserve, but the architectural choice is the real story.

The sparsity bet

Both models are mixture-of-experts designs with strikingly low active-parameter counts. Because only the active experts are computed for each token, a model with 35bn active parameters costs a fraction of a dense frontier system to serve, and a coding model with 5bn active parameters is cheap enough to run at Copilot's scale without the gross-margin pain that has dogged GitHub's flagship product since its launch. Microsoft is not trying to out-scale OpenAI. It is trying to out-economise it.
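To put rough numbers on the sparsity claim, here is a minimal Python sketch using only the parameter counts quoted above. It assumes the common rule of thumb of about 2 FLOPs per active parameter per generated token, and it ignores attention cost, expert-routing overhead and the memory needed to hold the full weights, so read it as an order-of-magnitude illustration and not as Microsoft's serving economics. The function names are made up for this example.

```python
# Parameter counts as reported in this article; not independently verified.
MODELS = {
    "MAI-Thinking-1": {"total": 1_000_000_000_000, "active": 35_000_000_000},
    "MAI-Code-1-Flash": {"total": 137_000_000_000, "active": 5_000_000_000},
}


def active_fraction(total: int, active: int) -> float:
    """Share of the model's parameters used for any single token."""
    return active / total


def forward_flops_per_token(active: int) -> int:
    """Rule-of-thumb estimate (an assumption): ~2 FLOPs per active parameter."""
    return 2 * active


for name, p in MODELS.items():
    frac = active_fraction(p["total"], p["active"])
    flops = forward_flops_per_token(p["active"])
    print(f"{name}: {frac:.1%} of parameters active, ~{flops:.1e} FLOPs per token")
```

Under these assumptions, each model does per-token compute equal to roughly 3.5 to 3.7 per cent of what a hypothetical dense model of the same total size would need. All of the weights still have to sit in memory, which is why the saving shows up in compute rather than in hardware footprint.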

That matters because the unit economics of generative AI are still upside down. OpenAI loses money on inference at consumer prices. Microsoft, which pays OpenAI for API access and then resells the output through Copilot, has been absorbing that gap. Owning the model collapses the stack. It also gives Microsoft something it has lacked since 2019: leverage in its next negotiation with Sam Altman.

What this changes for brands trying to be cited

The practical question for marketers is which model answers the query when someone asks Copilot, Bing's chat, or a Microsoft 365 assistant about your sector. Until now, the honest answer was "some version of GPT-4". Soon it will be "depends on the surface, the task, and the cost ceiling". Coding queries in VS Code will route to MAI-Code-1-Flash. Reasoning-heavy enterprise queries may route to MAI-Thinking-1. Consumer chat may still hit OpenAI. Each model has its own training cut-off, its own retrieval behaviour, and its own citation preferences.

For financial-services and industrial brands that have spent the last eighteen months optimising for visibility in ChatGPT, this fragments the target. A page that surfaces cleanly in GPT-4o may be invisible to MAI-Thinking-1, which was trained on a different corpus with different weightings for authoritative sources. Multilaterals and policy institutions, whose content tends to be cited heavily by models that index .org and .int domains, should expect variance. Microsoft has not disclosed its training data, and early in-house models from hyperscalers have historically over-indexed on the company's own properties (LinkedIn, GitHub, MSN) before broadening out.

The defensible move is to stop treating "AI visibility" as a single channel. Brands serious about citation share will need to test prompts across at least four surfaces (ChatGPT, Gemini, Claude, and now Copilot's MAI-backed responses) and accept that the answers will diverge. The era of one model to rule them all, if it ever existed, ended this week.
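One way to make "citation share" concrete is to collect answers to the same prompts from each surface and count how often the brand appears. The Python sketch below assumes the answers have already been gathered by hand or by whatever tooling a team uses; it calls no vendor API. The function name, the brand "Example Bank" and the sample answers are invented placeholders, not real model output.

```python
from collections import defaultdict


def citation_share(responses, brand):
    """Return, per surface, the fraction of answers that mention the brand.

    responses: iterable of (surface, answer_text) pairs.
    Matching is a simple case-insensitive substring check, which is a
    simplification: it misses abbreviations and misspellings.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for surface, text in responses:
        totals[surface] += 1
        if brand.lower() in text.lower():
            hits[surface] += 1
    return {surface: hits[surface] / totals[surface] for surface in totals}


# Invented placeholder answers for illustration only.
sample = [
    ("ChatGPT", "Analysts often cite Example Bank and two rivals."),
    ("ChatGPT", "Example Bank publishes a widely used index."),
    ("Gemini", "Several lenders operate in this sector."),
    ("Claude", "Example Bank is one commonly referenced source."),
    ("Copilot", "Sources include LinkedIn posts and GitHub repositories."),
]

shares = citation_share(sample, "Example Bank")
for surface, share in shares.items():
    print(f"{surface}: {share:.0%}")
```

Tracked over time and over a fixed prompt set, a per-surface table like this shows where answers diverge, which a single blended score would hide.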

The dependency unwind

Microsoft's strategic anxiety has been visible for more than two years. It hired Mustafa Suleyman from Inflection in March 2024 along with most of his team, paid roughly $650m for the privilege, and gave him the mandate to build exactly what shipped on Monday. The Suleyman unit, Microsoft AI, now has seven models in its MAI family. The OpenAI partnership officially runs to 2030, but the renegotiation has already begun, and every model Microsoft ships in-house weakens OpenAI's hand.

For the broader market, the implication is that the frontier is getting more crowded, not less. A year ago the serious contenders were OpenAI, Anthropic, Google and Meta. Now add Microsoft proper, xAI, DeepSeek and Alibaba's Qwen. Each trains on a different mixture. Each cites differently. Brand visibility in LLM answers is becoming a portfolio problem, not a search-engine-optimisation problem, and the institutions that grasp the distinction first will compound the advantage.

Microsoft has just answered the cost question: it would rather compete with its largest investee than keep paying rent.

Source: Simon Willison's Weblog

AI-authored, editor reviewed