Microsoft Built Its Own AI to Replace OpenAI: The MAI Models Explained (Build 2026)

Microsoft built seven AI models from scratch without using any OpenAI technology, and two of them just beat the OpenAI and Anthropic equivalents they were designed to replace.
At Build 2026 on June 2, Microsoft announced a family of models developed entirely in-house under the Microsoft AI (MAI) brand. This is the first time Microsoft has shipped frontier models it built independently; everything before this, from Azure OpenAI to Copilot, ran on OpenAI models. The strategic motivation is obvious: reduce dependence on a supplier that is also a competitor, and drive down inference costs. But the technical result is more interesting than the strategy. MAI-Code-1-Flash beats Claude Haiku 4.5 by 16 points on SWE-Bench Pro. MAI-Thinking-1 scored 97% on AIME 2025 and was preferred over Claude Sonnet 4.6 in blind evaluations. After fine-tuning for enterprise clients like McKinsey, MAI-Thinking-1 outperformed GPT-5.5 at 10x better cost efficiency. These are not purely internal benchmarks: the blind comparison against Claude Sonnet 4.6 was run by an independent evaluator. Microsoft built real models.
Key takeaways
- Microsoft announced seven MAI models at Build 2026 on June 2, 2026, built entirely without OpenAI technology — the first genuine in-house frontier models from Microsoft.
- MAI-Code-1-Flash is an agentic coding model that outperforms Claude Haiku 4.5 by 16 points on SWE-Bench Pro (51.2% vs 35.2%) and solves harder coding problems with 60% fewer tokens. It is priced at $0.75 per million input tokens.
- MAI-Thinking-1 scores 97.0% on AIME 2025 and 94.5% on AIME 2026, was preferred over Claude Sonnet 4.6 in independent blind evaluations, and outperformed GPT-5.5 at 10x better cost efficiency after enterprise fine-tuning.
- MAI-Code-1-Flash is native to GitHub Copilot and VS Code, with deep integration into the Microsoft developer stack — it is not just available via API but designed to run inside the tools developers already use.
- For businesses already on Microsoft 365 and Azure, the MAI models represent a cost reduction opportunity in AI-powered workflows without switching tools.
Why Microsoft Built Its Own Models
Microsoft's AI journey until this point was essentially a proxy relationship. Azure OpenAI gave enterprise customers access to GPT models through Microsoft's infrastructure. GitHub Copilot ran on OpenAI code models. Copilot in Microsoft 365 ran on GPT-4 and later GPT-5 variants. The cost of that arrangement accrued at scale — for every completion, reasoning call, or agent action across Microsoft's enormous enterprise customer base, a percentage went to OpenAI.
That arrangement also created strategic risk. OpenAI and Microsoft are partners, but OpenAI is also building products — ChatGPT Enterprise, the Operator API — that compete with Microsoft's own offerings. Owning the underlying models is the obvious answer, but building frontier-capable models is genuinely hard. The MAI announcement at Build 2026 suggests Microsoft either solved that problem or got close enough that it no longer matters.
The decision to train entirely without OpenAI technology is notable because it resolves a specific risk: dependency on a relationship that could change. Microsoft's announcement described the training data as "clean, traceable, and enterprise-grade, without distillation from third-party models" — a pointed statement about training provenance that has legal implications as well as technical ones.
MAI-Code-1-Flash: The Copilot Model
MAI-Code-1-Flash is built specifically for agentic coding tasks within the GitHub Copilot and VS Code ecosystem. With 5 billion active parameters, it sits in the same tier as Haiku-class models — fast, efficient, designed for high-volume inference — but posts significantly stronger results.

On SWE-Bench Pro, MAI-Code-1-Flash scores 51.2% versus Claude Haiku 4.5's 35.2% — a 16-point gap that is meaningful rather than marginal. The model also includes an adaptive solution length control that adjusts reasoning depth based on task complexity: simple requests get concise output, harder problems get more reasoning budget. The claimed result is 60% fewer tokens on SWE-Bench Verified tasks compared to baseline, which translates directly to lower inference cost per task.
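To see why a 16-point gap is large at this score level, it helps to separate the absolute difference from the relative one. The short sketch below uses only the two SWE-Bench Pro scores quoted above; the function name is illustrative and not part of any benchmark tooling.

```python
def score_gap(new: float, old: float) -> tuple[float, float]:
    """Return (absolute gap in points, relative improvement as a fraction)."""
    absolute = new - old
    relative = absolute / old
    return absolute, relative


# SWE-Bench Pro scores as quoted in this article.
absolute, relative = score_gap(51.2, 35.2)
print(f"{absolute:.1f} points absolute, {relative:.1%} relative")
# prints "16.0 points absolute, 45.5% relative"
```

Read this way, the smaller model resolves roughly 45% more benchmark tasks than the model it is compared against, which is a different order of claim than a one- or two-point lead.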
Pricing is $0.75 per million input tokens. For comparison, Claude Haiku 4.5 (the 20251001 snapshot) is priced at $0.80 per million input tokens on Anthropic's API. MAI-Code-1-Flash is slightly cheaper and posts significantly better coding benchmarks, which is the entire value proposition in a sentence.
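A quick way to sanity-check the price difference is to run it against a workload. The sketch below uses the two input prices quoted above and a hypothetical volume of 2 billion input tokens per month; the volume and the helper name are assumptions for illustration, and output-token pricing is not covered because the article does not quote it.

```python
# USD per million input tokens, as quoted in this article.
PRICE_PER_MILLION_INPUT = {
    "MAI-Code-1-Flash": 0.75,
    "Claude Haiku 4.5": 0.80,
}


def input_cost(model: str, input_tokens: int) -> float:
    """Input-token cost in USD for a given token volume."""
    return PRICE_PER_MILLION_INPUT[model] * input_tokens / 1_000_000


monthly_tokens = 2_000_000_000  # hypothetical workload, not from the article

for model in PRICE_PER_MILLION_INPUT:
    print(f"{model}: ${input_cost(model, monthly_tokens):,.2f}")

saving = 1 - PRICE_PER_MILLION_INPUT["MAI-Code-1-Flash"] / PRICE_PER_MILLION_INPUT["Claude Haiku 4.5"]
print(f"Input price saving: {saving:.2%}")
```

On these figures the input price difference is about 6%, so the list price alone is a modest saving. Most of the case for switching rests on the benchmark gap and on any reduction in tokens used per task.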
In the GitHub Copilot token-based billing model, MAI-Code-1-Flash is priced below Claude Haiku 4.5, making it the default cost-efficient choice for individual users and teams already on Copilot. It is available in VS Code's model picker now and rolling out to the auto-picker as the default selection.
MAI-Thinking-1: The Reasoning Model
MAI-Thinking-1 is Microsoft's frontier reasoning model, designed for complex multi-step problems where accuracy matters more than speed. The benchmark numbers are strong: 97.0% on AIME 2025 and 94.5% on AIME 2026, the latter being a harder set of competition math problems published this year.
The more practically interesting figure is the enterprise fine-tuning result. Microsoft demonstrated that after fine-tuning MAI-Thinking-1 for McKinsey, the model outperformed GPT-5.5 with what Microsoft described as 10x better cost efficiency. This is a significant claim. It suggests that for domain-specific enterprise use cases, a fine-tuned MAI-Thinking-1 can deliver GPT-5.5-level output at a fraction of GPT-5.5's cost, which runs around $5 per million input tokens.
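As a rough illustration of what the 10x figure would imply, the sketch below divides the approximate GPT-5.5 input price quoted above by the claimed multiple. It assumes "cost efficiency" means cost for comparable output quality; Microsoft's exact definition is not given here, so treat the result as an order-of-magnitude reading and not a published price.

```python
# Approximate GPT-5.5 input price quoted in this article (USD per million tokens).
GPT_5_5_INPUT_PRICE = 5.00

# Microsoft's claimed cost-efficiency multiple for the fine-tuned model.
EFFICIENCY_MULTIPLE = 10

# Assumption: a 10x efficiency gain means one tenth of the cost
# for comparable work. The metric's real definition may differ.
implied_equivalent_cost = GPT_5_5_INPUT_PRICE / EFFICIENCY_MULTIPLE

print(f"Implied cost for comparable work: ${implied_equivalent_cost:.2f} per million input tokens")
# prints "Implied cost for comparable work: $0.50 per million input tokens"
```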
Independent evaluator Surge ran blind side-by-side comparisons between MAI-Thinking-1 and Claude Sonnet 4.6. MAI-Thinking-1 was preferred in those evaluations. Microsoft is careful not to claim MAI-Thinking-1 beats GPT-5.5 across the board — the position is that for targeted enterprise applications, fine-tuning closes that gap dramatically.
What This Means for SaaS Teams on Azure
If your product runs on Azure, the MAI models represent a cost reduction lever that does not require any migration. MAI-Code-1-Flash is available through Azure AI Foundry and natively integrated into GitHub Copilot. MAI-Thinking-1 is available via Azure AI Foundry for fine-tuning and inference. The pricing dynamics favor organizations running high volumes of AI completions — every token saved at the Copilot level accumulates across a developer organization.
For SaaS teams that build on top of Azure OpenAI, the MAI models introduce a new calculation: for agentic coding tasks, MAI-Code-1-Flash is now cheaper and potentially better than the Haiku-equivalent model they might be using. The integration path is Azure AI Foundry, and the API surface is familiar.
The broader strategic implication is that Microsoft's enterprise AI stack is becoming more self-contained. Copilot in 365, GitHub Copilot, Azure AI, and Bing all run on the same platform — and that platform now has a frontier model tier that Microsoft developed independently. That changes the supplier dynamic for enterprise AI buyers.
Frequently asked questions
Does Microsoft still use OpenAI models in Copilot?
For existing Copilot products, Microsoft continues to use OpenAI models where MAI models are not yet deployed. MAI-Code-1-Flash is now available in GitHub Copilot and rolling out as a default option. MAI-Thinking-1 is available for fine-tuning through Azure AI Foundry. The transition from full OpenAI dependency to a mixed stack is underway but not complete as of Build 2026.
How does MAI-Code-1-Flash compare to Claude Haiku 4.5 on price?
MAI-Code-1-Flash is priced at $0.75 per million input tokens on the Azure API. Claude Haiku 4.5 is priced at approximately $0.80 per million input tokens. In GitHub Copilot's token-based billing, MAI-Code-1-Flash is priced below Haiku 4.5. The benchmark gap (51.2% vs 35.2% on SWE-Bench Pro) is large relative to the price difference.
What is Azure AI Foundry?
Azure AI Foundry is Microsoft's unified platform for fine-tuning, deploying, and running AI models on Azure infrastructure. It is the primary access point for the MAI model family. Organizations can use it to fine-tune MAI-Thinking-1 for specific domains, access MAI-Code-1-Flash via API, and manage inference costs across their AI workloads.
SaaS Master
Creator behind SaaS Master: tutorials, walkthroughs, reviews, and explainers that help SaaS, AI, and WordPress products get understood and chosen. Writing here about the tools, trends, and tactics that actually move the needle.
