Microsoft Launched 7 MAI Models at Build 2026: The Complete Breakdown

June 12, 2026 · 7 min read · By SaaS Master

Microsoft did not launch one AI model at Build 2026 on June 2, 2026 — it launched seven. The full MAI model family covers reasoning, coding, image generation, transcription, and voice. Every model in the family was trained from scratch without distillation from OpenAI or any other third-party model. Here is what each one does, what it costs, and what the announcement means for teams choosing where to build their AI stack.

Key takeaways

Microsoft unveiled 7 first-party MAI models at Build 2026, all trained without OpenAI data
MAI-Thinking-1 is a Mixture-of-Experts reasoning model with 35B active parameters that scored 97.0% on AIME 2025, competitive with o3
MAI-Code-1-Flash costs $0.75 per million input tokens and outperforms Claude Haiku 4.5 on SWE-bench Pro (51.2% versus 35.2%)
All models run on Azure AI Foundry and are optimized for the Maia 200 chip, Microsoft's own AI silicon
The independence from OpenAI is a deliberate enterprise strategy signal, not just a technical footnote

Why Build 2026 was different

Microsoft has offered AI models for years through Azure OpenAI Service — but those were always third-party models: GPT-4, o3, DALL-E. Build 2026 marks the first time Microsoft released its own first-party model family, trained on its own infrastructure, on its own silicon, with its own data pipeline. The announcement is strategically significant because it gives Microsoft leverage over OpenAI that it did not previously have, and it gives enterprise customers an AI stack that does not depend on the OpenAI relationship continuing on current terms.

Here is a breakdown of each model announced.

MAI-Thinking-1: the reasoning flagship

MAI-Thinking-1 is a 35-billion active-parameter Mixture-of-Experts model designed for extended reasoning tasks — math, science, complex code, and multi-step decision chains. It scored 97.0% on AIME 2025 and 94.5% on AIME 2026, placing it alongside o3 on the hardest publicly available math competition benchmarks. On SWE-bench Pro it landed at 53%, matching the range of leading reasoning models on that benchmark.

The 256K context window is smaller than Claude Opus 4.6's 1M but larger than o3's 200K. The model is available in private preview on Azure AI Foundry and through GitHub Models for free-tier prototyping. Production pricing has not been published; Azure AI Foundry uses pay-as-you-go billing on reasoning tokens consumed.

Microsoft is co-designing MAI-Thinking-1 with the Maia 200 chip and reports a 1.4x performance-per-watt gain when running MAI models end-to-end on that silicon. For enterprise teams running AI at scale in Azure data centers, that efficiency gain should translate into lower compute cost, since the same work draws less power.
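To put the 1.4x figure in concrete terms, here is a rough sketch of the arithmetic. It assumes the gain applies uniformly to a fixed workload and that nothing else about the deployment changes. The function name and the 1,000 kWh baseline are illustrative placeholders, not Microsoft figures.

```python
def energy_for_same_work(baseline_kwh: float, perf_per_watt_gain: float) -> float:
    """Energy needed to do the same work after a performance-per-watt gain.

    A gain of 1.4x means each watt does 1.4x the work, so the same
    workload needs 1/1.4 of the original energy.
    """
    if perf_per_watt_gain <= 0:
        raise ValueError("perf_per_watt_gain must be positive")
    return baseline_kwh / perf_per_watt_gain


baseline_kwh = 1000.0  # hypothetical baseline, chosen only for illustration
new_kwh = energy_for_same_work(baseline_kwh, 1.4)
savings_pct = (1 - new_kwh / baseline_kwh) * 100

print(f"{new_kwh:.1f} kWh, {savings_pct:.1f}% less energy")
# prints "714.3 kWh, 28.6% less energy"
```

Under those assumptions, a 1.4x performance-per-watt gain works out to roughly 29% less energy for the same workload. Actual savings depend on utilization, cooling, and how much of the bill is energy in the first place.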

MAI-Code-1-Flash: budget coding where Haiku falls short

MAI-Code-1-Flash is a 5-billion active-parameter model built specifically for everyday coding tasks. At $0.75 per million input tokens, it is cheaper than Claude Haiku 4.5. On SWE-bench Pro it scores 51.2% versus Haiku's 35.2% — a 16-point lead on one of the most demanding coding benchmarks available. It also solves harder problems with up to 60% fewer tokens than comparable small coding models.
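For budgeting, the input-side cost is simple to estimate from the published rate. The sketch below uses only the $0.75 per million input tokens figure from this article. Output-token pricing is not stated here, so it is left out, and the monthly token volume is a hypothetical workload rather than a measured one.

```python
INPUT_PRICE_PER_MILLION_USD = 0.75  # input-token rate quoted in this article


def input_cost_usd(input_tokens: int,
                   price_per_million: float = INPUT_PRICE_PER_MILLION_USD) -> float:
    """Estimate input-token spend in USD. Output tokens are not included."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * price_per_million


monthly_input_tokens = 200_000_000  # hypothetical workload for illustration
print(f"${input_cost_usd(monthly_input_tokens):.2f}")
# prints "$150.00"
```

Treat the result as a lower bound on spend: a real bill also includes output tokens, which this estimate deliberately omits.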

As of June 2, 2026, MAI-Code-1-Flash is the default model selected under GitHub Copilot's auto router in Visual Studio Code for all paying plans. It is also available on OpenRouter, Fireworks, and Baseten. For teams that have been using Haiku as a fast, cheap code-completion model, this is a direct upgrade at a lower input-token price.

MAI-Image-1: first-party image generation

MAI-Image-1 is Microsoft's own image generation model, replacing the reliance on DALL-E in Microsoft 365 Copilot. The model is integrated directly into Copilot's design and content creation workflows. For enterprise customers, the key distinction is data privacy — images generated through MAI-Image-1 in a Microsoft 365 environment are processed under Microsoft's enterprise data agreements, not OpenAI's. That distinction matters in regulated industries and organizations that have restrictions on sending content to third-party APIs.

MAI-Image-1 is available through Azure AI Foundry.

MAI-Transcribe-1 and MAI-Voice-1: the speech pair

Microsoft's speech pipeline now runs on first-party models instead of Whisper for transcription. MAI-Transcribe-1 handles speech-to-text with lower latency than the Whisper integration it replaces inside Copilot. MAI-Voice-1 is the corresponding text-to-speech model, designed for expressive output with support for custom voice profiles. Together, they form the audio backbone of Microsoft's Copilot voice experiences and are available separately through Azure AI Foundry for developers building their own voice applications.

The Maia 200 angle

Every MAI model is optimized for Microsoft's own AI accelerator chip, the Maia 200. The reported 1.4x performance-per-watt improvement is the clearest example, but the chip advantage extends across the family. It gives Microsoft a different, and potentially lower, cost structure for running these models than running equivalent third-party models on Nvidia hardware. Over time, that advantage could be passed through to customers or used to sustain lower pricing than competitors.

What this means for builders and SaaS teams

If you are building AI features inside Microsoft's ecosystem — Copilot, Azure, Teams, or Microsoft 365 — the MAI models make a compelling case to stay there. You get first-party enterprise compliance, a coding model that beats Haiku at a lower price, a reasoning model that matches o3 on math benchmarks, and a full speech pipeline, all under one billing relationship.

If you are building outside Microsoft's ecosystem, the immediate impact is on competition — a cheaper, capable coding model from Microsoft puts pressure on Anthropic and OpenAI's small-model pricing tiers. MAI-Code-1-Flash on OpenRouter is available to anyone, not just Azure customers.

From a creator and tutorial perspective, this is the most interesting AI infrastructure story of 2026 so far. Microsoft trained these models independently, deployed them on its own silicon, and made them available across multiple channels in one announcement. That is a different kind of move than adding a feature to Copilot.

Frequently asked questions

How many models did Microsoft announce at Build 2026?

Microsoft announced seven new first-party MAI models at Build 2026 on June 2, 2026. The family includes MAI-Thinking-1 for reasoning, MAI-Code-1-Flash for coding, MAI-Image-1 for image generation, MAI-Transcribe-1 for speech-to-text, and MAI-Voice-1 for text-to-speech, plus two additional models in preview.

Were the MAI models trained using OpenAI data?

No. Microsoft explicitly states that every MAI model was trained from scratch on clean, commercially licensed data without distillation from OpenAI or any other third-party model. This is a deliberate enterprise positioning choice that gives customers confidence about data provenance.

What is MAI-Code-1-Flash pricing?

MAI-Code-1-Flash is priced at $0.75 per million input tokens on Azure AI Foundry and is cheaper than Claude Haiku 4.5. It is available to all paying GitHub Copilot users as a default model in VS Code, and on third-party providers including OpenRouter, Fireworks, and Baseten.
