
OpenAI’s new open models are now available through Microsoft’s Azure AI Foundry and Windows AI Foundry, meaning developers can build, test, and deploy applications powered by gpt-oss-20b and gpt-oss-120b within Microsoft’s AI development platforms.
Sam Altman’s AI startup launched its first “open weight” models since 2019 on Tuesday. The smaller of the two, gpt-oss-20b, has 21 billion parameters and is similar in performance to o3-mini. It needs only 16GB of VRAM, so it can run on a range of Windows hardware, including consumer GPUs with enough memory.
The larger, gpt-oss-120b, has 117 billion parameters, can run on a single 80GB NVIDIA GPU, and is comparable in performance to o4-mini. Offered as a serverless option on Azure cloud infrastructure, the model is priced from $0.15 per million input tokens and $0.60 per million output tokens.
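At those quoted rates, estimating the serverless cost of a request is simple arithmetic. A minimal sketch (the rates below are taken from the figures above; actual Azure billing tiers may differ):

```python
# Estimate serverless cost for gpt-oss-120b at the quoted Azure rates.
# Rates come from the article; real billing may include other charges.
INPUT_RATE = 0.15 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.60 / 1_000_000  # USD per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 2,000-token prompt producing a 500-token reply
print(f"${estimate_cost(2_000, 500):.6f}")  # $0.000600
```

Note that output tokens cost four times as much as input tokens, so long generations dominate the bill.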
The gpt-oss-20b model uses managed compute, and its pricing depends on the Azure Machine Learning virtual machine type selected, rather than tokens processed.
In addition to Azure AI Foundry, gpt‑oss‑20b is available locally through Windows AI Foundry for Windows 11, with macOS support coming soon through Foundry Local. Both models are also available on more than a dozen other third-party platforms, including Hugging Face, Cloudflare, and AWS.
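For local use, Microsoft’s Foundry Local tooling exposes models through a command-line interface. A sketch of the invocation, assuming Foundry Local is installed and `gpt-oss-20b` is the model alias it uses (check the catalog on your machine):

```shell
# Download (on first run) and start an interactive session with the model.
# The alias is an assumption; `foundry model list` shows what is available.
foundry model run gpt-oss-20b
```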
What the new AI models can do, and how they do it
OpenAI says the gpt-oss models can perform advanced reasoning tasks, write code, search the web, and power autonomous agents that act on the user’s behalf. However, they were trained on a text-only dataset, so they cannot process or generate images, audio, or other media types.
Both OpenAI models use a “mixture of experts” architecture to improve efficiency, with gpt-oss-120b activating only 5.1 billion of its parameters per token. They were trained using reinforcement learning, in which a model interacts with simulated environments and receives feedback in the form of rewards that shape its future decisions.
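The efficiency gain comes from a learned router that sends each token to only a few of the layer’s experts, so most parameters sit idle on any given token. A toy illustration of top-k routing (not OpenAI’s implementation; all sizes here are made-up small values):

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total experts in the layer (toy value)
TOP_K = 2         # experts activated per token (toy value)
DIM = 16          # hidden dimension (toy value)

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router_w = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token through only TOP_K of the NUM_EXPERTS experts."""
    logits = x @ router_w                 # router score per expert
    top = np.argsort(logits)[-TOP_K:]     # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only the selected experts' parameters are touched for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(DIM)
out = moe_layer(token)
print(out.shape)  # (16,)
```

Here only 2 of the 8 experts run per token, mirroring how gpt-oss-120b uses roughly 5.1 billion of its 117 billion parameters on each token.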
Developers are free to customise the open models on Azure AI Foundry
The gpt-oss models have been licensed under Apache 2.0, allowing for free use, modification, and redistribution, including for commercial purposes. Microsoft says Azure AI Foundry offers tools for fast model training, weight management, and low-latency deployment. Custom versions, or checkpoints, can be shipped in a matter of hours rather than weeks.
Azure AI Foundry’s tools allow developers to customise the gpt-oss models in a variety of ways, including fine-tuning them on proprietary data, optimising them for edge devices, and exporting them for deployment in containerised environments such as Azure Kubernetes Service or Foundry Local.
“Whether you’re adapting for a domain-specific copilot, compressing for offline inference, or prototyping locally before scaling in production, Azure AI Foundry and Foundry Local give you the tooling to do it all,” Microsoft said in its press release.
Both models will soon be made compatible with OpenAI’s Responses API, which lets developers swap them into existing apps with minimal code changes.
Microsoft is collecting AI partnerships for its Azure AI Foundry
In May, Microsoft added Elon Musk’s Grok models to Azure AI Foundry, a controversial move given its close partnership with OpenAI and Musk’s ongoing feud with the company.
It has also struck non-exclusive licensing deals with Inflection AI and Mistral AI, and Azure AI Foundry features thousands of foundation models from the likes of Meta, DeepSeek, NVIDIA, Cohere, and Hugging Face. The broader the model selection, the more attractive the platform becomes to businesses and developers.
Tensions between OpenAI and Microsoft have escalated recently as the two companies renegotiate the terms of their long-standing partnership.