
OpenAI launches two new open-weight AI models: gpt-oss-120b, gpt-oss-20b
The new models are optimised for efficient deployment and outperform similarly sized open models on reasoning tasks, demonstrating strong tool use capabilities
OpenAI has released gpt-oss-120b and gpt-oss-20b — two open-weight language models that promise to deliver strong real-world performance at low cost.
These models are said to outperform similarly sized open models on reasoning tasks, demonstrate strong tool use capabilities, and are reportedly optimised for efficient deployment on consumer hardware.
OpenAI's two new GPT models
According to OpenAI, these models were trained using a mix of reinforcement learning and techniques informed by the company's most advanced internal models, including o3 and other frontier systems.
The gpt-oss-120b model achieves near-parity with OpenAI o4-mini on core reasoning benchmarks, while running efficiently on a single 80 GB GPU.
The gpt-oss-20b model delivers similar results to OpenAI o3‑mini on common benchmarks and can run on edge devices with just 16 GB of memory, making it ideal for on-device use cases, local inference, or rapid iteration without costly infrastructure.
These models are designed for agentic workflows, with strong instruction following, tool use such as web search or Python code execution, and reasoning capabilities, including the ability to reduce reasoning effort for tasks that don't require complex reasoning or that target very low latency.
They are entirely customisable, provide full chain-of-thought (CoT), and support structured outputs.
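In practice, the adjustable reasoning effort is typically selected per request rather than baked into the model. As a minimal sketch, assuming an OpenAI-compatible chat-completions payload and a "Reasoning: low" system-prompt convention (both assumptions based on OpenAI's published guidance; the exact wiring varies by inference provider), a low-effort request might be assembled like this:

```python
import json

# Hypothetical chat-completions payload for a gpt-oss model.
# The model name and the "Reasoning: low" system-prompt convention are
# assumptions; check your inference provider's documentation.
payload = {
    "model": "gpt-oss-20b",
    "messages": [
        # Lowering reasoning effort trades depth for latency on simple tasks.
        {"role": "system", "content": "Reasoning: low"},
        {"role": "user", "content": "Summarise this release in one sentence."},
    ],
    "stream": False,
}

# Serialised bytes, ready to POST to a chat-completions endpoint.
body = json.dumps(payload).encode("utf-8")
```

The same payload shape works for high-effort requests by changing the system message, which is what lets one deployment serve both quick answers and deeper chain-of-thought reasoning.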
Partnering with industry leaders
OpenAI says that gpt-oss-120b and gpt-oss-20b have been designed to be flexible and easy to run anywhere—locally, on-device, or through third-party inference providers.
To support this, the tech giant has reportedly partnered with leading deployment platforms such as Azure, Hugging Face, vLLM, Ollama, llama.cpp, LM Studio, AWS, Fireworks, Together AI, Baseten, Databricks, Vercel, Cloudflare, and OpenRouter to make the models broadly accessible to developers.
On the hardware side, the company has worked with NVIDIA, AMD, Cerebras, and Groq to ensure optimised performance across a range of systems.
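Because the weights are open, any of these runtimes can expose the models behind an OpenAI-compatible local endpoint. The sketch below builds (but does not send) such a request using only the Python standard library; the default host and the "gpt-oss:20b" tag are assumptions modelled on Ollama's local API, so substitute whatever your chosen runtime exposes:

```python
import json
import urllib.request


def build_chat_request(prompt: str,
                       host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a chat request for a locally hosted gpt-oss model.

    The default host and the "gpt-oss:20b" tag are assumptions based on
    Ollama's OpenAI-compatible endpoint; adjust them for vLLM, LM Studio, etc.
    """
    payload = {
        "model": "gpt-oss:20b",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{host}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("What hardware do I need to run you?")
# urllib.request.urlopen(req) would perform the call once a local server is up.
```

Keeping the request-building separate from the network call makes it easy to swap providers: only the host URL and model tag change, while the message format stays the same across platforms.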
OpenAI also put these models through some serious safety testing, using deliberative alignment and the instruction hierarchy to teach the model to refuse unsafe prompts and defend against prompt injections.
Fully customisable
Powered by ONNX Runtime, these models support local inference and are available through Foundry Local and the AI Toolkit for VS Code, making it easier for Windows developers to build with open models.
For developers who want fully customisable models they can fine-tune and deploy in their own environments, gpt-oss is a great fit.
For those seeking multimodal support, built-in tools, and seamless integration with OpenAI's platform, the models available through its API remain the best option.
The models are available under the Apache 2.0 license, so developers can modify them however they want.