iToverDose/Startups · 1 JUNE 2026 · 20:00

MiniMax M3 challenges big AI models with open access and million-token context

A Chinese AI startup’s new model claims enterprise-grade performance at a fraction of the cost of Western giants, along with a planned open-weights release and a 1-million-token context window. What does this mean for the future of AI development?

VentureBeat · 3 min read

A fresh wave of disruption is rippling through the enterprise AI landscape after Chinese startup MiniMax unveiled its flagship language model, MiniMax-M3, over the weekend. The company claims the new system redefines the cost-performance frontier by combining advanced coding ability, native multimodality, and a million-token context window—all priced at a small fraction of what leading U.S. providers charge.

MiniMax positioned M3 as a direct challenge to proprietary heavyweights like OpenAI, Google, and Anthropic, pitching it as both faster and far more affordable. For the first week after launch, MiniMax is offering a limited-time discount: $0.30 per million input tokens and $1.20 per million output tokens (cached). Even after the discount expires, the model’s full rate of $0.60 per million input tokens and $2.40 per million output tokens remains well below competing offerings, typically under 10% of what those providers charge.
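To make the quoted rates concrete, here is a minimal sketch of what one long-context job would cost at the launch and full rates. The rates come from the article; the workload size (800,000 input tokens, 20,000 output tokens) and the function name `job_cost` are illustrative assumptions, not figures from MiniMax.

```python
# Rates quoted in the article, in US dollars per million tokens.
LAUNCH_RATES = {"input": 0.30, "output": 1.20}  # first-week discount
FULL_RATES = {"input": 0.60, "output": 2.40}    # standard rate


def job_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Return the dollar cost of one job at the given per-million-token rates."""
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical workload: a large document set in, a short report out.
IN_TOKENS, OUT_TOKENS = 800_000, 20_000

launch = job_cost(IN_TOKENS, OUT_TOKENS, LAUNCH_RATES)
full = job_cost(IN_TOKENS, OUT_TOKENS, FULL_RATES)
print(f"launch: ${launch:.3f}, full: ${full:.3f}")
```

Under these assumptions the job costs roughly $0.26 at the launch rate and $0.53 at the full rate. Input tokens account for most of that, which is typical of long-context workloads.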

Beyond competitive pricing, MiniMax announced plans to release the model under an open-weights license within the next 10 days, enabling enterprises to download, customize, and deploy the system without licensing fees. This marks a major shift from the closed-source approach that has dominated high-performance AI for years.

Breaking the cost-performance ceiling in enterprise AI

Historically, organizations have faced a stark trade-off when selecting large language models. Closed-source systems from Google, OpenAI, and Anthropic deliver cutting-edge reasoning and coding skills but come with steep price tags and restrictive APIs. Open-source alternatives, in contrast, are budget-friendly but often lag in complex tasks, long-context handling, and multimodal reasoning.

MiniMax-M3 aims to dismantle that dichotomy. By integrating native multimodality from the ground up and introducing a novel attention architecture, the model reportedly achieves performance comparable to top-tier closed systems while operating at a fraction of their cost. According to MiniMax’s internal benchmarks, M3 surpasses recent releases from Google and OpenAI on key developer-focused tests, including code generation and multi-step reasoning scenarios.

The pricing disparity is stark. For example, OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro Preview command up to $35 and $22 per million tokens, respectively. MiniMax-M3, by contrast, is priced at roughly one-tenth of those figures or less under its full-rate plan, while claiming similar capabilities.
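The comparison can be checked with simple arithmetic. The sketch below divides M3's full rates by the competitor figures quoted above. The article does not say whether the $35 and $22 figures are input or output rates, so treating each as a single per-million-token ceiling is an assumption, as is the helper name `price_ratios`.

```python
# M3 full rates from the article, in US dollars per million tokens.
M3_FULL = {"input": 0.60, "output": 2.40}

# "Up to" figures quoted in the article; assumed here to be single ceilings.
QUOTED_CEILINGS = {"GPT-5.5": 35.0, "Gemini 3.1 Pro Preview": 22.0}


def price_ratios(m3_rates: dict, ceilings: dict) -> dict:
    """Map each competitor to M3's input and output rate as a fraction of its ceiling."""
    return {
        name: {kind: rate / ceiling for kind, rate in m3_rates.items()}
        for name, ceiling in ceilings.items()
    }


ratios = price_ratios(M3_FULL, QUOTED_CEILINGS)
for name, r in ratios.items():
    print(f"{name}: input {r['input']:.1%}, output {r['output']:.1%}")
```

With these inputs, M3's input rate is under 3% of either ceiling. Its output rate is about 7% of the $35 figure and about 11% of the $22 figure, so the exact multiple depends on which rates are compared.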

How MiniMax keeps costs low with new attention techniques

The engineering breakthrough behind M3’s efficiency is MiniMax Sparse Attention (MSA), a radical departure from traditional Transformer attention mechanisms. While standard attention scales quadratically with input size, MSA uses a sparse, block-based approach that dramatically reduces computational overhead.
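The scaling difference can be illustrated by counting how many query-key pairs each approach has to score. This is a rough sketch: the block size and number of blocks per query are hypothetical values chosen for illustration, not MiniMax's settings, and the count ignores the cost of deciding which blocks to read.

```python
def dense_pairs(n: int) -> int:
    """Query-key pairs scored by full attention over n tokens (no causal mask)."""
    return n * n


def block_sparse_pairs(n: int, block_size: int, top_k: int) -> int:
    """Pairs scored when each query reads at most top_k blocks of block_size keys."""
    keys_per_query = min(n, block_size * top_k)
    return n * keys_per_query


# Hypothetical settings, not taken from MiniMax.
BLOCK_SIZE, TOP_K = 128, 16

for n in (8_000, 128_000, 1_000_000):
    d = dense_pairs(n)
    s = block_sparse_pairs(n, BLOCK_SIZE, TOP_K)
    print(f"n={n:>9,}: dense={d:.2e} sparse={s:.2e} ratio={s / d:.4%}")
```

Dense attention grows with the square of the context length, while the block-sparse count grows linearly once the context exceeds `BLOCK_SIZE * TOP_K` tokens. The gap therefore widens as the context approaches 1 million tokens.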

Instead of re-reading the entire context for each query, MSA acts like an intelligent curator. It partitions the Key-Value (KV) matrix into blocks and retrieves only the relevant ones for each query. Because each selected block is read as a unit, memory access remains contiguous, minimizing cache misses and maximizing hardware utilization.
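The general idea of block selection can be sketched in a few lines of plain Python. This is a generic block top-k scheme, not MiniMax's MSA, whose details the article does not give. Scoring blocks by the mean of their keys, the function name `block_sparse_attention`, and its parameters are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(query, keys, values, block_size=2, top_k=1):
    """Attend from one query to only the top_k best-scoring key blocks.

    Generic illustration only; NOT MiniMax's published algorithm.
    Returns the output vector and the key indices that were read.
    """
    # 1. Partition the KV sequence into contiguous blocks.
    starts = range(0, len(keys), block_size)
    blocks = [(s, min(s + block_size, len(keys))) for s in starts]

    # 2. Score each block cheaply using the mean of its keys (an assumption).
    def block_score(span):
        lo, hi = span
        mean_key = [sum(k[d] for k in keys[lo:hi]) / (hi - lo)
                    for d in range(len(query))]
        return dot(query, mean_key)

    chosen = sorted(blocks, key=block_score, reverse=True)[:top_k]
    chosen.sort()  # read the selected blocks in memory order

    # 3. Run ordinary softmax attention over the selected keys only.
    idx = [i for lo, hi in chosen for i in range(lo, hi)]
    scale = math.sqrt(len(query))
    scores = [dot(query, keys[i]) / scale for i in idx]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    out = [sum(w * values[i][d] for w, i in zip(weights, idx)) / total
           for d in range(len(values[0]))]
    return out, idx


# Toy data: the second block's keys point the same way as the query.
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
values = [[1.0], [2.0], [3.0], [4.0]]
out, used = block_sparse_attention([0.0, 5.0], keys, values)
print(used, out)
```

In the toy call, only keys 2 and 3 are read, so half of the KV entries are never touched. Setting `top_k` to the total number of blocks makes the function equivalent to ordinary full attention.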

In controlled tests, MSA runs over four times faster than comparable open-source sparse attention methods like Flash-Sparse-Attention. At the maximum context length of 1 million tokens, M3 reduces per-token compute demand to just 5% of its predecessor’s, accelerating prefill by 9x and decoding by 15x.
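The reported multipliers can be turned into an end-to-end estimate with a short calculation. The speedup factors and the 5% figure come from the article; the baseline timings for the predecessor are hypothetical placeholders, since the article gives no absolute latencies.

```python
# Figures from the article, at the 1-million-token context length.
PREFILL_SPEEDUP = 9.0
DECODE_SPEEDUP = 15.0
COMPUTE_FRACTION = 0.05  # per-token compute relative to the predecessor


def new_latency(prefill_s: float, decode_s: float) -> float:
    """Estimated latency after applying the reported speedups to each phase."""
    return prefill_s / PREFILL_SPEEDUP + decode_s / DECODE_SPEEDUP


def overall_speedup(prefill_s: float, decode_s: float) -> float:
    """End-to-end speedup for a request with the given baseline phase timings."""
    return (prefill_s + decode_s) / new_latency(prefill_s, decode_s)


# Hypothetical predecessor timings for one request, in seconds.
BASE_PREFILL_S, BASE_DECODE_S = 90.0, 30.0

latency = new_latency(BASE_PREFILL_S, BASE_DECODE_S)
speedup = overall_speedup(BASE_PREFILL_S, BASE_DECODE_S)
compute_reduction = 1 / COMPUTE_FRACTION

print(f"latency: {latency:.1f}s, speedup: {speedup:.1f}x, "
      f"compute reduction: {compute_reduction:.0f}x")
```

The end-to-end speedup always falls between 9x and 15x, depending on how much of a request is spent in prefill versus decoding. With the placeholder timings above it comes to 10x.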

These gains translate directly into lower operational costs and faster inference—critical factors for enterprises running large-scale AI workloads.

Built for multimodality from the ground up

Unlike many models that bolt on vision capabilities after pretraining, MiniMax engineered M3 to handle text, images, and visual data natively from day one. The company rebuilt its data pipeline to ingest interleaved sequences of text and visual elements, amassing a pretraining corpus exceeding 100 trillion tokens.

This unified approach enables M3 to interpret complex visual inputs—such as code diagrams, charts, or geographical maps—and translate them directly into structured code or detailed descriptions with high fidelity. In standardized evaluations, the model demonstrates strong performance in tasks requiring cross-modal understanding, setting a new bar for open-weights systems.

What this means for the future of AI development

The launch of MiniMax-M3 signals a potential inflection point in the AI industry. By delivering enterprise-grade intelligence at consumer-level prices—and with open-source access—MiniMax is empowering smaller teams and startups to build advanced AI applications without being locked into expensive proprietary platforms.

As open-weights models close the gap with closed systems in reasoning, coding, and multimodality, the balance of power may shift toward transparency, customization, and cost efficiency. With M3’s open release on the horizon, the coming months could redefine how businesses adopt and scale AI solutions globally.

AI summary

China’s MiniMax is revolutionizing the AI market with its M3 model. Outperforming GPT-5.5 and Gemini 3.1 on both performance and cost, the model is reshaping the industry with its open-source strategy.
