iToverDose / Startups · 28 April 2026 · 00:00

Xiaomi’s new open-source AI models slash agent costs with 310B-parameter power

Xiaomi has launched two open-source AI models designed to outperform proprietary rivals on task-completion efficiency, slashing token usage by up to 60%. With a 310B-parameter architecture and a 1M-token context window, they’re reshaping enterprise AI economics for developers worldwide.

VentureBeat · 4 min read

Xiaomi is making waves in the AI landscape with its latest open-source large language models, MiMo-V2.5 and MiMo-V2.5-Pro, which deliver cutting-edge performance while keeping costs surprisingly low. These models join Xiaomi’s growing portfolio of enterprise-friendly AI tools, available under the MIT License for unrestricted use in commercial and personal projects.

Developers can now access both versions directly from Hugging Face, where they can be downloaded, customized, and deployed locally or on cloud infrastructure. The standout feature of these models is their efficiency in agentic "claw" tasks, where AI agents autonomously complete user-requested actions—such as generating marketing content, managing accounts, or organizing schedules—via third-party messaging platforms.
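For a sense of what local or cloud deployment looks like in practice, the sketch below assembles an OpenAI-compatible chat request for a self-hosted serving endpoint (such as a vLLM server). The repo id `XiaomiMiMo/MiMo-V2.5` and the localhost URL are assumptions for illustration, not confirmed identifiers from Xiaomi's release.

```python
import json

MODEL_ID = "XiaomiMiMo/MiMo-V2.5"  # hypothetical Hugging Face repo id
ENDPOINT = "http://localhost:8000/v1/chat/completions"  # e.g. a local vLLM server

def build_chat_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-compatible request body for a self-hosted model."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,  # low temperature suits deterministic agent steps
    }

body = build_chat_request("Draft a launch tweet for our product.")
payload = json.dumps(body)  # serialized body, ready to POST to ENDPOINT
```

Because the models ship under the MIT License, the same request shape works whether the endpoint runs on a workstation or on rented cloud GPUs.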

Benchmarks reveal superior efficiency in agent tasks

According to Xiaomi’s published benchmarks, MiMo-V2.5 and MiMo-V2.5-Pro rank among the most efficient open-source models for agent-driven workflows. The ClawEval benchmark chart places them near the top left, indicating high task success rates with minimal token consumption—a critical advantage as usage-based billing models like GitHub Copilot gain traction.

The Pro version leads the open-source field with a 63.8% success rate, requiring only about 70,000 tokens per trajectory. This represents a 40–60% reduction in token usage compared to proprietary models like Anthropic’s Claude Opus 4.6, Google’s Gemini 3.1 Pro, and OpenAI’s GPT-5.4, which demand significantly more tokens to achieve comparable results. For enterprises sensitive to operational costs, this efficiency translates into substantial savings, especially when scaling agentic workflows.
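The claimed reduction can be sanity-checked with simple arithmetic: if ~70,000 tokens per trajectory represents a 40–60% cut, the implied proprietary baseline is roughly 117,000–175,000 tokens. A minimal calculation, using only the figures stated above:

```python
def implied_baseline_tokens(reduced_tokens: float, reduction: float) -> float:
    """If reduced_tokens is (1 - reduction) of some baseline, recover that baseline."""
    return reduced_tokens / (1.0 - reduction)

# MiMo-V2.5-Pro: ~70,000 tokens per trajectory, at a claimed 40-60% reduction
low_end = implied_baseline_tokens(70_000, 0.40)   # ~116,667 tokens
high_end = implied_baseline_tokens(70_000, 0.60)  # 175,000 tokens
```

At usage-based billing rates, that gap compounds across every agent trajectory an enterprise runs.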

Dual models tailored for multimodal and agentic workloads

Xiaomi has engineered two distinct versions of MiMo-V2.5 to address different use cases. The base model, MiMo-V2.5, is optimized for multimodal interactions, while the Pro variant specializes in long-horizon agentic tasks and complex software engineering challenges.

On the GDPVal-AA (Elo) benchmark, MiMo-V2.5-Pro achieved a score of 1581, outperforming competitors such as Kimi K2.6 and GLM 5.1. Xiaomi’s research highlights the Pro model’s ability to maintain coherence over extended tool interactions, a feature the company describes as "harness awareness"—where the model actively manages its own memory and context to sustain performance during prolonged sequences of operations.

Real-world demonstrations of autonomous capability

Xiaomi has provided detailed examples of MiMo-V2.5-Pro’s autonomous performance in high-complexity tasks:

  • Rust-based SysY Compiler: The model developed a fully functional compiler, including lexer, parser, and RISC-V assembly backend, in just 4.3 hours. It executed 672 tool calls and achieved a perfect score of 233/233 on hidden test suites—a task typically requiring weeks for a computer science student.
  • Desktop Video Editor: Over 11.5 hours and 1,868 tool calls, the model generated an 8,192-line application with multi-track timelines and a robust export pipeline.
  • Analog EDA Optimization: In a graduate-level engineering challenge, the model optimized a Flipped-Voltage-Follower (FVF-LDO) regulator in the TSMC 180nm process. By iterating through an ngspice simulation loop, it improved line regulation 22-fold over its initial attempt.


These demonstrations underscore the model’s ability to autonomously navigate and solve intricate, multi-step problems with minimal human intervention.

Competitive pricing disrupts the AI market

Xiaomi is positioning MiMo-V2.5 and MiMo-V2.5-Pro as cost-effective alternatives to established proprietary models. The pricing structure is designed to appeal to both domestic and international developers, with rates varying based on context window size and token cache efficiency.

For overseas developers, MiMo-V2.5-Pro is priced at $1.00 per million input tokens (for cache misses) and $3.00 per million output tokens within context windows up to 256K. For ultra-long contexts between 256K and 1M tokens, costs double to $2.00 for input and $6.00 for output, though Xiaomi’s caching mechanisms can reduce input costs to as low as $0.20–$0.40 per million tokens on cache hits. The base model starts at $0.40 per million input tokens and $2.00 per million output tokens overseas.
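The overseas Pro-tier rates above can be folded into a small cost estimator. This is a sketch based solely on the published numbers; how the $0.20 vs. $0.40 cache-hit rates map onto the two context tiers is an assumption, since the article only gives a range.

```python
def pro_request_cost_usd(input_tokens: int, output_tokens: int,
                         context_tokens: int, cache_hit: bool = False) -> float:
    """Estimate overseas MiMo-V2.5-Pro cost for one request, in USD.

    Rates per million tokens: $1.00 in / $3.00 out up to 256K context;
    $2.00 in / $6.00 out for 256K-1M. Cache-hit input rates of $0.20/$0.40
    are an assumed per-tier split of the article's quoted range.
    """
    if context_tokens <= 256_000:
        in_rate, out_rate = (0.20 if cache_hit else 1.00), 3.00
    else:
        in_rate, out_rate = (0.40 if cache_hit else 2.00), 6.00
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A typical agent trajectory: 70K input + 8K output, well inside 256K context
cost = pro_request_cost_usd(70_000, 8_000, 78_000)  # $0.07 + $0.024 = $0.094
```

At these rates, a full Pro trajectory costs on the order of ten US cents on a cache miss, and less than half that when cached input dominates.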

Domestically in China, the Pro model begins at ¥7.00 per million input tokens for standard contexts, scaling to ¥14.00 for the extended 1M range. The base model’s domestic pricing starts at ¥3.50 per million input tokens and ¥10.00 per million for output.

A shift toward open, efficient AI deployment

With MiMo-V2.5 and MiMo-V2.5-Pro, Xiaomi is challenging the dominance of closed-source models from tech giants like Google and OpenAI, particularly in agentic workloads where efficiency and cost are paramount. By combining a massive 310B-parameter architecture with a native 1M-token context window and aggressive token optimization, the models offer a compelling value proposition for developers seeking high performance without the premium pricing of proprietary alternatives.

As enterprises increasingly adopt AI agents for automation, Xiaomi’s latest releases could redefine industry standards for both capability and affordability in the open-source AI ecosystem.

Future updates may expand the models’ tool integrations and refine their caching mechanisms, further enhancing their efficiency and versatility in production environments.

AI summary

Xiaomi's new MiMo-V2.5 and V2.5-Pro models use 40–60% fewer tokens than rivals on agent-based tasks. Details on the pricing and performance of these open-source, MIT-licensed models.
