iToverDose / Startups · 12 June 2026

Xiaomi’s MiMo Code challenges Claude Code with long-horizon AI coding

Xiaomi’s open-source AI coding agent MiMo Code uses persistent memory and a checkpoint system to beat Anthropic’s Claude Code on complex, multi-step tasks in Xiaomi’s own tests. It is available now, with free limited-time access to MiMo-V2.5.

VentureBeat · 4 min read

Xiaomi’s MiMo AI team has launched MiMo Code V0.1.0, a terminal-native AI coding assistant designed to tackle long-horizon, multi-step development tasks more effectively than Anthropic’s Claude Code. The tool, released as open source under the MIT license, introduces a cross-session memory system that addresses a critical limitation in AI coding agents: context loss over extended work sessions.

MiMo Code is available now on GitHub. It installs with a single command on macOS and Linux (curl -fsSL | bash) or through npm on Windows (npm install -g @mimo-ai/cli). To further lower the barrier to entry, Xiaomi is offering limited-time free access, with no registration required, to its multimodal flagship model, MiMo-V2.5, which has a one-million-token context window.

How MiMo Code combats AI coding agent amnesia

One of the most persistent challenges in AI-assisted coding is that performance degrades as the context window fills up. Earlier decisions, conventions, and project state get compressed or lost, forcing developers to re-explain their work repeatedly. Xiaomi argues that relying on ever-better compression is fundamentally unsustainable for complex projects.

"We need a shift from better compression to structured memory management," the MiMo team explained in their launch blog. "Information should be stored, retrieved, and recalled based on relevance—not just retained by brute force."

MiMo Code introduces a four-layer memory architecture to solve this:

  • Project memory – A persistent MEMORY.md file that stores high-level project decisions.
  • Session checkpoints – Structured snapshots of workflow progress.
  • Scratch notes – Temporary annotations for in-the-moment insights.
  • Task progress logs – Granular records of step-by-step execution.
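The four layers can be pictured as one small data structure. The sketch below is only an illustration under stated assumptions: the `AgentMemory` class and its method names are hypothetical and are not taken from MiMo Code's source. The only detail drawn from the article is that project memory lives in a persistent MEMORY.md file while the other layers are shorter-lived.

```python
from dataclasses import dataclass, field
from pathlib import Path
import tempfile


@dataclass
class AgentMemory:
    """Hypothetical container for the four layers named in the article."""
    project_file: Path                                      # persistent MEMORY.md
    checkpoints: list[dict] = field(default_factory=list)   # session snapshots
    scratch: list[str] = field(default_factory=list)        # temporary notes
    task_log: list[str] = field(default_factory=list)       # step-by-step records

    def remember_decision(self, text: str) -> None:
        # Project memory survives across sessions because it lives on disk.
        with self.project_file.open("a", encoding="utf-8") as fh:
            fh.write(f"- {text}\n")

    def project_decisions(self) -> list[str]:
        if not self.project_file.exists():
            return []
        lines = self.project_file.read_text(encoding="utf-8").splitlines()
        return [line[2:] for line in lines if line.startswith("- ")]

    def end_session(self) -> None:
        # Scratch notes are temporary, so they are dropped at session end.
        self.scratch.clear()


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "MEMORY.md"
        first = AgentMemory(path)
        first.remember_decision("Use PostgreSQL for storage")
        first.scratch.append("check flaky test later")
        first.end_session()
        # A new session starts with empty in-memory layers but the same file.
        second = AgentMemory(path)
        return second.project_decisions()
```

In this toy version, only the decision written to disk is visible to the second session; the scratch note is gone, which is the distinction the layer names suggest.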

Critically, the system runs a dedicated "checkpoint-writer" subagent independently of the primary coding agent. While the main agent builds features, the subagent continuously updates structured checkpoints. In construction terms, the primary agent is the contractor building the mansion and the subagent is the architect keeping the blueprints current. When the context window nears its limit, the system rebuilds the working context from these checkpoints, so the agent can continue without losing track of the task.
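A toy model makes the checkpoint-and-rebuild idea concrete. Everything here is an assumption made for illustration: the `Checkpoint` fields, the `Agent` class, and the message-count budget are hypothetical, and the "subagent" is reduced to a plain method call rather than a separate process or model.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    goal: str
    done: list[str] = field(default_factory=list)
    next_step: str = ""


class Agent:
    """Toy agent whose context is a list of messages with a size budget."""

    def __init__(self, goal: str, max_messages: int = 6) -> None:
        self.context: list[str] = [f"goal: {goal}"]
        self.max_messages = max_messages
        self.checkpoint = Checkpoint(goal=goal)

    def write_checkpoint(self, finished: str, next_step: str) -> None:
        # Stand-in for the separate checkpoint-writer subagent.
        self.checkpoint.done.append(finished)
        self.checkpoint.next_step = next_step

    def step(self, finished: str, next_step: str) -> None:
        self.context.append(f"did: {finished}")
        self.write_checkpoint(finished, next_step)
        if len(self.context) >= self.max_messages:
            self.rebuild_from_checkpoint()

    def rebuild_from_checkpoint(self) -> None:
        # Replace the long transcript with a compact structured summary.
        cp = self.checkpoint
        self.context = [
            f"goal: {cp.goal}",
            f"completed: {', '.join(cp.done)}",
            f"next: {cp.next_step}",
        ]
```

The design point is that the summary is built from structured fields maintained along the way, not by compressing the transcript after the fact.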

MiMo Code also includes two self-improvement mechanisms:

  • The `/dream` command – Runs periodic reviews (roughly every seven days) to deduplicate and compress historical sessions into long-term memory.
  • The "distill" function – Identifies repetitive workflows from past sessions and automates them, similar to approaches recently adopted by OpenAI and Anthropic.
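Both mechanisms boil down to post-processing session history. The sketch below shows one simple way such passes could work, and it is not Xiaomi's implementation: the function names, the whitespace-and-case normalization, and the fixed-length sliding window are all assumptions chosen to keep the example small.

```python
from collections import Counter


def dedupe_notes(sessions: list[list[str]]) -> list[str]:
    """Merge notes from many sessions, keeping the first copy of each."""
    seen: set[str] = set()
    merged: list[str] = []
    for notes in sessions:
        for note in notes:
            key = " ".join(note.lower().split())  # normalize case and spacing
            if key not in seen:
                seen.add(key)
                merged.append(note)
    return merged


def repeated_workflows(sessions: list[list[str]], length: int = 2,
                       min_sessions: int = 2) -> list[tuple[str, ...]]:
    """Find command sequences that recur in at least `min_sessions` sessions."""
    counts: Counter[tuple[str, ...]] = Counter()
    for commands in sessions:
        windows = {tuple(commands[i:i + length])
                   for i in range(len(commands) - length + 1)}
        counts.update(windows)  # count each sequence once per session
    return sorted(seq for seq, n in counts.items() if n >= min_sessions)
```

A sequence that shows up across several sessions, such as linting followed by testing, is the kind of candidate a distill-style step might propose to automate.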

Benchmark results show advantages in long-horizon tasks

According to Xiaomi’s internal testing, MiMo Code paired with MiMo-V2.5-Pro outperforms Claude Code paired with Claude Sonnet 4.6 on three software engineering benchmarks:

  • SWE-bench Verified: 82% (MiMo) vs. 79% (Claude)
  • SWE-bench Pro: 62% (MiMo) vs. 55% (Claude)
  • Terminal Bench 2: 73% (MiMo) vs. 69% (Claude)

The harness itself contributes measurably to these gains. Running the same MiMo-V2.5-Pro model in both harnesses, MiMo Code scored 62% on SWE-bench Pro versus 57% for Claude Code. Because the model is held constant, Xiaomi attributes this five-point advantage to the agent’s system design.
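The three SWE-bench Pro figures reported above allow a rough split of the overall gap into a model part and a harness part. The arithmetic below uses only those self-reported numbers; it treats each as a single point estimate, since the article gives no run-to-run variance, so the split is indicative rather than statistically firm.

```python
# SWE-bench Pro scores (percent) as reported by Xiaomi, keyed by (harness, model).
scores = {
    ("Claude Code", "Claude Sonnet 4.6"): 55,
    ("Claude Code", "MiMo-V2.5-Pro"): 57,
    ("MiMo Code", "MiMo-V2.5-Pro"): 62,
}

# Same harness, different model: the part of the gap tied to the model swap.
model_effect = (scores[("Claude Code", "MiMo-V2.5-Pro")]
                - scores[("Claude Code", "Claude Sonnet 4.6")])

# Same model, different harness: the part tied to the agent design.
harness_effect = (scores[("MiMo Code", "MiMo-V2.5-Pro")]
                  - scores[("Claude Code", "MiMo-V2.5-Pro")])

total_gap = (scores[("MiMo Code", "MiMo-V2.5-Pro")]
             - scores[("Claude Code", "Claude Sonnet 4.6")])

print(f"model: +{model_effect}, harness: +{harness_effect}, total: +{total_gap}")
```

On these numbers, the harness accounts for five of the seven points separating the two headline configurations, and the model swap for the remaining two.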

Notably, Xiaomi did not benchmark against OpenAI’s Codex or Google’s Gemini CLI, focusing exclusively on Claude Code as the primary competitor. This strategic choice may reflect the company’s confidence in its memory architecture’s superiority for long-horizon tasks.

Independent benchmarks provide additional context. On the Terminal-Bench 2.0 leaderboard, OpenAI’s Codex CLI running GPT-5.5 scores 82.2%, surpassing MiMo Code’s self-reported 73%. However, on SWE-bench Pro, MiMo Code’s 62% exceeds OpenAI’s reported 58.6% for GPT-5.5. These discrepancies highlight the variability in benchmark configurations and underscore the importance of real-world testing.

Perhaps the most compelling evidence comes from Xiaomi’s internal beta test, which involved 576 developers working across 474 private repositories. In a double-blind A/B evaluation, the systems split wins roughly evenly for tasks under 200 execution steps. However, for tasks exceeding 200 steps, MiMo Code’s win rate climbed to over 65%, reinforcing the tool’s strength in long-horizon development scenarios.

What’s next for AI coding agents?

Xiaomi acknowledges that traditional benchmarks like SWE-bench primarily measure one-shot problem-solving, which doesn’t fully capture MiMo Code’s multi-session design goals. The focus on long-horizon tasks and persistent memory represents a significant evolution in AI-assisted coding, moving beyond simple code generation to include workflow continuity and adaptive learning.

As AI coding agents become more integrated into professional development pipelines, tools like MiMo Code could redefine how developers approach complex projects. With its open-source foundation and free access tier, Xiaomi is positioning itself as a serious contender in the growing market for intelligent coding assistants—one that prioritizes memory, context, and scalability over brute-force context compression.

For now, the question isn’t just whether AI can write code, but whether it can remember why it wrote it—and that may be the difference between a tool and a truly intelligent partner.

AI summary

Xiaomi’s new MiMo Code is an open-source AI coding assistant that runs in the terminal. Details on the features and performance of the tool, which outperforms its rivals on long tasks.
