Liquid AI has unveiled LFM2.5-230M, its latest language model, built for efficient on-device workflows. With just 230 million parameters, the model delivers competitive performance in data extraction while keeping a footprint small enough to run on smartphones, laptops, and robots without relying on cloud resources.
The release marks a strategic pivot toward architectural innovation over sheer scale. Unlike traditional transformer models that balloon in size to handle complex tasks, LFM2.5-230M leverages the proprietary LFM2 framework to achieve high inference speeds and low memory usage. According to benchmarks shared by the company, it outperforms models roughly three to four times its size in data extraction tasks, including Alibaba’s Qwen3.5-0.8B and Google’s Gemma 3 1B.
Liquid AI positions the model as a cost-effective solution for developers building lightweight data extraction pipelines and autonomous edge systems. The company offers a dual-use commercial license, making the model free for individuals and small businesses generating under $10 million in annual revenue, with enterprise agreements required for larger corporations.
How LFM2.5-230M achieves efficiency
The LFM2.5-230M model departs from conventional transformer designs by integrating a hybrid architecture that combines gated short-range convolutions with grouped-query attention. This approach enables efficient processing of long contexts and sequential data on edge hardware, avoiding the quadratic memory costs typical of pure attention mechanisms.
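To make the idea of a gated short-range convolution concrete, here is a minimal pure-Python sketch. It is an illustration of the general technique, not Liquid AI's implementation: the function names, the kernel weights, and the gate values are all assumptions chosen for readability. The point it shows is that each output position looks back over only a few previous inputs, so the work per token depends on the kernel length rather than on the full sequence length.

```python
from typing import List


def causal_short_conv(xs: List[float], kernel: List[float]) -> List[float]:
    """Causal 1-D convolution: output t sees only inputs t, t-1, ..., t-k+1."""
    out = []
    for t in range(len(xs)):
        acc = 0.0
        for j, w in enumerate(kernel):
            if t - j >= 0:  # never look into the future or before the start
                acc += w * xs[t - j]
        out.append(acc)
    return out


def gated_short_conv(
    xs: List[float],
    in_gate: List[float],
    out_gate: List[float],
    kernel: List[float],
) -> List[float]:
    """Gate the input, mix locally with a short kernel, then gate the output.

    In a real model the gates would be computed from the input by learned
    projections; here they are passed in directly (an assumption made to
    keep the sketch small).
    """
    gated_in = [g * x for g, x in zip(in_gate, xs)]
    mixed = causal_short_conv(gated_in, kernel)
    return [g * m for g, m in zip(out_gate, mixed)]


# Hypothetical values: a length-2 averaging kernel and fully open gates.
signal = [1.0, 2.0, 3.0, 4.0]
ones = [1.0] * len(signal)
print(gated_short_conv(signal, ones, ones, kernel=[0.5, 0.5]))
# -> [0.5, 1.5, 2.5, 3.5]
```

Closing an input gate (setting it to 0.0) removes that position from the local mix entirely, which is how the gating lets the layer decide what to carry forward.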
Key technical highlights include:
- A 32K context window for ingesting large documents or continuous data streams
- A memory footprint under 400MB
- Prefill and decode speeds that surpass comparable models like Gemma 3 1B IT and Granite 4.0-H-350M
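A footprint figure like the one above can be sanity-checked with simple arithmetic on weight storage. The sketch below is a rough estimate only: the bit widths are assumptions, since the article does not say which numeric precision the figure refers to, and the calculation covers weights alone, not activations, caches, or runtime overhead.

```python
def weight_megabytes(num_params: int, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in megabytes (10^6 bytes)."""
    return num_params * bits_per_weight / 8 / 1_000_000


PARAMS = 230_000_000  # parameter count stated in the article

# Bit widths below are illustrative assumptions, not published settings.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_megabytes(PARAMS, bits):.0f} MB")
# -> 16-bit weights: 460 MB
# -> 8-bit weights: 230 MB
# -> 4-bit weights: 115 MB
```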
Performance benchmarks reveal its adaptability across devices. On a Samsung Galaxy S25 Ultra with a Qualcomm Snapdragon Gen4 CPU, the model achieves a decode speed of 213 tokens per second. Even on a constrained Raspberry Pi 5, it maintains a decode rate of 42 tokens per second. Liquid AI’s internal testing also shows the GPU inference stack delivers lower end-to-end latency than competing small models at all concurrency levels.
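To put the decode rates in practical terms, the short calculation below converts tokens per second into the time needed to generate a response. The 500-token output length is a hypothetical example, and the estimate covers decoding only; it ignores prefill time for the input document.

```python
def decode_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to generate output_tokens at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


OUTPUT_TOKENS = 500  # hypothetical size of one extracted JSON record

# Decode rates are the figures quoted in the article.
for device, rate in [("Galaxy S25 Ultra", 213.0), ("Raspberry Pi 5", 42.0)]:
    print(f"{device}: {decode_seconds(OUTPUT_TOKENS, rate):.1f} s")
# -> Galaxy S25 Ultra: 2.3 s
# -> Raspberry Pi 5: 11.9 s
```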
Solving enterprise data challenges with lightweight AI
Traditional data processing pipelines often depend on rigid, rule-based Extract, Transform, Load (ETL) scripts that break when document formats or schemas change. These systems require constant maintenance and fail to adapt to evolving data structures, leading to inefficiencies and errors.
The industry is increasingly adopting "AI ETL," where machine learning models dynamically infer mappings, detect schema drift, and restructure data without manual coding. For example, an AI model can process unstructured sources like PDFs, emails, or web forms and convert them into structured formats like JSON—automating tasks that once required extensive human intervention.
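As a rough illustration of the pattern described above, the sketch below wraps a text-to-JSON model call with a check that the output still matches the expected fields, so schema drift surfaces as an error instead of silently corrupting downstream data. Everything named here is hypothetical: `INVOICE_FIELDS`, `extract_record`, and `fake_model` are invented for the example, and `fake_model` is a stand-in for whatever local model call a real pipeline would use.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical target schema: field name -> expected Python type.
INVOICE_FIELDS: Dict[str, type] = {
    "vendor": str,
    "invoice_number": str,
    "total": float,
}


def extract_record(
    text: str,
    model: Callable[[str], str],
    fields: Dict[str, type],
) -> Dict[str, Any]:
    """Run the model on raw text and validate its JSON output against fields."""
    raw = model(text)
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model output is not a JSON object")

    record: Dict[str, Any] = {}
    for name, expected in fields.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # Accept whole numbers where a float is expected (JSON has one number type).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {name} has unexpected type")
        record[name] = value
    return record


def fake_model(text: str) -> str:
    """Stand-in for a local model call; returns a fixed JSON string."""
    return '{"vendor": "Acme", "invoice_number": "INV-7", "total": 120}'


print(extract_record("Invoice INV-7 from Acme, total 120", fake_model, INVOICE_FIELDS))
# -> {'vendor': 'Acme', 'invoice_number': 'INV-7', 'total': 120.0}
```

Keeping the validation outside the model means the same check works regardless of which model produces the JSON.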
However, deploying large flagship models for such routine tasks is prohibitively expensive. A model like Anthropic’s Claude Opus 4.6, which costs $5.00 per million input tokens, is impractical for parsing invoices, formatting addresses, or routing telemetry data. This is the gap LFM2.5-230M is meant to fill. Its lightweight design allows companies to automate repetitive data workflows locally, reducing both compute costs and latency while eliminating the need for continuous cloud API calls.
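The cost argument can be made concrete with a back-of-the-envelope calculation. The per-token price is the one quoted above; the document volume and tokens per document are hypothetical, and the estimate counts input tokens only, so output-token charges would add to it. Running a model locally is not free either, but its cost does not scale per token in the same way.

```python
def input_cost_usd(input_tokens: int, price_per_million_usd: float) -> float:
    """Input-token cost for an API priced per million input tokens."""
    return input_tokens / 1_000_000 * price_per_million_usd


PRICE_PER_MILLION = 5.00      # figure quoted in the article
DOCUMENTS = 1_000_000         # hypothetical: invoices processed
TOKENS_PER_DOCUMENT = 2_000   # hypothetical: average input length

total_tokens = DOCUMENTS * TOKENS_PER_DOCUMENT
print(f"${input_cost_usd(total_tokens, PRICE_PER_MILLION):,.2f}")
# -> $10,000.00
```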
A closer look at small AI models in 2025
The AI landscape in 2025 is witnessing a surge in "small" models, though definitions of "small" vary widely. For instance, Weibo’s VibeThinker-3B, a 3-billion-parameter model built on a Qwen2 backbone, recently scored 94.3 on the AIME 2026 math benchmark, rivaling models with 600 billion parameters through advanced data curation and reinforcement learning. Similarly, Google’s Gemma 4 family, with over 200 million downloads, pushes efficient AI to edge devices, including the E2B model (2 billion parameters) optimized for mobile and IoT deployments.
LFM2.5-230M operates in a distinct tier, with just 230 million parameters—roughly one-tenth the size of Google’s smallest Gemma 4 model or VibeThinker-3B. While it may not compete in reasoning-heavy tasks like advanced math or coding, it excels in its targeted domains of data extraction and tool integration, delivering performance that belies its compact size.
What’s next for edge AI
Liquid AI’s latest release underscores a growing trend: the shift from cloud-dependent AI to locally deployable models that prioritize efficiency, cost savings, and real-time processing. As edge devices become more powerful, the demand for models that balance performance with hardware constraints will only intensify.
For developers and enterprises, the implications are clear. Models like LFM2.5-230M offer a practical path forward, enabling automation of critical workflows without the overhead of massive infrastructure. As the industry continues to refine architectures for edge deployment, the next step may be extending these small models to increasingly complex tasks without compromising speed or accessibility.
AI summary
Liquid AI’s LFM2.5-230M model outperforms far larger models in data extraction. Able to run on local devices, it removes cloud dependency and opens new opportunities for businesses.

