Acquiring new knowledge after training remains a significant hurdle for enterprise AI: current solutions are too expensive, too slow, or constrained by context window limits. Researchers have introduced MeMo, a framework that encodes new knowledge into a dedicated, smaller memory model that operates separately from the main large language model. This modular architecture works with both open- and closed-source models, sidestepping the complexity of RAG pipelines and full model retraining.
The Challenge of Updating LLM Memory
Large language models are frozen after training: their internal knowledge stays static until the next computationally expensive update. Developers rely on three main approaches to integrate external knowledge into an LLM, each with distinct drawbacks: non-parametric methods, parametric methods, and latent memory methods. Non-parametric methods, such as retrieval-augmented generation (RAG), retrieve relevant documents from an external database and insert them directly into the model's prompt. However, these methods are limited by the size of the context window and can add substantial computational overhead.
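The context window constraint on retrieval-based methods can be illustrated with a minimal sketch. It assumes the retrieved documents arrive already ranked by relevance and approximates token counts by splitting on whitespace. The function name build_prompt and the max_tokens parameter are hypothetical and do not come from any particular RAG library.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    """Crude token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def build_prompt(question: str, documents: List[str], max_tokens: int) -> Tuple[str, List[str]]:
    """Greedily pack ranked documents into the prompt until the budget runs out.

    Returns the assembled prompt and the documents that did not fit.
    """
    budget = max_tokens - count_tokens(question)
    included, dropped = [], []
    for doc in documents:  # assumed to be sorted from most to least relevant
        cost = count_tokens(doc)
        if cost <= budget:
            included.append(doc)
            budget -= cost
        else:
            dropped.append(doc)  # this knowledge never reaches the model
    prompt = "\n\n".join(included + [question])
    return prompt, dropped
```

Any document that lands in the dropped list is invisible to the model for that query, which is the limitation described above. The included documents must also be reprocessed on every request, which is where the overhead comes from.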
How MeMo Works
The MeMo framework introduces a modular architecture with two separate components: the MEMORY model and the EXECUTIVE model. The MEMORY model is a small language model trained specifically to encode new knowledge into its parameters, while the EXECUTIVE model is a frozen, off-the-shelf LLM that serves as the reasoning engine. When a user asks a question, the EXECUTIVE model treats the MEMORY model as an external oracle, issuing targeted sub-queries to gather facts and then synthesizing those facts into a final answer. MeMo's core design principle is the concept of 'reflections': targeted question-answer pairs designed to capture every possible angle of a knowledge corpus.
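The division of labor between the two models can be sketched as follows. This is an illustration of the query flow described above, not MeMo's actual implementation: both models are represented by plain Python callables, and every name (answer_with_memory, plan_subqueries, the toy_* stand-ins, and the made-up facts) is hypothetical.

```python
from typing import Callable, List, Tuple

Fact = Tuple[str, str]  # (sub-query, answer returned by the memory model)


def answer_with_memory(
    question: str,
    plan_subqueries: Callable[[str], List[str]],   # EXECUTIVE: break the question down
    memory: Callable[[str], str],                  # MEMORY: answer one sub-query
    synthesize: Callable[[str, List[Fact]], str],  # EXECUTIVE: compose the final answer
) -> str:
    """Route a question through the executive/memory split."""
    subqueries = plan_subqueries(question)
    facts = [(q, memory(q)) for q in subqueries]
    return synthesize(question, facts)


# Toy stand-ins (assumptions): a dictionary plays the MEMORY model, and simple
# functions play the two roles of the frozen EXECUTIVE model.
TOY_MEMORY = {
    "Who leads Project Atlas?": "Dana Kim",
    "Which office is Dana Kim based in?": "Lisbon",
}


def toy_plan(question: str) -> List[str]:
    return list(TOY_MEMORY)


def toy_memory(subquery: str) -> str:
    return TOY_MEMORY.get(subquery, "unknown")


def toy_synthesize(question: str, facts: List[Fact]) -> str:
    return "; ".join(f"{q} {a}" for q, a in facts)
```

Because the EXECUTIVE model only exchanges text with the MEMORY model, it can remain frozen and could sit behind a closed API, which is consistent with the claim that the approach works with both open- and closed-source models.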
Handling Continual Knowledge Updates
An AI's memory needs continuous updates as company policies change and new reports are published. MeMo relies on a technique called 'model merging' to handle these updates efficiently. Instead of a massive joint retraining phase, MeMo trains a new, independent MEMORY model exclusively on the newly added documents and, as the name of the technique suggests, merges it with the existing MEMORY model. This keeps the cost of each update proportional to the new material rather than to the full corpus, making it an attractive option for enterprise AI applications.
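The article does not say which merging method MeMo uses. As one simple illustration of the general idea, the sketch below linearly interpolates the parameters of two models that share the same architecture. The function merge_memory_models, the new_weight parameter, and the dictionary-of-lists parameter format are assumptions chosen to keep the example dependency-free.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # parameter name -> flattened weight values


def merge_memory_models(old: Params, new: Params, new_weight: float = 0.5) -> Params:
    """Merge two same-shaped models by linear interpolation of their weights.

    new_weight controls how much the newly trained model contributes.
    This is one basic merging scheme, not necessarily the one MeMo uses.
    """
    if old.keys() != new.keys():
        raise ValueError("models must have identical parameter names")
    merged: Params = {}
    for name in old:
        if len(old[name]) != len(new[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - new_weight) * o + new_weight * n
            for o, n in zip(old[name], new[name])
        ]
    return merged
```

In this scheme only the new MEMORY model is trained on the new documents. The merge itself is a single pass over the weights with no gradient computation, which is why merging is much cheaper than joint retraining.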
By combining a modular architecture with support for continual knowledge updates, MeMo targets two persistent weaknesses of large language models: static knowledge and costly retraining. If the approach holds up in practice, it could change how organizations keep LLM-based applications current.
AI summary
MeMo offers a new way to update the memory of large language models and improves performance by 26%. MeMo can work with both open-source and closed-source models.
