Stateful AI Agents: Build Self-Learning Systems Beyond Prompts

AI systems today operate like forgetful assistants, erasing every conversation the moment it ends. This stateless design forces developers to resend entire histories with each query, which wastes tokens and eventually hits context-window limits. The solution lies in stateful agents: systems that persist knowledge, refine skills, and evolve alongside users.

Below, we explore the architecture behind Hermes Agent, a stateful AI framework that mirrors how professionals retain expertise. From core identity files to dynamic skill libraries, we’ll break down the engineering patterns that enable true autonomy—and provide a Python implementation to build your own self-improving agent.

The Stateless Trap: Why Most AI Tools Fail to Adapt

Stateless AI systems function like vending machines—they process inputs in isolation, with no recollection of past interactions or user preferences. Every request triggers a fresh context load, forcing developers to reconstruct conversation histories repeatedly.

Consider this Python function, a textbook example of statelessness:

import datetime

def parse_date(date_string: str) -> datetime.datetime:
    return datetime.datetime.strptime(date_string, "%Y-%m-%d")

This routine has no memory of previous inputs or user habits. It doesn’t adapt to regional date formats or optimize based on repeated use. When developers attempt to simulate continuity by stitching chat histories together, three critical flaws emerge:

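Context overload: Because each request resends the full history, per-request token usage grows linearly with conversation length and cumulative usage grows roughly quadratically, straining budgets and model performance.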
Memory erosion: Once the context window fills, older interactions have to be truncated or summarized, even when they contain crucial context.
Knowledge fragmentation: Lessons learned in one session vanish entirely in the next, forcing repeated discovery of solutions.
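
To make the first flaw concrete, the sketch below counts the tokens sent when every request resends the whole conversation. The fixed size of 200 tokens per turn is an illustrative assumption, not a measurement, and the function names are hypothetical.

```python
def tokens_sent_per_request(turns: int, tokens_per_turn: int = 200) -> list[int]:
    """Tokens sent on each request when the full history is resent every time."""
    # Request k carries all k turns so far (assumes fixed-size turns).
    return [k * tokens_per_turn for k in range(1, turns + 1)]


def cumulative_tokens(turns: int, tokens_per_turn: int = 200) -> int:
    """Total tokens sent over the whole conversation."""
    return sum(tokens_sent_per_request(turns, tokens_per_turn))


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, tokens_sent_per_request(n)[-1], cumulative_tokens(n))
```

Under these assumptions, going from 10 to 20 turns doubles the size of the last request (2,000 to 4,000 tokens) but nearly quadruples the total sent (11,000 to 42,000).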

Stateful agents eliminate these issues by treating memory as a persistent, evolving system—akin to a skilled artisan carrying forward lessons from past projects.

Designing Stateful Intelligence: Memory, Identity, and Skills

The Hermes Agent framework structures statefulness into three interconnected layers that mirror human cognitive organization:

1. Soul: The Agent’s Core Identity

The SOUL.md file acts as the agent’s foundational constitution, defining its communication style, ethical boundaries, and operational principles. Rather than dynamic data, this document serves as immutable guidance injected into every system prompt.

For example, a developer-focused agent might include:

Preferred coding standards (e.g., PEP 8 compliance)
Response tone (concise technical explanations)
Security protocols (no direct file modifications)

A helper class reads this markdown file and embeds it into the LLM’s system prompt, ensuring consistent behavior regardless of task complexity.
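
A minimal sketch of such a helper might look like the following. The class name SoulLoader and its methods are hypothetical, chosen for illustration; only the SOUL.md file and its role in the system prompt come from the description above.

```python
from pathlib import Path


class SoulLoader:
    """Hypothetical helper: reads SOUL.md and places it first in the system prompt."""

    def __init__(self, soul_path: str = "~/.hermes/SOUL.md"):
        # expanduser() resolves "~" to the home directory.
        self.soul_path = Path(soul_path).expanduser()

    def load(self) -> str:
        # Fall back to an empty identity if the file does not exist yet.
        if not self.soul_path.exists():
            return ""
        return self.soul_path.read_text(encoding="utf-8").strip()

    def build_system_prompt(self, task_context: str = "") -> str:
        # Identity first, then any per-task context, separated by a blank line.
        parts = [self.load(), task_context.strip()]
        return "\n\n".join(p for p in parts if p)
```

Reading the file on every call, rather than caching it, means edits to SOUL.md take effect on the next request without restarting the agent.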

2. Memory: Episodic and Semantic Knowledge

Memory in stateful agents splits into two curated stores:

USER.md: Static user attributes like programming languages, OS preferences, or working hours. This file grows only with deliberate updates.
MEMORY.md: Dynamic episodic knowledge such as database ports, recurring bugs, or user-specific workflows. This store evolves organically during tasks.

A MemoryStore class manages these files, enabling the agent to:

Retrieve context for new queries (e.g., "Remember the user prefers pytest over unittest")
Append new insights (e.g., "Fixed the Docker network conflict by restarting the container")
Prune irrelevant data automatically

Semantic search tools enhance retrieval by mapping natural language queries to stored memories, avoiding the need to sift through raw transcripts.
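
The three operations listed above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the one-bullet-per-insight file layout, the "keep the newest N entries" pruning policy, and keyword overlap as a stand-in for real semantic search.

```python
import re
from pathlib import Path


class MemoryStore:
    """Sketch of a store that keeps one insight per bullet line in MEMORY.md."""

    def __init__(self, path: str, max_entries: int = 200):
        self.path = Path(path).expanduser()
        self.max_entries = max_entries  # assumed pruning policy: keep the newest N

    def entries(self) -> list[str]:
        if not self.path.exists():
            return []
        lines = self.path.read_text(encoding="utf-8").splitlines()
        return [ln[2:].strip() for ln in lines if ln.startswith("- ")]

    def append(self, insight: str) -> None:
        # Collapse whitespace so each insight stays on a single line.
        insight = " ".join(insight.split())
        items = self.entries()
        if insight and insight not in items:
            items.append(insight)
        items = items[-self.max_entries:]  # prune the oldest entries beyond the cap
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text("".join(f"- {i}\n" for i in items), encoding="utf-8")

    def retrieve(self, query: str, top_k: int = 3) -> list[str]:
        # Keyword overlap stands in for semantic search in this sketch.
        q = set(re.findall(r"\w+", query.lower()))
        scored = [(len(q & set(re.findall(r"\w+", e.lower()))), e) for e in self.entries()]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [e for _, e in ranked[:top_k]]
```

Pruning by age is the simplest policy; a production store would more likely prune by relevance or by how often an entry is retrieved.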

3. Skills: Reusable Procedural Knowledge

Skills represent the agent’s "how-to" knowledge—packaged toolkits for recurring tasks. In Hermes, each skill is a self-contained directory within ~/.hermes/skills/, containing:

Configuration files defining trigger phrases (e.g., "deploy to staging")
Python modules with reusable functions (e.g., database_backup())
Documentation explaining usage and dependencies

Skills operate like macros, allowing the agent to:

Chain operations (e.g., "Backup database → Run migrations → Deploy app")
Share expertise across user sessions
Update toolkits independently of the agent’s core code
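
As a rough illustration of trigger phrases and chaining, the sketch below matches a message against trigger phrases and runs a sequence of steps, stopping at the first failure. The function names, the shape of the triggers mapping, and the stop-on-failure rule are assumptions for this example, not Hermes's documented behavior.

```python
from typing import Callable, Optional

Step = Callable[[dict], dict]


def match_skill(message: str, triggers: dict[str, list[str]]) -> Optional[str]:
    """Return the first skill whose trigger phrase appears in the message."""
    text = message.lower()
    for skill_name, phrases in triggers.items():
        if any(phrase.lower() in text for phrase in phrases):
            return skill_name
    return None


def run_chain(steps: list[tuple[str, Step]], params: dict) -> list[dict]:
    """Run steps in order and stop at the first one that does not succeed."""
    results = []
    for name, step in steps:
        outcome = step(params)
        results.append({"step": name, **outcome})
        if outcome.get("status") != "success":
            break  # do not deploy if the backup or migration failed
    return results
```

Stopping at the first failed step matters for chains like backup, migrate, deploy, where a later step is unsafe if an earlier one did not complete.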

For instance, a docker_debug skill might include:

# ~/.hermes/skills/docker_debug/actions.py
import subprocess

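def inspect_container(container_name: str) -> str:
    """Return the last 50 log lines of a container (stdout and stderr)."""
    # Pass arguments as a list: no shell is involved, so the name cannot inject commands.
    cmd = ["docker", "logs", "--tail", "50", container_name]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # docker logs replays the container's stderr stream on stderr, so include both.
    return result.stdout + result.stderr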

Building Your Own Stateful Agent: A Step-by-Step Guide

Implementing statefulness requires three core components:

1. Initialize the Memory System

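Start by creating a MemoryManager class (it plays the MemoryStore role described earlier) to handle file operations and semantic indexing. The version below only loads the embedding model and creates the storage directories: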

from pathlib import Path
import json
from sentence_transformers import SentenceTransformer

class MemoryManager:
    def __init__(self, memory_dir: str = "~/.hermes/memory"):
        self.memory_dir = Path(memory_dir).expanduser()
        self.model = SentenceTransformer('all-MiniLM-L6-v2')
        self._ensure_directories()
    
    def _ensure_directories(self):
        self.memory_dir.mkdir(parents=True, exist_ok=True)
        (self.memory_dir / "episodic").mkdir(exist_ok=True)
        (self.memory_dir / "semantic").mkdir(exist_ok=True)
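
The class above stops at directory setup, while the agent code later in this guide calls a search_episodic() method. One possible shape for the ranking behind such a method is sketched below as a standalone function. To stay dependency-free it uses a toy bag-of-words embedding; that is an assumption of this sketch, and a real sentence-embedding model would replace bag_of_words (with an array-based cosine function to match).

```python
import math
import re
from collections import Counter
from typing import Callable

Vector = dict[str, float]


def bag_of_words(text: str) -> Vector:
    """Toy embedding: word counts. A stand-in for a real sentence embedding."""
    return dict(Counter(re.findall(r"\w+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search_episodic(query: str, memories: list[str], top_k: int = 3,
                    embed: Callable[[str], Vector] = bag_of_words) -> list[str]:
    """Rank stored memories by cosine similarity to the query."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(m)), m) for m in memories), key=lambda s: -s[0])
    # Drop memories with no similarity at all rather than padding the result.
    return [m for score, m in scored[:top_k] if score > 0]
```

Keeping the embedding behind an embed parameter lets the ranking logic be tested without downloading a model.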

2. Design the Skill Architecture

Structure skills as modular packages with a standardized interface:

~/.hermes/skills/
├── docker_debug/
│   ├── __init__.py
│   ├── actions.py
│   └── config.json
└── git_workflow/
    ├── __init__.py
    ├── actions.py
    └── README.md

Each skill's __init__.py should expose a run() function that accepts a dict of JSON-serializable parameters and returns a dict:

# docker_debug/__init__.py
from .actions import inspect_container

def run(params: dict) -> dict:
    container = params.get("container_name", "app")
    logs = inspect_container(container)
    return {"status": "success", "logs": logs}
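
The agent class later in this guide instantiates a SkillRegistry without defining it. The sketch below is one way such a registry could discover skill packages and dispatch to their run() functions. The class body, the module-name prefix, and the error format are assumptions for illustration. Note that loading a skill executes its code, so the skills directory should only contain code you trust.

```python
import importlib.util
import sys
from pathlib import Path


class SkillRegistry:
    """Sketch: discovers skill packages in a directory and dispatches to run()."""

    def __init__(self, skills_dir: str = "~/.hermes/skills"):
        self.skills_dir = Path(skills_dir).expanduser()
        self._skills = {}

    def load(self) -> list[str]:
        """Import every <skills_dir>/<name>/__init__.py that defines run()."""
        if not self.skills_dir.is_dir():
            return []
        for init_file in sorted(self.skills_dir.glob("*/__init__.py")):
            name = init_file.parent.name
            module_name = f"hermes_skills_{name}"  # hypothetical namespace prefix
            spec = importlib.util.spec_from_file_location(
                module_name, init_file,
                submodule_search_locations=[str(init_file.parent)],
            )
            module = importlib.util.module_from_spec(spec)
            # Register before executing so relative imports (from .actions ...) resolve.
            sys.modules[module_name] = module
            spec.loader.exec_module(module)
            if callable(getattr(module, "run", None)):
                self._skills[name] = module.run
        return sorted(self._skills)

    def run(self, name: str, params: dict) -> dict:
        if name not in self._skills:
            return {"status": "error", "error": f"unknown skill: {name}"}
        return self._skills[name](params)
```

Because skills are loaded from disk at runtime, a toolkit can be updated or added without touching the agent's core code.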

3. Integrate with an LLM

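Use the agent's SOUL.md to guide the LLM's responses while delegating stateful operations to your memory and skill systems. The class below relies on a SkillRegistry class and on two MemoryManager methods, load_user_preferences() and search_episodic(), which you need to supply; the MemoryManager shown earlier only creates directories.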

import json
from pathlib import Path

from openai import OpenAI

class HermesAgent:
    def __init__(self, soul_path: str = "~/.hermes/SOUL.md"):
        # expanduser() is required: Path does not resolve "~" on its own
        self.soul = Path(soul_path).expanduser().read_text(encoding="utf-8")
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment
        self.memory = MemoryManager()
        self.skills = SkillRegistry()

    def process_query(self, query: str) -> str:
        # Retrieve relevant context
        user_prefs = self.memory.load_user_preferences()
        episodic_memories = self.memory.search_episodic(query)

        # Inject context into system prompt (built line by line to avoid stray indentation)
        system_prompt = (
            f"{self.soul}\n"
            f"User preferences: {json.dumps(user_prefs)}\n"
            f"Relevant memories: {episodic_memories}\n"
        )
        
        # Generate response
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": query}
            ]
        )
        return response.choices[0].message.content
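
Note that process_query() reads from memory but never writes to it, so nothing is learned between sessions. A self-improving loop needs a write-back step. The sketch below shows one possible convention: the model is asked to mark durable facts with a "REMEMBER:" prefix, and those lines are stored instead of shown to the user. The prefix, the function name, and the injected generate callable (which stands in for the LLM call so the example runs offline) are all assumptions.

```python
from typing import Callable


def answer_and_learn(query: str, memories: list[str],
                     generate: Callable[[str, str], str]) -> tuple[str, list[str]]:
    """Answer a query, then store any lines the model marks with 'REMEMBER:'."""
    system_prompt = (
        "Relevant memories:\n"
        + "\n".join(f"- {m}" for m in memories)
        + "\nIf you learn a durable fact, add a line starting with 'REMEMBER:'."
    )
    raw = generate(system_prompt, query)  # generate(system_prompt, user_message) -> text

    reply_lines, updated = [], list(memories)
    for line in raw.splitlines():
        if line.startswith("REMEMBER:"):
            fact = line[len("REMEMBER:"):].strip()
            if fact and fact not in updated:
                updated.append(fact)  # write-back step: persist the new fact
        else:
            reply_lines.append(line)
    return "\n".join(reply_lines).strip(), updated
```

In a full agent, the returned list would be persisted to MEMORY.md so the next session starts with what this one learned.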

The Future of Stateful AI: Beyond Static Prompts

Stateful agents represent a paradigm shift from prompt engineering to persistent intelligence. By decoupling memory, identity, and skills into modular systems, developers can create AI assistants that:

Reduce token waste by retrieving only relevant context instead of resending full histories
Maintain continuity across sessions and projects
Evolve without requiring full retraining
Adapt their behavior as stored preferences, memories, and skills change

The Hermes Agent framework demonstrates that statefulness isn’t just an architectural choice—it’s the foundation for AI systems that can truly collaborate. As memory systems grow more sophisticated and skills become shareable across users, we’re moving closer to agents that don’t just answer questions but grow alongside their human counterparts.
