How BoxAgnts unifies LLM APIs with provider abstraction and streaming loops

BoxAgnts addresses a common pain point in AI development: the lack of standardization across large language model (LLM) APIs. Different vendors use vastly different request formats, response structures, and streaming protocols, forcing developers to write vendor-specific code. The project’s solution is a layered abstraction system that decouples the agent query loop from underlying API differences, enabling seamless multi-provider integration.

The challenge of fragmented LLM APIs

AI model providers vary significantly in how they structure requests and responses. Anthropic’s Claude API separates the system prompt into a dedicated field, while OpenAI embeds it as a message with the "system" role. Google Gemini also places the system instruction at the top level of the request body, but in a different shape again. Error handling and streaming protocols differ as well, so wiring each vendor directly into the agent loop quickly becomes impractical for large-scale applications.

BoxAgnts solves this by introducing three abstraction layers that shield the agent loop from vendor-specific details.

Layer 1: Unified data models for requests and responses

The foundation is a standardized data model that all providers adhere to. The ProviderRequest struct encapsulates essential parameters like message history, system prompt, tool definitions, token limits, and temperature settings. Similarly, ProviderResponse consolidates content blocks, token usage metrics, and termination reasons.

// provider_types.rs
pub struct ProviderRequest {
    // Target model name; the transformers read this as req.model.
    pub model: String,
    pub messages: Vec<ApiMessage>,
    pub system: Option<String>,
    pub tools: Vec<ApiToolDefinition>,
    pub max_tokens: u32,
    pub temperature: Option<f32>,
}

pub struct ProviderResponse {
    pub content: Vec<ContentBlock>,
    pub usage: UsageInfo,
    pub stop_reason: String,
}

This unified interface allows the agent loop to work with a single abstraction regardless of the underlying provider, eliminating the need for branching logic based on provider IDs.
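As a rough illustration of what this buys the agent loop, the sketch below uses pared-down copies of the two structs. The field types are simplified to plain strings, the helpers `single_turn_request` and `wants_tool_call` are hypothetical, and the "tool_use" stop reason value is an assumption borrowed from Anthropic's naming rather than something confirmed by the BoxAgnts source.

```rust
// Simplified stand-ins for the types in provider_types.rs.
// Field types are assumptions chosen to keep the sketch self-contained.
#[derive(Debug, Clone, PartialEq)]
pub struct ApiMessage {
    pub role: String,
    pub content: String,
}

#[derive(Debug, Clone)]
pub struct ProviderRequest {
    pub messages: Vec<ApiMessage>,
    pub system: Option<String>,
    pub max_tokens: u32,
    pub temperature: Option<f32>,
}

#[derive(Debug, Clone)]
pub struct ProviderResponse {
    pub content: Vec<String>,
    pub stop_reason: String,
}

/// Hypothetical helper: build a request from a single user turn.
pub fn single_turn_request(system: &str, user_text: &str) -> ProviderRequest {
    ProviderRequest {
        messages: vec![ApiMessage {
            role: "user".to_string(),
            content: user_text.to_string(),
        }],
        system: Some(system.to_string()),
        max_tokens: 1024,
        temperature: None,
    }
}

/// Hypothetical helper: the loop decides its next step from the
/// normalized stop reason alone, with no provider-specific branches.
/// The "tool_use" value is an assumption.
pub fn wants_tool_call(resp: &ProviderResponse) -> bool {
    resp.stop_reason == "tool_use"
}
```

Neither helper mentions a provider ID, which is the point: anything vendor-specific has to be resolved before a value of these types reaches the loop.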

Layer 2: Trait-based provider interfaces

The LlmProvider trait defines a common interface for all providers, requiring implementations to handle authentication, request construction, and streaming. The trait includes methods for streaming message generation and model listing, returning a unified Stream type that abstracts over vendor-specific implementations.

pub trait LlmProvider: Send + Sync {
    fn id(&self) -> &ProviderId;
    async fn create_message_stream(
        &self,
        request: ProviderRequest,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<StreamEvent, ProviderError>> + Send>>>;
    async fn list_models(&self) -> Result<Vec<ModelInfo>>;
}

Each provider implementation internally manages its own HTTP requests, authentication tokens, and Server-Sent Events (SSE) parsing, exposing a consistent interface to the rest of the system.
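To make the trait-object idea concrete, here is a minimal sketch under heavy simplification. The trait below is synchronous and returns an iterator instead of an async Stream, so that it needs no external crates; `EchoProvider`, the reduced `ProviderRequest`, and the registry's `register` and `get` methods are all hypothetical and only illustrate how providers could be stored and looked up behind one interface.

```rust
use std::collections::HashMap;

// Reduced event and request types; the real ones carry more variants
// and fields.
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta { text: String },
    MessageStop,
}

pub struct ProviderRequest {
    pub prompt: String,
}

// Synchronous stand-in for the LlmProvider trait (assumption: the real
// trait is async and yields a Stream of Result values).
pub trait LlmProvider: Send + Sync {
    fn id(&self) -> &str;
    fn create_message_stream(
        &self,
        request: ProviderRequest,
    ) -> Box<dyn Iterator<Item = StreamEvent> + Send>;
}

/// Hypothetical provider that echoes the prompt word by word, standing
/// in for a real HTTP + SSE implementation.
pub struct EchoProvider;

impl LlmProvider for EchoProvider {
    fn id(&self) -> &str {
        "echo"
    }

    fn create_message_stream(
        &self,
        request: ProviderRequest,
    ) -> Box<dyn Iterator<Item = StreamEvent> + Send> {
        let mut events: Vec<StreamEvent> = request
            .prompt
            .split_whitespace()
            .map(|word| StreamEvent::TextDelta { text: word.to_string() })
            .collect();
        events.push(StreamEvent::MessageStop);
        Box::new(events.into_iter())
    }
}

/// Hypothetical registry: providers are stored as trait objects and
/// looked up by id at runtime.
#[derive(Default)]
pub struct ProviderRegistry {
    providers: HashMap<String, Box<dyn LlmProvider>>,
}

impl ProviderRegistry {
    pub fn register(&mut self, provider: Box<dyn LlmProvider>) {
        let id = provider.id().to_string();
        self.providers.insert(id, provider);
    }

    pub fn get(&self, id: &str) -> Option<&dyn LlmProvider> {
        self.providers.get(id).map(|p| p.as_ref())
    }
}
```

A caller would register each provider once at startup and then resolve one by the id found in configuration; the code consuming the returned events never learns which concrete type produced them.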

Layer 3: Transformers for message format conversion

Transformers handle the final conversion from the unified ProviderRequest to vendor-specific formats. These are pure functions that take a standardized request and produce API-specific payloads. For example, the Anthropic transformer maps the unified system prompt field to the vendor’s top-level system parameter, while the OpenAI transformer embeds it as a dedicated message.

Adding support for a new provider only requires writing a new transformer and a corresponding LlmProvider implementation. The ProviderRegistry keeps track of the available providers, so one can be selected at runtime based on configuration.

// transformers/anthropic.rs
pub fn to_anthropic_request(req: &ProviderRequest) -> AnthropicMessagesRequest {
    AnthropicMessagesRequest {
        model: req.model.clone(),
        system: req.system.clone(),
        messages: req.messages.clone(),
        tools: req.tools.clone(),
        max_tokens: req.max_tokens,
        temperature: req.temperature,
    }
}
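For contrast, an OpenAI-style transformer might look like the sketch below. The article does not show this function, so `to_openai_request` and `OpenAiChatRequest` are hypothetical names, and the message and request types are simplified stand-ins. The only behavior it demonstrates is the one described above: the unified system prompt becomes a leading message with the "system" role.

```rust
// Simplified stand-ins; the real ApiMessage and ProviderRequest are richer.
#[derive(Debug, Clone, PartialEq)]
pub struct ApiMessage {
    pub role: String,
    pub content: String,
}

#[derive(Debug, Clone)]
pub struct ProviderRequest {
    pub model: String,
    pub messages: Vec<ApiMessage>,
    pub system: Option<String>,
    pub max_tokens: u32,
    pub temperature: Option<f32>,
}

/// Hypothetical payload type covering a few core fields of a chat
/// completions request.
#[derive(Debug, Clone, PartialEq)]
pub struct OpenAiChatRequest {
    pub model: String,
    pub messages: Vec<ApiMessage>,
    pub max_tokens: u32,
    pub temperature: Option<f32>,
}

/// Pure function: no I/O, no state, just a structural conversion.
pub fn to_openai_request(req: &ProviderRequest) -> OpenAiChatRequest {
    let mut messages = Vec::with_capacity(req.messages.len() + 1);

    // The system prompt has no dedicated field here, so it is emitted
    // as the first message with the "system" role.
    if let Some(system) = &req.system {
        messages.push(ApiMessage {
            role: "system".to_string(),
            content: system.clone(),
        });
    }
    messages.extend(req.messages.iter().cloned());

    OpenAiChatRequest {
        model: req.model.clone(),
        messages,
        max_tokens: req.max_tokens,
        temperature: req.temperature,
    }
}
```

Because the function is pure, it can be unit-tested without network access: feed it a request and compare the resulting payload.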

Streaming protocol unification

Streaming responses present another layer of complexity due to vendor-specific event structures. Anthropic uses a hierarchical event model with content_block_start, content_block_delta, and content_block_stop events. OpenAI employs a flat structure, sending each increment in the choices[0].delta field, while Google Gemini streams partial response objects in a format of its own.

BoxAgnts’ stream_parser module normalizes these differences by converting vendor-specific SSE events into a unified StreamEvent enum:

pub enum StreamEvent {
    TextDelta { text: String },
    ToolUseStart { id: String, name: String },
    ToolUseDelta { id: String, json: String },
    ToolUseEnd { id: String },
    ThinkingDelta { text: String },
    UsageUpdate { input_tokens: u32, output_tokens: u32 },
    MessageStop,
}

Each provider’s parser acts as a finite state machine to interpret vendor events and emit the appropriate standardized events. The StreamAccumulator maintains the state of all content blocks in the current message, assembling them into a complete Message once the stream terminates.
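The state-machine aspect is easiest to see with tool calls. In Anthropic-style streams, the block start event carries the tool call id, but later deltas refer to the block only by index, so the parser has to remember which index maps to which id. The sketch below assumes the SSE payloads have already been decoded into a hypothetical `AnthropicEvent` enum; `AnthropicParser` and its `feed` method are illustrative names, not the project's actual API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta { text: String },
    ToolUseStart { id: String, name: String },
    ToolUseDelta { id: String, json: String },
    ToolUseEnd { id: String },
    MessageStop,
}

/// Hypothetical, already-decoded form of the vendor's events.
#[derive(Debug, Clone)]
pub enum AnthropicEvent {
    BlockStartText { index: usize },
    BlockStartToolUse { index: usize, id: String, name: String },
    TextDelta { index: usize, text: String },
    InputJsonDelta { index: usize, partial_json: String },
    BlockStop { index: usize },
    MessageStop,
}

#[derive(Default)]
pub struct AnthropicParser {
    // State: open tool_use blocks, keyed by block index.
    open_tools: HashMap<usize, String>,
}

impl AnthropicParser {
    /// Consume one vendor event and emit at most one unified event.
    pub fn feed(&mut self, event: AnthropicEvent) -> Option<StreamEvent> {
        match event {
            // Opening a text block produces nothing until text arrives.
            AnthropicEvent::BlockStartText { .. } => None,
            AnthropicEvent::BlockStartToolUse { index, id, name } => {
                self.open_tools.insert(index, id.clone());
                Some(StreamEvent::ToolUseStart { id, name })
            }
            AnthropicEvent::TextDelta { text, .. } => {
                Some(StreamEvent::TextDelta { text })
            }
            AnthropicEvent::InputJsonDelta { index, partial_json } => {
                // Deltas for an unknown block index are dropped.
                let id = self.open_tools.get(&index)?.clone();
                Some(StreamEvent::ToolUseDelta { id, json: partial_json })
            }
            // Closing a text block emits nothing; closing a tool block
            // emits ToolUseEnd and forgets the index.
            AnthropicEvent::BlockStop { index } => self
                .open_tools
                .remove(&index)
                .map(|id| StreamEvent::ToolUseEnd { id }),
            AnthropicEvent::MessageStop => Some(StreamEvent::MessageStop),
        }
    }
}
```

A parser for a flat delta format would keep different state (for example, the tool call currently being streamed) but emit the same StreamEvent values, which is what lets the rest of the pipeline stay vendor-neutral.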

The agent query loop in action

The agent query loop orchestrates the entire process, from message streaming to tool execution. It follows a three-step cycle:

1. Request construction: The loop builds a standardized ProviderRequest using message history, system prompt, and tool definitions, then delegates to the selected provider.
2. Stream processing: The provider’s streaming interface is initiated, and events are parsed into the unified StreamEvent format. The loop handles real-time events like tool usage starts and deltas, forwarding them to the frontend via WebSocket for live updates.
3. Response assembly: Once the stream completes, the accumulator finalizes the message by combining all content blocks and usage metrics, returning the stop reason and completed response.
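The stream processing and response assembly steps can be sketched as follows. This is a simplified, synchronous approximation: the provider's stream is modeled as any iterator of StreamEvent values, a standard-library channel stands in for the WebSocket connection, and `run_turn`, `Message`, and the shape of `ContentBlock` are assumptions. The stop reason is left out because the StreamEvent enum shown earlier does not carry one.

```rust
use std::sync::mpsc::Sender;

#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta { text: String },
    ToolUseStart { id: String, name: String },
    ToolUseDelta { id: String, json: String },
    ToolUseEnd { id: String },
    UsageUpdate { input_tokens: u32, output_tokens: u32 },
    MessageStop,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text(String),
    ToolUse { id: String, name: String, input_json: String },
}

#[derive(Debug, Default, PartialEq)]
pub struct Message {
    pub content: Vec<ContentBlock>,
    pub input_tokens: u32,
    pub output_tokens: u32,
}

#[derive(Default)]
pub struct StreamAccumulator {
    message: Message,
}

impl StreamAccumulator {
    pub fn push(&mut self, event: &StreamEvent) {
        match event {
            StreamEvent::TextDelta { text } => {
                // Append to the current text block, or open a new one.
                if let Some(ContentBlock::Text(buf)) = self.message.content.last_mut() {
                    buf.push_str(text);
                } else {
                    self.message.content.push(ContentBlock::Text(text.clone()));
                }
            }
            StreamEvent::ToolUseStart { id, name } => {
                self.message.content.push(ContentBlock::ToolUse {
                    id: id.clone(),
                    name: name.clone(),
                    input_json: String::new(),
                });
            }
            StreamEvent::ToolUseDelta { id, json } => {
                // Route the JSON fragment to the tool block with this id.
                for block in self.message.content.iter_mut() {
                    if let ContentBlock::ToolUse { id: block_id, input_json, .. } = block {
                        if *block_id == *id {
                            input_json.push_str(json);
                        }
                    }
                }
            }
            StreamEvent::UsageUpdate { input_tokens, output_tokens } => {
                self.message.input_tokens = *input_tokens;
                self.message.output_tokens = *output_tokens;
            }
            StreamEvent::ToolUseEnd { .. } | StreamEvent::MessageStop => {}
        }
    }

    pub fn finish(self) -> Message {
        self.message
    }
}

/// Hypothetical single turn: accumulate every event, forward it to the
/// frontend channel, and stop at MessageStop.
pub fn run_turn<I>(events: I, frontend: &Sender<StreamEvent>) -> Message
where
    I: IntoIterator<Item = StreamEvent>,
{
    let mut acc = StreamAccumulator::default();
    for event in events {
        acc.push(&event);
        let is_stop = event == StreamEvent::MessageStop;
        // A disconnected frontend should not abort the turn.
        let _ = frontend.send(event);
        if is_stop {
            break;
        }
    }
    acc.finish()
}
```

Keeping accumulation separate from forwarding means the frontend sees deltas as they arrive, while the loop still ends the turn with one complete message it can append to the history or inspect for tool calls.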

This architecture ensures that developers can focus on building agentic systems without worrying about the underlying LLM provider’s idiosyncrasies. As AI model APIs continue to evolve, BoxAgnts’ abstraction layers provide a stable foundation for cross-vendor compatibility.

Looking ahead, this system paves the way for more sophisticated multi-model orchestration, where agents can dynamically switch providers based on cost, latency, or performance requirements without code changes.

AI summary

How does BoxAgnts achieve compatibility across multiple AI providers? A detailed look at the provider abstraction layer and the agent query loop.
