AI agents built with the Model Context Protocol (MCP) often hit invisible walls when calling external APIs. A tool that works perfectly in testing can freeze an entire workflow when the backend service slows down or fails. The result isn’t just a delay—it’s a frozen interface, a time-out error, and a user staring at a spinning icon with no feedback.
The solution isn’t faster code or longer timeouts. It’s changing the interaction model entirely. Instead of waiting for a slow API to respond, agents can return immediately with a handle ID and poll for results later. This pattern keeps the user experience smooth and prevents the dreaded 424 (Failed Dependency) error.
Why MCP Tools Freeze When APIs Slow Down
MCP tools are designed for fast, predictable responses. The protocol expects tools to return control to the agent within about 7 to 10 seconds. When an MCP tool calls a slow external API—like a data pipeline, a batch job, or a third-party service—it blocks the agent until the call completes. If the call takes longer than the timeout, the connection drops with a 424 error. The agent never gets the data, and the user gets no explanation.
Real-world reports from the AI community confirm this pattern:
- Developers report 424 errors when MCP tools depend on remote servers that time out.
- Agents sometimes hang indefinitely, neither failing nor completing.
- Tools pass initial validation but stall during execution, leaving users in limbo.
These aren’t edge cases. As AI agents integrate with more external systems, timeouts become the norm, not the exception.
The Three Failure Modes of Slow MCP Calls
MCP tools fail in predictable ways when external APIs slow down:
- Slow API response: The tool waits 15 seconds or more, eventually returning data—but only after a frustrating delay that breaks the user experience.
- Failing API: The external service is unreachable, triggering a 424 error after the timeout threshold is exceeded.
- Unresponsive state: The request is accepted but never returns, forcing users to restart their sessions.
Each of these scenarios leaves the agent stuck, the user confused, and the workflow broken.
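The blocking failure is easy to reproduce in a few lines. The sketch below is not MCP-specific: `slow_external_api` is a hypothetical stand-in for a slow backend, `asyncio.wait_for` plays the role of the client-side deadline, and the timeout is shortened to half a second so the example runs quickly.

```python
import asyncio

async def slow_external_api() -> str:
    await asyncio.sleep(15)  # stands in for a slow third-party backend
    return "data"

async def call_tool() -> str:
    try:
        # MCP clients enforce a deadline on tool calls; 0.5s here for brevity.
        return await asyncio.wait_for(slow_external_api(), timeout=0.5)
    except asyncio.TimeoutError:
        # This is the failure the agent sees as a 424 Failed Dependency.
        return "424 Failed Dependency: upstream tool timed out"

result = asyncio.run(call_tool())
print(result)
```

The tool itself never misbehaves; the deadline simply expires before it can answer, and the agent is left with an error instead of data.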
How the Async HandleId Pattern Works
The async handleId pattern flips the script. Instead of waiting for a slow operation to finish, the MCP tool returns immediately with a job tracking ID. The agent can display a confirmation to the user, then poll for results later. This keeps the UI responsive and avoids timeouts entirely.
Here’s how it works in practice:
- The agent calls a tool to start a long-running job.
- The tool returns a unique job ID instead of waiting for the result.
- The agent shows the user: “Job started. Check back soon.”
- The agent periodically polls another tool to check the job’s status.
- When the job completes, the agent receives the result and updates the user.
This approach is framework-agnostic. It works with any agent system that supports MCP, including Strands Agents and others.
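The steps above can be sketched as an agent-side loop. In this minimal, framework-agnostic sketch, `start_async_job` and `poll_until_done` are hypothetical stand-ins for real MCP tool calls, and the delays are shortened so the example runs in under a second.

```python
import asyncio

JOBS = {}  # stands in for the server-side job store

async def start_async_job(query: str) -> str:
    # Step 1-2: start the job and return a handle immediately.
    JOBS["job-1"] = {"status": "processing", "result": None}
    asyncio.create_task(_work("job-1", query))
    return "job-1"

async def _work(job_id: str, query: str):
    await asyncio.sleep(0.2)  # stands in for a 15-second API call
    JOBS[job_id] = {"status": "completed", "result": f"Processed: {query}"}

async def poll_until_done(job_id: str, interval: float = 0.1,
                          max_polls: int = 50) -> str:
    # Step 4: check status periodically instead of blocking.
    for _ in range(max_polls):
        job = JOBS.get(job_id)
        if job and job["status"] == "completed":
            return job["result"]
        await asyncio.sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish in time")

async def main():
    job_id = await start_async_job("quarterly report")
    # Step 3: immediate feedback to the user.
    print(f"Job started: {job_id}. Check back soon.")
    # Step 5: deliver the result once it is ready.
    print(await poll_until_done(job_id))

asyncio.run(main())
```

The polling interval and retry cap are the two knobs worth tuning: a short interval gives snappier updates, while a cap prevents the agent from polling a dead job forever.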
A Working Example: Simulating Real Timeout Scenarios
To demonstrate the problem and solution, we built a lightweight MCP server using FastMCP, a framework for creating MCP servers in Python. The server simulates three real-world scenarios:
```python
from mcp.server.fastmcp import FastMCP
import asyncio

mcp = FastMCP("Timeout Demo Server")

@mcp.tool(description="Fast API - responds in 1 second")
async def fast_api(query: str) -> str:
    await asyncio.sleep(1)
    return f"Fast result for: {query}"

@mcp.tool(description="Slow API - responds in 15 seconds")
async def slow_api(query: str) -> str:
    await asyncio.sleep(15)
    return f"Slow result for: {query}"

@mcp.tool(description="Failing API - returns 424 after delay")
async def failing_api(query: str) -> str:
    await asyncio.sleep(7)
    raise Exception("Failed Dependency: External service unavailable")
```

The baseline case (fast_api) completes quickly and stays within MCP's expected response window. But the slow_api and failing_api tools trigger the very problems developers report: slow responses, timeouts, and failed dependencies.
Implementing the Async HandleId Pattern
Here’s a complete implementation of the handleId pattern, including a job store and polling mechanism:
```python
import uuid

# In-memory job store (replace with Redis or DynamoDB in production)
JOBS = {}

@mcp.tool(description="Start a long-running job, returns immediately with job ID")
async def start_async_job(query: str) -> str:
    job_id = str(uuid.uuid4())[:8]
    JOBS[job_id] = {
        "status": "processing",
        "query": query,
        "result": None
    }
    # Fire-and-forget: run the slow work in the background
    asyncio.create_task(do_work(job_id, query))
    return f"Job started: {job_id}. Use check_job_status to poll for results."

@mcp.tool(description="Check status of a running job")
async def check_job_status(job_id: str) -> str:
    job = JOBS.get(job_id)
    if not job:
        return f"Job {job_id} not found"
    if job["status"] == "completed":
        return f"COMPLETED: {job['result']}"
    if job["status"] == "failed":
        return f"FAILED: {job.get('error', 'unknown error')}"
    return f"PROCESSING: Job {job_id} still running"

# Background worker
async def do_work(job_id: str, query: str):
    try:
        # Simulate slow work (e.g., calling an external API)
        await asyncio.sleep(15)
        JOBS[job_id]["result"] = f"Processed: {query}"
        JOBS[job_id]["status"] = "completed"
    except Exception as e:
        JOBS[job_id]["status"] = "failed"
        JOBS[job_id]["error"] = str(e)
```

The agent receives an immediate response with a job ID, then polls check_job_status to track progress. This eliminates blocking, prevents timeouts, and keeps the user informed.
Real-World Impact: From 17 Seconds to 3 Seconds
We tested the async pattern against the baseline and problematic scenarios:
| Scenario | Total Response Time | User Experience | Research Finding |
|---|---|---|---|
| Fast API (1s delay) | 3.2s | Good | Baseline |
| Slow API (15s delay) | 17.8s | Poor: agent waits | Octopus: "agent waits indefinitely" |
| Failing API (424) | 7.7s | Poor: error after wait | OpenAI Community: 424 errors |
| Async pattern (handleId) | 3.7s | Good: immediate response | Solution: "respond ASAP with handleId" |
The async pattern turns a 17.8-second wait into a 3.7-second response. The agent can confirm the job started instantly and update the user without freezing the interface.
Moving Forward: Build Reliable AI Agents Today
Timeouts aren’t a bug in MCP—they’re a feature of the protocol’s design. But they don’t have to break your agent’s workflow. By adopting the async handleId pattern, you can keep agents responsive even when external APIs slow down.
The next step? Replace the in-memory job store with a persistent solution like Redis or DynamoDB, and integrate polling into your agent’s logic. With these changes, your AI agents will handle slow APIs gracefully and keep users in the loop every step of the way.
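As a sketch of that next step, the job store can sit behind a small persistence interface. Here sqlite3 stands in for Redis or DynamoDB so the example stays self-contained; a production version would swap in the real store's client behind the same three operations.

```python
import sqlite3
import uuid

class JobStore:
    """Persistent job store; sqlite3 stands in for Redis/DynamoDB."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id TEXT PRIMARY KEY, status TEXT, result TEXT)"
        )

    def create(self) -> str:
        # Insert a new job in the 'processing' state and return its handle.
        job_id = str(uuid.uuid4())[:8]
        self.db.execute(
            "INSERT INTO jobs VALUES (?, 'processing', NULL)", (job_id,)
        )
        self.db.commit()
        return job_id

    def complete(self, job_id: str, result: str) -> None:
        self.db.execute(
            "UPDATE jobs SET status = 'completed', result = ? WHERE id = ?",
            (result, job_id),
        )
        self.db.commit()

    def get(self, job_id: str):
        row = self.db.execute(
            "SELECT status, result FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        return {"status": row[0], "result": row[1]} if row else None

store = JobStore()
jid = store.create()
store.complete(jid, "Processed: demo query")
```

Because jobs survive the process that created them, the agent can poll from a different worker, after a restart, or hours later, which is exactly what in-memory dictionaries cannot offer.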
AI summary
The most effective way to keep MCP tools from freezing on slow APIs: the async handleId pattern, which returns an immediate response and keeps the workflow user-friendly.