When an AI agent sends an email and receives a reply hours or days later, determining which conversation that reply belongs to can be tricky. Most email clients handle threading automatically, but when you build an agent that interacts with email, you must manage this context yourself. The key lies in three email headers that link the messages of a conversation into a chain.
The invisible backbone of email conversations
Every email contains a globally unique identifier called a Message-ID, assigned by the sending client or server. When someone replies, their email client populates two special headers to maintain the conversation thread:
- In-Reply-To points to the exact message being responded to.
- References accumulates the entire chain of messages, from oldest to newest.
Consider this example flow:
Message-ID: <abc123@agents.yourcompany.com>
Subject: Following up on your demo request

Message-ID: <def456@gmail.com>
In-Reply-To: <abc123@agents.yourcompany.com>
References: <abc123@agents.yourcompany.com>

Message-ID: <ghi789@agents.yourcompany.com>
In-Reply-To: <def456@gmail.com>
References: <abc123@agents.yourcompany.com> <def456@gmail.com>

Major email clients like Gmail, Outlook, Apple Mail, and Thunderbird rely primarily on these headers to group messages. Subject lines act as a fallback, not the primary mechanism.
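To make the header chain concrete, here is a minimal sketch of how a reply's threading headers can be derived from the message being answered. The function name buildReplyHeaders and the shape of the parent object (messageId, inReplyTo, references) are illustrative assumptions, not part of any library. The rule it follows is the usual one: In-Reply-To is the parent's Message-ID, and References is the parent's References with the parent's Message-ID appended.

```javascript
// Hypothetical helper: derive threading headers for a reply.
// `parent` is an assumed shape: { messageId, inReplyTo?, references? }.
function buildReplyHeaders(parent) {
  // Header values are whitespace-separated lists of <id@host> tokens.
  const splitIds = (value) => (value || "").split(/\s+/).filter(Boolean);

  let chain = splitIds(parent.references);
  if (chain.length === 0) {
    // No References on the parent: fall back to its In-Reply-To,
    // but only when that header holds exactly one message ID.
    const inReplyTo = splitIds(parent.inReplyTo);
    if (inReplyTo.length === 1) chain = inReplyTo;
  }

  // Append the parent's own ID unless it already closes the chain.
  if (chain[chain.length - 1] !== parent.messageId) {
    chain.push(parent.messageId);
  }

  return {
    "In-Reply-To": parent.messageId,
    References: chain.join(" "),
  };
}
```

Applied to the second message in the flow above, this yields the In-Reply-To and References values shown for the third message.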
Why subject matching fails in real-world use
Many developers initially attempt to match replies by searching for Re: prefixes in subject lines. While this approach might work in controlled demos, it breaks down in production for three common reasons:
- Subjects get edited. A reply might return as "Re: Q3 budget review — updated numbers attached" instead of the original "Q3 budget review."
- Subjects collide. Two separate conversations might start with the same subject line, causing replies to be misrouted.
- Forwards create confusion. A forwarded email may generate replies with the same subject but no relation to the original thread.
Headers reference immutable Message-ID values, not human-editable text, making them immune to these issues. Header-based matching should always be prioritized, with subject matching reserved for cases where headers are missing—typically in legacy or improperly configured mail clients.
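The priority order described here can be sketched as a small lookup routine. Everything in it is an illustrative assumption: the function names, the inbound message shape (inReplyTo, references, subject), and the index object holding two Maps that your application would maintain. It checks headers first and only falls back to a normalized subject when the inbound message carries no threading headers at all.

```javascript
// Strip any run of leading "Re:", "Fw:", or "Fwd:" prefixes and normalize case.
function normalizeSubject(subject) {
  return (subject || "")
    .replace(/^(\s*(re|fwd?)\s*:\s*)+/i, "")
    .trim()
    .toLowerCase();
}

// Hypothetical lookup. `index` is assumed to be
// { byMessageId: Map<string, string>, bySubject: Map<string, string> }.
function findConversation(inbound, index) {
  const candidates = [];
  if (inbound.inReplyTo) candidates.push(inbound.inReplyTo);

  // Walk References from newest to oldest so the closest ancestor wins.
  const refs = (inbound.references || "").split(/\s+/).filter(Boolean).reverse();
  candidates.push(...refs);

  for (const id of candidates) {
    const conversationId = index.byMessageId.get(id);
    if (conversationId) return { conversationId, matchedBy: "header" };
  }

  // Subject matching is reserved for messages with no threading headers.
  if (candidates.length === 0) {
    const conversationId = index.bySubject.get(normalizeSubject(inbound.subject));
    if (conversationId) return { conversationId, matchedBy: "subject" };
  }

  return null;
}
```

Returning matchedBy lets the caller treat subject-only matches with more suspicion, for example by routing them to triage instead of resuming a task automatically.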
Platforms that handle threading automatically
Building email agents from scratch requires deep understanding of threading mechanics. However, platforms like Nylas offer hosted solutions that abstract away this complexity. Through their Agent Accounts (currently in beta), developers can send emails via three methods while preserving conversation threads:
- API sends: Use the reply_to_message_id parameter in the POST /v3/grants/{grant_id}/messages/send endpoint. The system automatically fetches the original Message-ID and populates the In-Reply-To and References headers.
- SMTP submission: Headers set by email clients remain intact when messages are sent via ports 465 or 587.
- Inbound handling: Full headers are stored upon arrival. Developers can retrieve them with either fields=include_headers for complete data or fields=include_basic_headers to fetch just the three critical threading headers. The smaller option significantly reduces payload sizes, as full headers often exceed the message body in size.
Even when traffic mixes API sends with human responses via IMAP, the platform’s Threads API maintains coherent conversations by following the header chain rather than the sending method.
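As a rough illustration of the API send path described above, the sketch below assembles a reply request for the send endpoint without actually sending it. The endpoint path and the reply_to_message_id parameter come from the text. The function name buildReplyRequest, the baseUrl parameter, and the to, subject, and body field names are assumptions to verify against the Nylas API reference before use.

```javascript
// Hypothetical helper: build (but do not send) a reply request.
// `baseUrl` is a placeholder; use the API host for your own Nylas region.
function buildReplyRequest({ baseUrl, grantId, apiKey, replyToMessageId, to, subject, body }) {
  return {
    url: `${baseUrl}/v3/grants/${encodeURIComponent(grantId)}/messages/send`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        to, // assumed shape: [{ email: "person@example.com" }]
        subject,
        body,
        // Per the text, the platform uses this to fill In-Reply-To and References.
        reply_to_message_id: replyToMessageId,
      }),
    },
  };
}

// Usage sketch (not executed here):
//   const { url, options } = buildReplyRequest({ ... });
//   const response = await fetch(url, options);
```

Keeping request construction separate from the network call makes the threading parameter easy to unit test.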
Using thread_id as your conversation key
Instead of manually parsing headers, use the Threads API to access conversation context directly. When a message is created, the message.created webhook delivers a thread_id that identifies the entire conversation. A single API call retrieves the full thread:
curl --request GET \
--url " \
--header "Authorization: Bearer $NYLAS_API_KEY"

Each thread object includes essential metadata such as:
- message_ids in chronological order
- participants involved in the conversation
- latest_message_received_date for tracking activity
- Routing flags like unread and folders
The Nylas documentation advises treating thread_id as the primary key for conversation context. It’s more reliable than raw headers because it’s platform-assigned and spans the entire conversation, not just individual messages.
To reconstruct the conversation’s content for an LLM, fetch the ordered list of messages:
// After receiving a message.created webhook
const thread = await nylas.threads.find({
  identifier: AGENT_GRANT_ID,
  threadId: message.thread_id,
});

// thread.data.messageIds contains the full chain
const messages = await Promise.all(
  thread.data.messageIds.map((id) =>
    nylas.messages.find({
      identifier: AGENT_GRANT_ID,
      messageId: id,
    })
  )
);

This ordered list provides the exact conversation history needed to feed into a language model.
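One possible way to turn the fetched messages into model input is sketched below. The function name toChatTranscript, the agentEmail parameter, and the assumed message fields (from as an array of { email }, a numeric date, and body) are illustrative; check them against the message objects your SDK version actually returns. With the fetch code above, the input would be something like messages.map((response) => response.data).

```javascript
// Hypothetical helper: map an email thread onto chat-style turns.
// Assumed message shape: { from: [{ email }], date: number, body: string }.
function toChatTranscript(messages, agentEmail) {
  return [...messages]
    // Sort defensively by timestamp in case the input order was disturbed.
    .sort((a, b) => a.date - b.date)
    .map((message) => {
      const sender = (message.from && message.from[0] && message.from[0].email) || "";
      return {
        // Messages the agent sent become assistant turns; everything else is user input.
        role: sender.toLowerCase() === agentEmail.toLowerCase() ? "assistant" : "user",
        content: message.body,
      };
    });
}
```

Email bodies are often HTML and include quoted history, so a real pipeline would likely strip markup and quoted text before passing content to the model.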
Mapping threads to your application’s workflow
Email threading ensures messages are grouped correctly, but it doesn’t determine which task the conversation belongs to. That mapping must be handled in your application logic:
- On outbound sends: Store the message_id and thread_id with your internal state, linking to session IDs, CRM deals, support tickets, or workflow steps.
- On inbound replies: When a webhook arrives, look up the thread_id. A match indicates a reply to an agent-initiated message, so restore context and continue the task. A miss indicates a new conversation, so classify and route accordingly.
Implement this mapping in code:
// After sending an outbound message
threadState.set(sentMessage.threadId, {
  sessionId: currentSession.id,
  taskId: currentTask.id,
  step: "awaiting_reply",
  sentAt: Date.now(),
});

// On receiving an inbound webhook
const context = threadState.get(inboundMessage.threadId);
if (context) {
  await resumeTask(context.taskId, inboundMessage);
} else {
  await triageNewMessage(inboundMessage);
}

Store this mapping in a database rather than memory, as email conversations often span hours or days. An in-memory solution won’t survive service restarts, potentially losing critical context.
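To illustrate the durability requirement, here is a minimal sketch of a store with the same get and set shape as the in-memory map, backed by a JSON file. The file is only a stand-in for a real database, and the class name FileThreadState is hypothetical. The point is that a fresh instance created after a restart still finds the stored context.

```javascript
const fs = require("node:fs");

// Hypothetical durable replacement for an in-memory Map.
// A JSON file stands in for a real database table keyed by thread ID.
class FileThreadState {
  constructor(filePath) {
    this.filePath = filePath;
  }

  // Read the whole mapping from disk; a missing file means no state yet.
  load() {
    if (!fs.existsSync(this.filePath)) return {};
    return JSON.parse(fs.readFileSync(this.filePath, "utf8"));
  }

  get(threadId) {
    const all = this.load();
    return Object.prototype.hasOwnProperty.call(all, threadId) ? all[threadId] : undefined;
  }

  set(threadId, context) {
    const all = this.load();
    all[threadId] = context;
    // Write to a temporary file, then rename, so a crash mid-write
    // does not leave a half-written state file behind.
    const tempPath = `${this.filePath}.tmp`;
    fs.writeFileSync(tempPath, JSON.stringify(all));
    fs.renameSync(tempPath, this.filePath);
  }
}
```

In production the same interface would sit in front of a database with an index on the thread ID, and concurrent writers would need transactions or row-level updates, which this file-based sketch does not provide.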
Conclusion
Email threading might seem like a technical detail buried in obscure headers, but it’s the foundation that enables AI agents to maintain coherent conversations across days or even weeks. By understanding how Message-ID, In-Reply-To, and References work—or by leveraging platforms that handle this automatically—developers can build agents that respond intelligently without getting lost in message chaos.