Developers are increasingly turning to HTML instead of markdown for documentation and reviews, but the shift comes with unexpected costs. While HTML enables interactive widgets, embedded diagrams, and semantic structure, it also generates up to four times more tokens than markdown—directly impacting API bills. The debate isn’t just about formatting; it’s about balancing richer outputs with financial efficiency in AI-driven workflows.
The rise of HTML in AI-generated content
On May 8, Thariq Shihipar, a member of the Claude Code team at Anthropic, sparked a conversation with a simple yet provocative observation: HTML is now the new markdown. In a post on X, he argued that HTML offers capabilities markdown can’t match, including inline SVGs, interactive toggles, color-coded severity indicators, and real-time navigation. These features transform static documentation into dynamic artifacts, making reviews and architectural discussions more intuitive.
Shihipar’s example—prompting an AI to generate an HTML-based PR review with color-coded findings—highlights how HTML can create a dashboard-like experience rather than a plain-text document. The output resembles a lightweight internal tool, where markdown would fall short. Simon Willison, a well-known developer and writer, echoed the sentiment, emphasizing HTML’s potential to bridge the gap between raw text and interactive interfaces.
The hidden costs of HTML output
The debate took a sharp turn when critics like Kurtis Redux challenged the practicality of HTML adoption. The core issue? Cost. Generating HTML requires 2–4 times more output tokens than markdown for the same content, and every one of those extra tokens is billed at the output rate, which is 3–5 times the input rate. The financial impact adds up quickly. Longer outputs also risk diluting model focus, leading to verbose or inconsistent results.
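The arithmetic behind that critique can be sketched in a few lines. The 2–4x token factor and the 3–5x price ratio are the figures quoted above; the 1,000-token markdown review and the function name `html_overhead` are assumptions made purely for illustration.

```python
def html_overhead(markdown_tokens, html_factor, output_to_input_price_ratio):
    """Return (extra output tokens, their cost in input-token equivalents)."""
    # Extra tokens HTML needs on top of the markdown version of the same content.
    extra_tokens = markdown_tokens * (html_factor - 1)
    # Each extra token is billed at the output rate, expressed here as a
    # multiple of the input rate so that no real price list is needed.
    return extra_tokens, extra_tokens * output_to_input_price_ratio


# Hypothetical 1,000-token markdown review, swept over the quoted ranges.
for factor in (2, 4):
    for ratio in (3, 5):
        extra, equiv = html_overhead(1_000, factor, ratio)
        print(f"HTML x{factor}, output price x{ratio}: "
              f"{extra} extra output tokens ~ {equiv} input tokens")
```

Note that the HTML-to-markdown cost ratio is still just the token factor (2–4x), because both versions are billed as output. The price ratio only matters when comparing that overhead against input-side spending.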
Beyond token economics, the critique extended to the incentives of the companies advocating for HTML. If an AI provider benefits from increased token usage, are its recommendations truly objective? Shihipar’s transparency about efficiency trade-offs softens the argument, but the structural concern remains: tooling recommendations may align with corporate incentives.
Beyond HTML: where the real token waste happens
Focusing solely on HTML vs. markdown overlooks the bigger picture. In most AI-assisted workflows, the bulk of token consumption occurs long before the model produces any output. Three major contributors dominate the cost:
- System prompts and tool definitions: Tools like Claude Code include 50+ functions, each with verbose JSON schemas. These are resent with every interaction, consuming tens of thousands of input tokens before the user even begins.
- Conversation history: Every prior assistant message, tool call, and result is replayed in each turn. A single session can accumulate hundreds of thousands of tokens, especially when debugging complex issues.
- Tool results: Running commands like `cargo build`, `pytest -v`, or `git diff` can dump tens of thousands of tokens of logs, stack traces, or diffs, often irrelevant to the task at hand.
Output tokens cost more apiece than input tokens, but the HTML a model finally emits is typically a small fraction of a session's total cost. The real opportunity lies in optimizing the surrounding processes rather than debating formatting choices.
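A rough model makes the proportion concrete. Every number below is an assumption chosen for illustration, not a measurement: a 20,000-token system prompt and tool schema, history growing by 2,000 tokens per turn, 30 turns, and a 4,000-token HTML report billed at 5x the input rate. The sketch also ignores intermediate assistant output and any prompt-caching discounts.

```python
def final_output_share(base_context, growth_per_turn, turns,
                       final_output_tokens, output_price_ratio):
    """Share of session cost attributable to the final output.

    Costs are measured in input-token equivalents, so no real prices appear.
    """
    # The full context is resent on every turn and grows as history accumulates.
    input_tokens = sum(base_context + growth_per_turn * turn
                       for turn in range(turns))
    # The final report is billed at the (higher) output rate.
    output_equiv = final_output_tokens * output_price_ratio
    return output_equiv / (input_tokens + output_equiv)


# Illustrative, assumed figures only.
share = final_output_share(20_000, 2_000, 30, 4_000, 5)
print(f"Final HTML output share of session cost: {share:.1%}")
```

Under these assumptions the replayed context dwarfs the report itself. Different session shapes will shift the ratio, so the function is more useful than any single figure it prints.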
Practical strategies to reduce token waste
For teams looking to adopt HTML without incurring astronomical costs, the solution isn’t to avoid HTML—it’s to optimize the workflow around it. Proxy tools like Lynkr demonstrate how to intercept and streamline AI agent interactions before they generate unnecessary tokens.
One key strategy is preflight short-circuiting, where a lightweight shell command verifies whether work is already completed before triggering an AI interaction. For example, if a CI system detects that tests are passing, an agent no longer needs to regenerate HTML documentation or reviews. This approach eliminates entire AI loops, saving hundreds of thousands of tokens in idle retries.
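A minimal sketch of a preflight check, assuming the "already done" condition can be expressed as a command that exits with status 0. This is not Lynkr's implementation; `work_already_done`, `run_with_preflight`, and the `call_agent` callable are hypothetical names standing in for whatever triggers the model.

```python
import subprocess
import sys


def work_already_done(check_cmd, timeout_s=120.0):
    """Return True when the check command exits with status 0."""
    try:
        result = subprocess.run(check_cmd, capture_output=True, timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        # The check could not be completed, so fall through to the agent.
        return False
    return result.returncode == 0


def run_with_preflight(check_cmd, call_agent):
    """Skip the agent (and all of its tokens) when the check already passes."""
    if work_already_done(check_cmd):
        return None
    return call_agent()


# Stand-in checks: a real setup might run the test suite instead.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(run_with_preflight(passing, lambda: "agent invoked"))
print(run_with_preflight(failing, lambda: "agent invoked"))
```

Treating a failed or timed-out check as "not done" is a deliberate choice here: a broken preflight should cost an extra agent call rather than silently skip needed work.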
Another tactic is smart tooling, where only essential context is sent to the model. By stripping redundant schemas, filtering verbose logs, or prioritizing relevant data, teams can reduce input token counts without sacrificing functionality. For instance, caching tool results or summarizing long outputs can cut down on repeated transmissions.
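One way to sketch the log-filtering and caching ideas is shown below. The head/tail line counts, the placeholder wording, and the names `trim_tool_output` and `ToolResultCache` are all assumptions. A production proxy would more likely count tokens than lines.

```python
import hashlib


def trim_tool_output(text, head=20, tail=40):
    """Keep the first `head` and last `tail` lines; replace the middle."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text
    omitted = len(lines) - head - tail
    marker = f"[... {omitted} lines omitted ...]"
    # Slice from an explicit index so that tail=0 yields an empty list.
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])


class ToolResultCache:
    """Replace a repeated, unchanged tool result with a one-line note."""

    def __init__(self):
        self._seen = {}  # command -> SHA-256 digest of its last output

    def compact(self, command, output):
        digest = hashlib.sha256(output.encode("utf-8")).hexdigest()
        if self._seen.get(command) == digest:
            return f"[output of `{command}` unchanged since last run]"
        self._seen[command] = digest
        return trim_tool_output(output)


log = "\n".join(f"line {i}" for i in range(100))
cache = ToolResultCache()
first = cache.compact("pytest -v", log)
second = cache.compact("pytest -v", log)
print(len(first.splitlines()), "lines sent the first time")
print(second)
```

Keeping the tail longer than the head reflects the assumption that build and test tools tend to print their failure summary last.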
The future of AI-assisted documentation
HTML’s rise reflects a broader shift toward richer, interactive documentation in developer workflows. Tools that blend semantic markup with dynamic elements are redefining how teams collaborate, review code, and document systems. However, the transition must balance innovation with efficiency.
The next phase of development will likely focus on context-aware agents that dynamically adjust their outputs based on cost, relevance, and user intent. Whether through proxy optimizations, smarter prompting, or hybrid formats, the goal is clear: deliver the benefits of HTML without the prohibitive price tag. For teams willing to experiment, the tools to achieve this are already emerging.