HTML vs Markdown: Rethinking AI Output as Interface
Your AI agent writes 200-line plans. Nobody reads past 20. The problem isn't the content — it's the format. We explore why AI output needs to be an interface, not a document.

The Plan Nobody Read
Here's a scene playing out in teams everywhere right now. An AI agent produces a 200-line implementation plan — logically structured, technically sound, properly formatted in Markdown. It lands in the team's Slack channel. Three days later at the weekly standup, the PM says: "I skimmed it."
This isn't a failure of AI capability. The plan was excellent. The problem is something more fundamental: AI keeps getting better at writing. Humans aren't getting better at reading.
In May 2026, Thariq Shihipar — engineering lead for Claude Code at Anthropic — published a post titled "The Unreasonable Effectiveness of HTML." Within 16 hours it had 4.4 million views and 15,700 bookmarks. The thesis was simple and counterintuitive: Markdown, the format the entire AI ecosystem had defaulted to, was actively making agent output harder to consume. He had stopped writing Markdown for almost everything and switched to HTML.
The post didn't just spark a format debate. It exposed a deeper question that every AI workspace needs to answer: when agents produce complex work, how should that work be delivered to the humans who need to act on it?
How Markdown Became the Default (And Why Nobody Questioned It)
To understand the current moment, rewind to 2023. GPT-4 shipped with a context window of 8,192 tokens. The same content that cost roughly 8,000 tokens in HTML needed only about 2,800 in Markdown, a reduction of about 65%. When your budget was 8K and every output token came out of the same budget as the input, every token saved was a paragraph preserved. Markdown won on pure economics.
Then came the configuration files. CLAUDE.md. AGENTS.md. SKILL.md. The entire scaffolding of the agentic ecosystem was built in Markdown. When agents saw Markdown everywhere in their context, they naturally produced Markdown as output. Nobody made a deliberate decision to standardize on Markdown for agent deliverables — it just happened, inherited from an era of scarcity.
By 2026, context windows had expanded to one million tokens. The constraint that made Markdown rational had evaporated. But the behavior persisted. Simon Willison, one of the most respected voices in the AI developer community, admitted that he had been defaulting to Markdown since the GPT-4 days for exactly that reason, and that Thariq's post caused him to reconsider.
The Cognitive Cost Nobody Measured
In March 2026, a BCG Henderson Institute study of 1,488 workers, published in Harvard Business Review, put hard numbers on a phenomenon workers had started calling "AI brain fry." The findings were striking. Compared with workers with low AI oversight loads, those with high oversight loads reported:
- 19% greater information overload
- 33% more decision fatigue
- 39% more major errors at work
- 39% higher intention to quit
The critical insight: brain fry is not caused by using AI. It's caused by monitoring AI output — the cognitive effort of reviewing, evaluating, and correcting what agents produce. And here's where format enters the picture: Markdown does nothing to reduce this monitoring burden. A 200-line Markdown file is a wall of undifferentiated text. No visual hierarchy beyond headers and bold. No navigation. No ability to collapse what you don't need. No way to interact with the content.
The neuroscience supports this. Roughly 30% of the human cerebral cortex is dedicated to visual processing. Hearing gets 3%. Touch gets 8%. Vision is what Andrej Karpathy called "the 10-lane superhighway of information into the brain." Markdown barely uses it. Bold text, headers, and bullet points are the entirety of its visual toolkit.
The 19% increase in information overload reported in the BCG study won't be solved by writing better Markdown. It will be solved by presenting information in formats the brain can process efficiently.
The Core Shift: Output Is Not a Document — It's an Interface
This brings us to the central argument: the output format of an AI agent is not a typographic preference. It's an interface design decision.
Consider the distinction:
Markdown output = Reading Endpoint. Content flows linearly. The human scrolls, reads passively, and either absorbs or abandons. Consumption ends when the document does (or, more likely, somewhere around line 20).
HTML output = Interaction Starting Point. Content is structured with tabs, collapsible sections, sortable tables, color-coded severity markers, and inline navigation. The human clicks, filters, annotates, and acts. The output is not the end of the agent's work — it's the beginning of the human's work.
The paradigm shift becomes clear when you look at what AI agents are actually producing in 2026. They don't generate short answers anymore. They produce implementation plans, code review reports, competitive analyses, design explorations, data summaries. These are complex deliverables that require human review, judgment, and action.
When a deliverable is that complex, format is no longer about aesthetics. It's about whether the human can effectively exercise oversight. As Thariq put it: "I feel more in the loop than ever when using HTML." Richer output formats don't just look better — they restore the human's sense of agency over AI work.
This is not a trivial point. The Epsilla engineering blog framed it precisely: "Markdown encourages passivity, leading to default trust and a gradual erosion of control. HTML makes the AI's reasoning transparent and interactive, empowering rigorous review." In an era where AI agents execute increasingly complex workflows, the human's ability to supervise effectively depends on the interface through which they receive the agent's work.
What HTML Actually Gives You: Five Scenarios
Thariq published a companion site with 20 self-contained HTML files, each illustrating a real use case. Here are five scenarios where the difference is most pronounced:
Implementation Plans. In Markdown: a 200-line linear scroll. In HTML: tabbed navigation across workstreams, collapsible phase details, an embedded timeline visualization, and a risk matrix with color-coded severity. The same information, but one version gets read and the other gets skimmed.
Code Reviews. In Markdown: plain text diff with inline comments. In HTML: the actual diff rendered with syntax highlighting, margin annotations color-coded by severity (red/amber/green), jump links to each finding, and a summary panel showing the overall assessment at a glance.
Option Comparisons. In Markdown: sequential paragraphs describing each option. In HTML: side-by-side columns with color-coded differences, a verdict box at the bottom, and a scoring matrix the reviewer can interact with.
Design Explorations. In Markdown: textual descriptions of four design directions. In HTML: four complete visual mockups with full-screen previews, each rendered as a working interface you can click through.
Data Reports. In Markdown: ASCII tables that break on mobile. In HTML: sortable, filterable tables with inline SVG charts, responsive layout that adapts to screen size, and hover tooltips for contextual detail.
In each case, HTML doesn't win because it's prettier. It wins because it delivers higher information density in a format the human brain can actually process — and because it transforms output from something you read into something you work with.
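As a rough illustration of the implementation-plan scenario, the sketch below renders plan phases as collapsible sections using the standard HTML `<details>` and `<summary>` elements. The `Phase` structure, the `render_plan` function, and the `risk-*` class names are hypothetical, invented for this example; they are not taken from Thariq's companion files or any tool mentioned here.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Phase:
    """One phase of a plan (hypothetical structure for this sketch)."""
    title: str
    risk: str  # assumed to be "low", "medium", or "high"
    steps: list[str]


def render_plan(phases: list[Phase]) -> str:
    """Render each phase as a collapsible <details> section."""
    parts = []
    for phase in phases:
        items = "".join(f"<li>{escape(step)}</li>" for step in phase.steps)
        # High-risk phases start expanded so the reviewer sees them first.
        open_attr = " open" if phase.risk == "high" else ""
        parts.append(
            f'<details class="risk-{escape(phase.risk)}"{open_attr}>'
            f"<summary>{escape(phase.title)} ({escape(phase.risk)} risk)</summary>"
            f"<ol>{items}</ol>"
            f"</details>"
        )
    return "\n".join(parts)


plan = [
    Phase("Schema migration", "high", ["Back up the database", "Apply the migration"]),
    Phase("Update docs", "low", ["Revise the README"]),
]
print(render_plan(plan))
```

A stylesheet could then color the `risk-*` classes. The point is only that the reviewer can collapse what they don't need without any JavaScript.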
The Format Layer Principle: Each Layer Gets Its Own Format
The conclusion from this analysis is not "Markdown is dead." It's more nuanced: different layers of the AI workflow demand different formats, and the industry is converging on a clear pattern.
Input layer (Human → AI): Markdown remains optimal. System prompts, configuration files, and RAG pipelines all benefit from Markdown's token efficiency and structural clarity. Studies show RAG accuracy improves by up to 35% when ingesting Markdown over raw HTML.
Reasoning layer (AI → AI): Structured data formats — JSON, YAML — are most efficient. Agents don't need colors or layout when communicating with other agents. They need parseable, typed data.
Delivery layer (AI → Human): HTML wins. When the primary reader is a human who needs to review, understand, and act on complex output, visual hierarchy, navigation, and interactivity aren't luxuries — they're necessities.
The decision rule is simple: if the output's primary reader is an LLM, use Markdown, or structured data such as JSON when one agent is handing off to another. If the primary reader is a human who needs to review and act, use HTML.
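The three layers can be written down as a small routing function. This is only a sketch of the principle: the `Reader` labels and the `choose_format` name are invented here, and a real platform would likely weigh more than the reader type.

```python
from enum import Enum


class Reader(Enum):
    """Who consumes the output (labels invented for this sketch)."""
    LLM_CONTEXT = "llm_context"  # prompts, config files, RAG documents
    AGENT = "agent"              # agent-to-agent messages
    HUMAN_REVIEWER = "human"     # a person who must review and act


def choose_format(reader: Reader) -> str:
    """Map the primary reader to an output format, following the three layers."""
    if reader is Reader.LLM_CONTEXT:
        return "markdown"
    if reader is Reader.AGENT:
        return "json"
    return "html"


for reader in Reader:
    print(reader.name, "->", choose_format(reader))
```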
The Other Side: Costs and Risks of Rich Output
Intellectual honesty requires acknowledging the tradeoffs:
Token cost. Clean HTML consumes roughly 3× as many tokens as equivalent Markdown. HTML with embedded CSS and JavaScript can balloon to 8–10×. For high-throughput pipelines generating hundreds of outputs per hour, this cost is real.
Security. AI-generated HTML can contain JavaScript, opening the door to cross-site scripting and injection attacks. Google's Agent-to-UI (A2UI) protocol exists specifically because enterprise security teams cannot accept agents writing arbitrary HTML that executes in production environments. Sandboxed rendering is mandatory.
Accessibility. AI-generated HTML typically lacks ARIA attributes, descriptive alt text, and consistent tab order. Standard Markdown converters produce semantic headings and image alts by default. HTML requires explicit prompting for WCAG 2.2 AA compliance.
Version control. HTML diffs are noisy — full of closing tags and attribute changes that obscure the actual content change. For teams that rely on Git-based review workflows, this friction is real.
None of these are unsolvable. Sandboxed iframes address security. Accessibility constraints can be embedded in agent prompts. The token penalty matters less as context windows expand. But they're worth acknowledging because they define the engineering work required to make rich output production-ready.
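To put the token cost in concrete terms, here is a back-of-the-envelope calculation using the multipliers quoted above (3× for clean HTML, 8–10× with embedded CSS and JavaScript). The workload figures, 2,000-token reports at 300 reports per hour, are made up purely for illustration.

```python
def extra_tokens_per_hour(markdown_tokens: int, outputs_per_hour: int, multiplier: float) -> int:
    """Extra output tokens per hour when HTML replaces Markdown."""
    html_tokens = markdown_tokens * multiplier
    return round((html_tokens - markdown_tokens) * outputs_per_hour)


# Hypothetical workload: 2,000-token Markdown reports, 300 reports per hour.
scenarios = [
    ("clean HTML (3x)", 3.0),
    ("HTML + CSS/JS (8x)", 8.0),
    ("HTML + CSS/JS (10x)", 10.0),
]
for label, multiplier in scenarios:
    extra = extra_tokens_per_hour(2_000, 300, multiplier)
    print(f"{label}: {extra:,} extra tokens per hour")
```

Under these assumed numbers the overhead ranges from about 1.2 million to 5.4 million extra tokens per hour, which is why the choice matters more for batch pipelines than for a single plan a human will review.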
What This Means for AI Workspace Products
For teams building AI workspace products, the format question carries direct product implications:
The rendering layer is a competitive surface. The workspace that automatically translates agent reasoning into human-consumable rich output — without requiring users to write "please output as HTML" — delivers a materially better experience. The format translation should happen at the platform level, not in user prompts.
Security must be built in, not bolted on. Sandboxed HTML rendering inside the workspace environment, with CSP headers and script isolation, enables rich output without the security risks of raw HTML in production. This is infrastructure work, but it's infrastructure that directly improves the human-agent interaction.
Output should be a workflow starting point. The tables should be sortable. The plans should be annotatable. The code should be runnable. The recommendations should have one-click action buttons. When agent output becomes an interactive artifact rather than a static document, the workspace transforms from a place where you read AI results to a place where you act on them.
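One way the sandboxing described above might look in practice is sketched below: untrusted agent HTML is wrapped in an `<iframe>` with the standard `sandbox` and `srcdoc` attributes, plus a restrictive Content-Security-Policy `<meta>` tag. The `sandbox_embed` function name and the specific policy string are assumptions for this example, not a description of any particular product.

```python
from html import escape

# Assumed policy for this sketch: block scripts and network loads,
# allow only inline styles and data: images.
CSP = "default-src 'none'; style-src 'unsafe-inline'; img-src data:"


def sandbox_embed(agent_html: str) -> str:
    """Wrap untrusted agent HTML in a sandboxed iframe."""
    inner = (
        f'<meta http-equiv="Content-Security-Policy" content="{CSP}">'
        f"{agent_html}"
    )
    # A bare sandbox attribute applies every restriction, including
    # no script execution. srcdoc carries the document inline, and
    # escape() keeps the markup inside the attribute value.
    return f'<iframe sandbox srcdoc="{escape(inner, quote=True)}"></iframe>'


print(sandbox_embed("<h1>Review</h1><script>alert(1)</script>"))
```

This fully locked-down variant suits static output. Interactive output such as sortable tables needs scripts, which would mean something like `sandbox="allow-scripts"` without `allow-same-origin` and a correspondingly looser policy, and that tradeoff deserves its own security review.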
Who's Really Driving?
The Markdown-versus-HTML conversation is ultimately about something bigger than file formats. It's about the relationship between humans and AI agents in 2026.
As agents grow more capable — running for hours, producing thousands of lines, orchestrating multi-step workflows — the human's role shifts from doing the work to directing and reviewing the work. But effective oversight requires effective interfaces. A 200-line wall of Markdown text isn't oversight — it's the illusion of oversight.
The BCG study showed that when AI oversight becomes cognitively overwhelming, workers default to trusting the output without critical review. That's the worst outcome: humans nominally in the loop, but functionally rubber-stamping agent work they haven't actually processed.
Richer output formats don't solve every problem. But they address a critical gap: they give humans the visual and interactive tools to actually exercise the judgment that human-in-the-loop is supposed to provide. The format of AI output determines who's actually driving — the human reviewing the work, or the agent that produced it.
If your AI agent is producing plans nobody reads, the problem might not be the agent. It might be the way it delivers its work.
This article is part of the wukong.ai blog series exploring the design principles of AI-native workspaces.