How Distill cuts Claude Code context.
Distill handles large outputs before they reach the model: logs, diffs, source files, and tool chains. Three MCP tools reduce each type of context at the source.
Claude Code receives every tool result.
A build command can return thousands of tokens for a few useful error lines. Loading a whole file to inspect one function creates the same problem.
Those results become part of the session context. Every later read, response, and tool call has to work around that raw content.
Distill processes the result before Claude Code receives it. It compresses the output or selects only the requested structure, then sends a smaller result back to Claude Code.
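To make the idea concrete, here is a minimal sketch of compressing a build log before it reaches the model. It is not Distill's implementation: the `compress_log` function, the `IMPORTANT` pattern, and the omission marker text are all hypothetical names chosen for illustration. It assumes that lines mentioning errors or warnings are the ones worth keeping.

```python
import re

# Hypothetical pattern for lines worth keeping; not Distill's actual rules.
IMPORTANT = re.compile(r"error|warning|failed|exception", re.IGNORECASE)


def compress_log(text: str, context: int = 1) -> str:
    """Keep important lines plus `context` neighbors; summarize everything else."""
    lines = text.splitlines()

    # Collect the indexes of matching lines and their immediate neighbors.
    keep = set()
    for i, line in enumerate(lines):
        if IMPORTANT.search(line):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))

    # Emit kept lines, replacing each skipped run with a one-line summary.
    out = []
    skipped = 0
    for i, line in enumerate(lines):
        if i in keep:
            if skipped:
                out.append(f"[... {skipped} lines omitted ...]")
                skipped = 0
            out.append(line)
        else:
            skipped += 1
    if skipped:
        out.append(f"[... {skipped} lines omitted ...]")
    return "\n".join(out)
```

In this sketch, a log with one error line and its two neighbors shrinks to those three lines plus a short omission marker for each skipped run, so the useful lines survive while the noise is reduced to a count.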
Three tools. Three specific jobs.
Distill runs as a local open-source MCP server. Claude Code calls the right tool instead of loading the raw content directly.
- auto_optimize: Detects logs, diffs, stack traces, and code blocks, then applies the matching compression strategy. Observed reduction: 40% to 95%.
- smart_file_read: Reads code through the AST across 7 languages. Skeleton, extract, and search modes return only the requested structure or symbol.
- code_execute: Runs a chain of 5 to 10 steps (reads, searches, diffs, compression) inside a QuickJS sandbox, then returns one result.
No API key, account, or cloud service. Setup registers Distill with Claude Code and installs the local integrations.
What changes in a session.
Large outputs arrive compressed
Build logs, diffs, and stack traces take less context before Claude Code reads them.
Source reads become targeted
Claude Code can request a skeleton, function, or search result instead of loading the entire file.
Multi-step workflows return one result
Intermediate outputs stay inside the sandbox instead of accumulating after every tool call.
The integration stays verifiable
Distill includes a [DISTILL:COMPRESSED] marker, a doctor command, a PreCompact hook, a subagent, and slash commands.
Add Distill to Claude Code.
Run setup and restart Claude Code. All three tools are then available.