Introduction

If you run Claude Desktop, Claude Code, or any MCP-enabled client through the day, you are burning tokens you never asked for. Every MCP server call that returns a large payload (GitHub issues, Firecrawl scrapes, search results) dumps the entire response into context, whether your agent needed three lines of it or all three hundred thousand tokens.

jmunch-mcp is a transparent MCP proxy that sits in front of your existing MCP servers and fixes this without changing anything about how you work.

The Problem

Most MCP servers return everything and let the agent sort it out. In the repo's benchmarks, a scripted run against the GitHub MCP server (facebook/react issues, PRs, and commits) put roughly 380,000 tokens into context, and a Firecrawl run put roughly 260,000 there, whether the agent needed a summary or the full content. The agent reads the whole pile. You pay for the whole pile. And because this happens on every call, the waste compounds fast across a working session.

This is not a bug in any specific MCP server; it is just how the protocol works by default. The upstream returns what it has, and there is no layer in between to intercept and compress.

The Solution

jmunch-mcp is that layer. It wraps any existing MCP server, forwards every tool call to the upstream, and intercepts large responses before they hit your agent’s context. Instead of the full payload, the agent receives a handle — a lightweight reference it can query using a small set of universal verbs:

| Verb         | What it does                                |
|--------------|---------------------------------------------|
| peek         | Return a summary or top N rows              |
| slice        | Return a row range or a JSONPath expression |
| search       | Full-text search over the payload           |
| aggregate    | Group-by, count, sum over tabular data      |
| describe     | Schema and shape of the payload             |
| list_handles | Show all active handles in the session      |

The agent drills into exactly what it needs. Everything else stays out of context.
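
To make the handle idea concrete, here is a minimal sketch in Python. It is not jmunch-mcp's implementation: the function names, the `MAX_INLINE_CHARS` threshold, the handle format, and the assumption that a large payload is a list of rows are all hypothetical, chosen only to illustrate the pattern of storing a large response and answering small queries against it.

```python
import itertools
import json

# Hypothetical sketch: names, threshold, and handle format are illustrative,
# not jmunch-mcp's actual API.
MAX_INLINE_CHARS = 2_000
_handles = {}
_ids = itertools.count(1)


def intercept(payload):
    """Return small payloads unchanged; swap large row lists for a handle."""
    if len(json.dumps(payload)) <= MAX_INLINE_CHARS:
        return payload
    handle = f"h{next(_ids)}"
    _handles[handle] = payload  # assumed to be a list of rows
    return {"handle": handle, "rows": len(payload)}


def peek(handle, n=3):
    """Top N rows of the stored payload."""
    return _handles[handle][:n]


def slice_rows(handle, start, stop):
    """A row range of the stored payload."""
    return _handles[handle][start:stop]


def search(handle, term):
    """Case-insensitive substring search over each serialized row."""
    term = term.lower()
    return [row for row in _handles[handle] if term in json.dumps(row).lower()]


def list_handles():
    """All handles created so far in this process."""
    return sorted(_handles)


# Made-up data standing in for a large upstream response.
issues = [
    {"number": i, "title": f"Issue {i}", "state": "open" if i % 2 else "closed"}
    for i in range(1, 501)
]
ref = intercept(issues)    # the agent sees only this small reference
first_two = peek(ref["handle"], 2)
hits = search(ref["handle"], "issue 499")
```

In this sketch the agent's context holds the two-field reference plus whatever rows it explicitly asks for, while the 500 rows stay in the store.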

How It Works

Install

pip install jmunch-mcp

Or from source:

git clone https://github.com/jgravelle/jmunch-mcp
cd jmunch-mcp
pip install -e .

Init — one-time setup

jmunch-mcp init

init scans three sources automatically — your MCP client configs (Claude Desktop, Claude Code, Cursor, Windsurf, Continue), running MCP processes on your machine, and a built-in catalog of popular upstreams (GitHub, Firecrawl, filesystem, fetch, Brave Search, Slack). It renders a checklist. Tick the upstreams you want wrapped. It rewrites your client config and writes one .toml per selection into ./configs/.

Non-interactive flags for automation:

| Flag          | What it does                                                  |
|---------------|---------------------------------------------------------------|
| --yes         | Auto-select everything already registered in a client config  |
| --dry-run     | Show what would change, write nothing                         |
| --overwrite   | Overwrite existing .toml configs                              |
| --out <dir>   | Write configs to a custom directory                           |
| --no-running  | Skip scanning running processes                               |
| --no-catalog  | Skip the built-in upstream catalog                            |

Content-aware routing

Under the hood, jmunch-mcp routes payloads based on their shape:

  • Tabular content (GitHub issues, PRs, commits) → SQLite backend → answers peek, slice, aggregate
  • JSON content (Firecrawl scrapes, sitemaps) → JSON-tree backend → answers peek, slice via JSONPath, search

Your agent never needs to know which backend handled it — the verbs are the same regardless.
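
The routing rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the shape test (a list of flat dicts sharing the same keys counts as tabular), the table name `payload`, and every function name are hypothetical, and the key-path walker is a simplified stand-in for real JSONPath.

```python
import sqlite3


def detect_shape(payload):
    """Classify as 'tabular' (list of flat dicts with identical keys) or 'json'."""
    if isinstance(payload, list) and payload and all(isinstance(r, dict) for r in payload):
        keys = set(payload[0])
        flat = all(
            set(r) == keys and not any(isinstance(v, (dict, list)) for v in r.values())
            for r in payload
        )
        if flat:
            return "tabular"
    return "json"


def load_table(rows):
    """Load flat rows into an in-memory SQLite table named 'payload'."""
    cols = list(rows[0])
    conn = sqlite3.connect(":memory:")
    col_sql = ", ".join(f'"{c}"' for c in cols)
    conn.execute(f"CREATE TABLE payload ({col_sql})")
    marks = ", ".join("?" for _ in cols)
    conn.executemany(
        f"INSERT INTO payload VALUES ({marks})",
        [tuple(r[c] for c in cols) for r in rows],
    )
    return conn


def aggregate_count(conn, column):
    """Group-by count over one column (column must be a trusted name)."""
    sql = f'SELECT "{column}", COUNT(*) FROM payload GROUP BY "{column}" ORDER BY "{column}"'
    return conn.execute(sql).fetchall()


def json_path_get(tree, path):
    """Walk a list of keys and indexes; a simplified stand-in for JSONPath."""
    node = tree
    for step in path:
        node = node[step]
    return node


# Made-up payloads: one issue-like table, one scrape-like tree.
issues = [
    {"number": 1, "state": "open"},
    {"number": 2, "state": "closed"},
    {"number": 3, "state": "open"},
]
page = {"title": "Example", "links": [{"href": "/a"}, {"href": "/b"}]}

counts = aggregate_count(load_table(issues), "state")
second_link = json_path_get(page, ["links", 1, "href"])
```

Pushing tabular data into SQLite means group-by and range queries come for free, while nested documents keep their tree structure and are addressed by path instead.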

Manual config

If you prefer wiring things up yourself instead of using init:

jmunch-mcp --config examples/config.toml

Point your MCP client at jmunch-mcp --config <path> instead of the upstream server directly. Add --report to print a token summary on session shutdown.
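
As a rough illustration of that repointing, the sketch below edits a client config held as a Python dict. It assumes the common `mcpServers` layout with `command` and `args` per server. The helper name `wrap_server`, the example upstream entry, the `configs/github.toml` path, and the choice to leave other fields such as `env` untouched are all assumptions; the entry that `init` actually writes may differ.

```python
import copy
import json


def wrap_server(client_config, name, toml_path):
    """Return a copy of the config with one server launched via the proxy.

    Hypothetical helper: only 'command' and 'args' are replaced, and the
    upstream's own launch command is assumed to live in the .toml file.
    """
    wrapped = copy.deepcopy(client_config)
    servers = wrapped["mcpServers"]
    if name not in servers:
        raise KeyError(f"no MCP server named {name!r} in client config")
    servers[name]["command"] = "jmunch-mcp"
    servers[name]["args"] = ["--config", toml_path]
    return wrapped


# Example client config; the upstream entry is illustrative.
config = {
    "mcpServers": {
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"EXAMPLE_VAR": "value"},
        }
    }
}

proxied = wrap_server(config, "github", "configs/github.toml")
print(json.dumps(proxied, indent=2))
```

Working on a deep copy keeps the original config intact, which makes it easy to diff the two before writing anything back to disk.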

Auto-launch — nothing to run per session

Once init has rewritten your client config, jmunch-mcp spawns automatically as a subprocess every time your MCP client starts. There is nothing to launch manually before a session.

[i] INFO: On Windows with MSIX-packaged Claude Desktop, the app may read config from %LOCALAPPDATA%\Packages\AnthropicPBC.Claude_*\LocalCache\Roaming\Claude\ instead of the standard %APPDATA%\Claude\. If MCP tools do not appear after restart, check which path actually contains your mcpServers config.

Dashboard

jmunch-mcp ships a local read-only web UI that shows your cumulative token savings:

jmunch-mcp dashboard              # serves at http://127.0.0.1:7878
jmunch-mcp dashboard --open       # also launches your browser

Available flags:

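| Flag        | Default   | What it does                      |
|-------------|-----------|-----------------------------------|
| --port      | 7878      | Change the listen port            |
| --host      | 127.0.0.1 | Change the bind address           |
| --db <path> | (auto)    | Point at a non-default metrics DB |
| --open      |           | Auto-launch browser               |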

The dashboard shows three views — cumulative totals, per-upstream breakdowns, and a time series of every forwarded call. No cloud, no telemetry, everything local. Rows with zero savings are hidden. Metrics populate once you have made at least one proxied call — run a wrapped upstream first, then open the dashboard.

Results

Benchmarks from the repo, measured end-to-end with a fixed script of tool calls run direct vs proxied:

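| Suite                                       | Direct      | via jmunch-mcp | Saved           |
|---------------------------------------------|-------------|----------------|-----------------|
| GitHub: facebook/react issues/PRs/commits   | 379,878 tok | 44,328 tok     | 335,550 (88.3%) |
| Firecrawl: Wikipedia + sitemap + search     | 259,574 tok | 2,928 tok      | 256,646 (98.9%) |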

Wall-clock time also dropped — 19% faster on GitHub, 44% faster on Firecrawl — because the agent never pages through data it did not need.
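
The token savings reported above follow directly from the direct and proxied counts. The short check below recomputes them; the `savings` helper is just an illustrative name, and the inputs are the benchmark figures quoted in this section.

```python
def savings(direct_tokens, proxied_tokens):
    """Return (tokens saved, percent saved rounded to one decimal)."""
    saved = direct_tokens - proxied_tokens
    return saved, round(100 * saved / direct_tokens, 1)


# Benchmark figures from the repo, as quoted in this section.
suites = {
    "GitHub": (379_878, 44_328),
    "Firecrawl": (259_574, 2_928),
}

for name, (direct, proxied) in suites.items():
    saved, pct = savings(direct, proxied)
    print(f"{name}: saved {saved:,} tokens ({pct}%)")

# GitHub: saved 335,550 tokens (88.3%)
# Firecrawl: saved 256,646 tokens (98.9%)
```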

For anyone running MCP-heavy workflows all day, the savings compound across every session. Wrap once, save forever.

Repo: https://github.com/jgravelle/jmunch-mcp — MIT licensed.

[i] Tested on: Windows 11, Claude Desktop (MSIX), Python 3.12, jmunch-mcp v0.0.3

