Same model · two ways of seeing your code

Same brain.
A fraction of the context.

We held the model constant — Claude Opus 4.8 — and gave it your codebase two ways: raw tools, or Mesh. Twelve tests. Here's what changes.

Opus 4.8 + raw tools vs Opus 4.8 + Mesh
01

9 of every 10 tools never get sent

Mesh hands the model only the handful of tools a task actually needs — not all 122 — so every turn starts lighter.

−81% prompt size
122 tools available → only ~8 sent per turn
02

Your whole repo, pocket-sized

Instead of raw source, Mesh keeps a compact capsule of each file.

20.4× smaller
756K raw tokens → 37K across 252 files
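
To see where the headline ratio comes from, here is a minimal sketch that reworks the arithmetic from the figures quoted above. The inputs are the rounded numbers as published (756K, 37K, 252), so the per-file averages are approximate; the variable names are ours, not part of Mesh.

```typescript
// Figures quoted above, rounded as published.
const rawTokens = 756_000;
const capsuleTokens = 37_000;
const fileCount = 252;

// Compression ratio: how many times smaller the capsules are than raw source.
const ratio = rawTokens / capsuleTokens;

// Average size per file, before and after.
const rawPerFile = rawTokens / fileCount;
const capsulePerFile = capsuleTokens / fileCount;

console.log(`${ratio.toFixed(1)}x smaller`);                     // 20.4x smaller
console.log(`${Math.round(rawPerFile)} raw tokens per file`);     // 3000 raw tokens per file
console.log(`${Math.round(capsulePerFile)} capsule tokens per file`); // 147 capsule tokens per file
```

On these numbers, an average file of roughly 3,000 tokens is represented by a capsule of roughly 150.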
03

Answers without reading the whole file

A raw agent pours in whole files. Mesh trickles in the one snippet that answers the question.

−92% context per question
raw pours 124.6K tokens · Mesh trickles 9.9K
04

Finds the right code, first try

Across 50 questions, the exact right function is the very first result 94% of the time — always within the top three.

94% right on the first hit
05

A needle in a haystack of code

Bury one fact in a wall of code, and Mesh pulls the exact needle out every time.

100% found · reads 1,609× less
50 facts buried in code · every one recovered
06

It won't make things up

When the supporting fact isn't found, raw keyword search guesses. Mesh surfaces the real source, so answers stay grounded.

~0% made-up answers · raw 38%
guessed: “…persists sessions to Redis” · 38%
grounded: “…stored in vectors.bin” · ~0%
07

Knows which file to fix

From a plain-English bug report, Mesh scans the tree and locks onto the file to change. Keyword search misses far more often.

94% right file · keyword 38%
src/
  payments/
    webhook.ts
    invoices.ts  ← fix here
    refunds.ts
  auth/session.ts
08

Gets better the bigger your repo

A raw agent reads more as the codebase grows. Mesh fetches the same small snippet at any size — so the savings compound.

195× advantage on a large repo
Chart: context advantage vs repo size (small, medium, large, huge)
09

Long sessions stay light

Over a 20-turn session a raw agent's context piles up turn after turn. Mesh dedupes and trims — so it stays flat while raw climbs.

14× lighter by turn 20
raw · 234K, climbing · Mesh · 16K, flat
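
The deduping idea can be sketched in a few lines. This is a hypothetical illustration, not Mesh's implementation: the `Snippet` type, the snippet ids, and the token counts are all made up, and it models only deduplication, not trimming.

```typescript
// Hypothetical sketch: illustrates deduplication only, not Mesh's actual code.
type Snippet = { id: string; tokens: number };

// Raw agent: every turn appends whatever it read, duplicates included.
function rawContextSize(turns: Snippet[][]): number {
  let total = 0;
  for (const turn of turns) {
    for (const s of turn) total += s.tokens;
  }
  return total;
}

// Deduplicating agent: a snippet already in context is not added again.
function dedupedContextSize(turns: Snippet[][]): number {
  const seen = new Set<string>();
  let total = 0;
  for (const turn of turns) {
    for (const s of turn) {
      if (seen.has(s.id)) continue;
      seen.add(s.id);
      total += s.tokens;
    }
  }
  return total;
}

// Illustrative session: 20 turns that each re-read the same two snippets.
const session: Snippet[][] = Array.from({ length: 20 }, () => [
  { id: "checkout.ts#rateCheck", tokens: 400 },
  { id: "config.ts#RATE_LIMIT", tokens: 100 },
]);

console.log(rawContextSize(session));     // 10000
console.log(dedupedContextSize(session)); // 500
```

In this toy session the raw total grows with every turn while the deduplicated total stops growing after turn one. Real sessions touch new code as they go, so the gap will be smaller than this extreme case.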
10

Follows the trail across files

When the value you need lives in another file, Mesh fetches that definition.

100% vs keyword 33%
checkout.ts
  import { RATE_LIMIT }
  if (count > ?)
    throw RateError()
config.ts
  export const RATE_LIMIT = 600
Mesh carries the definition across files
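
As a rough picture of what following the trail means, here is a toy resolver over the two files shown above. It is a sketch under stated assumptions, not Mesh's method: the `resolveConstant` helper, the `./config` import path, and the regex-based lookup are all hypothetical, and the name is assumed to be a plain identifier.

```typescript
// Hypothetical sketch: a toy resolver, not Mesh's actual implementation.
// File contents mirror the example above; the "./config" import path is assumed.
const files: Record<string, string> = {
  "checkout.ts":
    'import { RATE_LIMIT } from "./config";\n' +
    "if (count > RATE_LIMIT) throw new RateError();",
  "config.ts": "export const RATE_LIMIT = 600;",
};

// Find `export const <name> = <value>` in any file and report where it lives.
// Assumes `name` is a plain identifier with no regex metacharacters.
function resolveConstant(name: string): { file: string; value: string } | undefined {
  const pattern = new RegExp(`export\\s+const\\s+${name}\\s*=\\s*([^;\\n]+)`);
  for (const file of Object.keys(files)) {
    const match = pattern.exec(files[file]);
    if (match) return { file, value: match[1].trim() };
  }
  return undefined;
}

const hit = resolveConstant("RATE_LIMIT");
console.log(hit); // { file: "config.ts", value: "600" }
```

A keyword search for `RATE_LIMIT` would land on both files; the point of resolving the definition is that only the line that sets the value comes back.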
11

Works even when you mistype

Typos, abbreviations, different words — Mesh matches on meaning and keeps finding it. Keyword search loses every misspelt term.

Mesh: steady · keyword: 32% → 22%
Query: “build the file cache” · ✓ Mesh matched
Mesh 96% · keyword 28%
12

Search by what code does

Describe it in your own words — sharing not a single term with the code — and Mesh still threads straight to the function.

85% found · keyword 18%
“remove repeated entries, keep the order” → dedupeByOrder() · zero shared words

One model. Twelve wins.

Same brain, a fraction of the context — and it finds code where a keyword scan can't. That's what Mesh changes.

Read the docs
Each test holds the model constant — Claude Opus 4.8 — and compares raw POSIX tools (read / grep / keyword search) against the same model with Mesh.