Performance Data

Capsule compression vs. raw context

We benchmarked Mesh against standard raw context windows across 14+ frontier models, including Claude 4.6 Opus/Sonnet and ChatGPT 4.5, as well as local models such as Llama and Gemma. The results are decisive: 5x fewer tokens, 100% recall.
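Conceptually, the needle-recall metric above reduces to a simple containment check: plant known marker strings ("needles") in the context and count how many the model surfaces in its answer. This is a minimal sketch of that scoring step only; the function and marker names are illustrative, not part of the Mesh harness.

```python
def needle_recall(model_answer: str, needles: list[str]) -> float:
    """Fraction of planted needles the model reproduced in its answer."""
    if not needles:
        return 0.0
    found = sum(1 for needle in needles if needle in model_answer)
    return found / len(needles)


# Example: 5 planted needles, all 5 surfaced -> recall of 1.0 (100%).
answer = "ids: alpha-1 beta-2 gamma-3 delta-4 epsilon-5"
needles = ["alpha-1", "beta-2", "gamma-3", "delta-4", "epsilon-5"]
print(needle_recall(answer, needles))  # 1.0
```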

NIAH Needle Recall: 100% (Mesh, Ultra Capsules)
Budget Saturation: 4.5k vs 24k tokens for full coverage
Multi-Hop Chain: 12/12 completeness vs 3/12 raw

LLM NIAH — Relevant Context Recall

Highlights from Claude 4.6 Opus & ChatGPT 4.5. Can the model see the files it needs without blowing the budget?

Mesh Workflow: 100% pass (5/5)
Raw Source: 40% pass (2/5)

Budget Saturation (tokens required to cover 200 files)

Tested across 14+ distinct model architectures. The smaller footprint leaves room in the context window for the actual task prompt.

Mesh: 4,500 tokens
Raw: 24,000 tokens

LLM Hallucination Rate (NOTFOUND Accuracy)

How often does the model invent files or hallucinate variables when the answer is not in the workspace? (Tested across 4 models)

Mesh Workflow: 0% (zero hallucinations)
Raw Source: 68% hallucination rate
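A NOTFOUND-accuracy test like the one above can be graded mechanically: the model is expected to answer the literal token "NOTFOUND" whenever the target is absent from the workspace, and any other answer in that case counts as an invention. This sketch shows one plausible grader under that assumption; the names are illustrative and not the actual Mesh evaluation code.

```python
def grade_notfound(model_answer: str, answer_in_workspace: bool) -> bool:
    """True if the model abstained exactly when it should have."""
    abstained = model_answer.strip().upper() == "NOTFOUND"
    return abstained == (not answer_in_workspace)


def hallucination_rate(results: list[tuple[str, bool]]) -> float:
    """Share of absent-answer cases where the model invented something.

    Each result is (model_answer, answer_in_workspace); only cases where
    the answer was genuinely absent count toward the rate.
    """
    absent = [answer for answer, present in results if not present]
    if not absent:
        return 0.0
    invented = sum(1 for answer in absent
                   if not grade_notfound(answer, False))
    return invented / len(absent)
```

For example, a run with answers `[("NOTFOUND", False), ("utils.py defines parse_cfg", False)]` yields a 50% hallucination rate: the model correctly abstained once and invented a file once.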

Register Your Interest

Get launch updates and docs drops first.