Claude Context: Your Entire Codebase as Context
The biggest bottleneck in AI-assisted coding isn't model intelligence — it's context. Your Claude Code session starts blank every time. You paste files, point at directories, beg the agent to understand your codebase. It forgets half of it by turn three. Claude Context fixes this permanently.
Built by Zilliz (the team behind Milvus — the most widely deployed open-source vector database), Claude Context is an MCP plugin that indexes your entire codebase into a vector store and gives any AI coding agent semantic search across all of it. Not grep. Not file globbing. Understanding your code.
The Problem It Solves
Consider a typical workflow with Claude Code on a mid-size project:
- "Read src/auth.ts" — agent reads one file
- "Now read lib/database.ts" — agent reads another file
- "Check the middleware" — agent reads three more files
- Half your context window is gone before you've even asked your actual question
- Agent gives an answer that misses the edge case in src/utils/validation.ts because you never asked it to read that file
This is the discovery problem. On large codebases (100K+ lines), you can't manually tell the agent what to read. You don't know what's relevant yourself. The agent needs to find the relevant code — and that's exactly what Claude Context does.
Instead of loading entire directories (expensive, wasteful, context-heavy), Claude Context stores your codebase in a vector database and only retrieves the code that actually matters for your query.
How It Works: Hybrid Search
Claude Context doesn't rely on a single search strategy. It uses hybrid search — combining two complementary approaches:
BM25 (Keyword Search)
The classic full-text search algorithm. Fast, exact, great for finding specific function names, variable references, and string literals. If you search for "handleJWT," BM25 will find every file that literally contains those characters.
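To make the mechanics concrete, here is a toy BM25 scorer (an illustrative sketch, not Claude Context's actual implementation): exact term matches, weighted by how rare the term is across documents and normalized by document length.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document for the query with the classic BM25 formula."""
    toks = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    scores = []
    for doc_toks in toks:
        tf = Counter(doc_toks)
        score = 0.0
        for term in tokenize(query):
            # Document frequency: in how many docs does this term appear?
            df = sum(1 for t in toks if term in t)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            freq = tf[term]
            denom = freq + k1 * (1 - b + b * len(doc_toks) / avg_len)
            score += idf * freq * (k1 + 1) / denom
        scores.append(score)
    return scores

docs = [
    "function handleJWT(token) { return verify(token) }",
    "function renderChart(data) { return draw(data) }",
]
scores = bm25_scores("handleJWT", docs)
best = scores.index(max(scores))  # the file that literally contains "handleJWT"
```

The point of the sketch: a document with zero literal matches scores exactly zero, which is precisely why BM25 alone cannot find semantically related code.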
Dense Vector Search (Semantic Search)
Embeds your code chunks into high-dimensional vectors using an embedding model (OpenAI, VoyageAI, Ollama, or Gemini). When you search "find functions that handle user authentication," it understands the meaning — and returns code that does auth, even if it never uses the word "authentication."
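The retrieval step itself is just nearest-neighbor search over those vectors. A minimal sketch with hand-made toy vectors (in the real system the embeddings come from the provider and the search runs inside Milvus):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend embeddings: nearby meanings -> nearby vectors.
chunks = {
    "verifyAccessToken()": [0.9, 0.1, 0.0],  # auth-related code
    "renderChart()":       [0.0, 0.2, 0.9],  # unrelated UI code
}
query_vec = [0.8, 0.2, 0.1]  # "functions that handle user authentication"

ranked = sorted(chunks, key=lambda c: cosine(query_vec, chunks[c]), reverse=True)
# verifyAccessToken() ranks first even though it never says "authentication".
```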
Why Hybrid Matters
BM25 alone misses semantic matches. Vector search alone misses exact keyword matches. Together, they cover both cases. Zilliz's evaluation shows the hybrid approach matches the retrieval quality of loading everything while using ~40% fewer tokens.
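One standard way to merge the two rankings is Reciprocal Rank Fusion (RRF). This is a sketch of the general technique; the source doesn't spell out Claude Context's exact fusion formula, and the file names below are made up:

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists: each list contributes 1/(k + rank + 1) per doc."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking   = ["auth.ts", "jwt.ts", "db.ts"]       # exact keyword hits
vector_ranking = ["session.ts", "auth.ts", "jwt.ts"]  # semantic hits
fused = rrf([bm25_ranking, vector_ranking])
# "auth.ts" wins: it appears near the top of BOTH rankings.
```

A document that only one strategy finds still makes the fused list, so neither failure mode (no keyword overlap, no semantic overlap) drops relevant code entirely.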
The Architecture
Claude Context is a monorepo with three core packages:
| Package | Purpose |
|---|---|
| @zilliz/claude-context-core | Indexing engine — embeddings, chunking, vector DB integration |
| @zilliz/claude-context-mcp | MCP server — connects to Claude Code, Cursor, and 15+ clients |
| VS Code Extension | Native IDE integration with GUI for search and navigation |
The key technical decisions that make this work:
AST-Based Code Chunking
Most naive approaches split code by line count or character count. Claude Context parses your code into an Abstract Syntax Tree first, then chunks along logical boundaries — functions, classes, methods. This means each chunk is a coherent unit of code, not a random slice that cuts a function in half.
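You can see the idea using Python's own ast module as a stand-in for Claude Context's parser (which handles many languages): split a source file along top-level function and class boundaries instead of arbitrary line counts.

```python
import ast

def ast_chunks(source: str):
    """Return each top-level function/class in `source` as one chunk."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(ast.get_source_segment(source, node))
    return chunks

code = """
def login(user):
    return check_password(user)

class Session:
    def refresh(self):
        pass
"""
chunks = ast_chunks(code)
# Each chunk is a whole function or class, never a slice that cuts one in half.
```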
Merkle Tree Incremental Indexing
Re-indexing a million-line codebase from scratch every time you change a file would be insane. Claude Context uses Merkle trees to track file states — when you re-index, it only processes changed files. If you touch one file in a 500-file project, only that file gets re-embedded and updated.
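The mechanism is easy to sketch (a simplified sketch of the idea, not Claude Context's implementation): hash every file, combine the hashes into a root, and on re-index compare per-file hashes to find exactly what changed.

```python
import hashlib

def snapshot(files: dict) -> dict:
    """files: {path: content}. Returns {path: sha256(content)}."""
    return {p: hashlib.sha256(c.encode()).hexdigest() for p, c in files.items()}

def root_hash(snap: dict) -> str:
    """Combine sorted (path, hash) pairs into a single root hash."""
    joined = "".join(f"{p}:{h}" for p, h in sorted(snap.items()))
    return hashlib.sha256(joined.encode()).hexdigest()

def changed_files(old: dict, new: dict):
    """Paths whose hash is new or different — the only files to re-embed."""
    return sorted(p for p in new if old.get(p) != new[p])

before = snapshot({"a.ts": "export const x = 1", "b.ts": "export const y = 2"})
after  = snapshot({"a.ts": "export const x = 1", "b.ts": "export const y = 3"})

to_reindex = changed_files(before, after)  # only b.ts changed
```

Comparing root hashes answers "did anything change?" in one check; walking down to per-file hashes answers "what exactly?" without re-reading unchanged content.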
Zilliz Cloud Vector Database
The vector store is Milvus (via Zilliz Cloud). This isn't a toy SQLite-backed vector search — it's the same engine that powers production search at scale. Free tier available, which is more than enough for most projects. The alternative is self-hosted Milvus if you need full control.
Setup: 60 Seconds to Running
The setup is genuinely simple. Three prerequisites, one command:
Prerequisites
- Node.js 20-23 (not 24 — they're explicit about this)
- OpenAI API key — for the embedding model (text-embedding-3-small by default)
- Zilliz Cloud account — free tier, takes 30 seconds to set up
Install
```bash
claude mcp add claude-context \
  -e OPENAI_API_KEY=sk-your-openai-api-key \
  -e MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint \
  -e MILVUS_TOKEN=your-zilliz-cloud-api-key \
  -- npx @zilliz/claude-context-mcp@latest
```
Use
```bash
# Index your codebase
claude
> Index this codebase

# Check progress
> Check the indexing status

# Search with natural language
> Find functions that handle user authentication
> Where is the database connection pool configured?
> Show me the error handling middleware
```
That's it. No build steps, no config files to hand-edit (unless you want to), no database servers to run locally. The free Zilliz Cloud tier handles everything.
The 15+ Integrations
Claude Context isn't locked to Claude Code. It's an MCP server — which means it works with anything that speaks MCP. The officially documented integrations:
| Client | Type | Config Format |
|---|---|---|
| Claude Code | CLI Agent | Claude MCP CLI |
| OpenAI Codex CLI | CLI Agent | TOML |
| Gemini CLI | CLI Agent | JSON |
| Qwen Code | CLI Agent | JSON |
| Cursor | IDE | MCP JSON |
| Windsurf | IDE | MCP JSON |
| VS Code | IDE | Extension + MCP |
| Cline | IDE Extension | MCP JSON |
| Augment | IDE Extension | UI + JSON |
| Roo Code | IDE Extension | MCP JSON |
| Zencoder | IDE Extension | UI Config |
| Cherry Studio | Desktop App | GUI |
| Void | Desktop App | Settings |
| Claude Desktop | Desktop App | MCP JSON |
| LangChain/LangGraph | Framework | Python SDK |
Every single one uses the same underlying MCP server. The config format changes per client, but the server is always npx @zilliz/claude-context-mcp@latest.
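For the clients listed as "MCP JSON," the entry typically follows the common mcpServers convention. A sketch (the file location and exact key names vary per client — check each client's own docs):

```json
{
  "mcpServers": {
    "claude-context": {
      "command": "npx",
      "args": ["-y", "@zilliz/claude-context-mcp@latest"],
      "env": {
        "OPENAI_API_KEY": "sk-your-openai-api-key",
        "MILVUS_ADDRESS": "your-zilliz-cloud-public-endpoint",
        "MILVUS_TOKEN": "your-zilliz-cloud-api-key"
      }
    }
  }
}
```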
The Four Tools
Claude Context exposes exactly four MCP tools — no bloat, no feature creep:
| Tool | What It Does |
|---|---|
index_codebase | Index a directory for hybrid search (BM25 + dense vector). Incremental by default — only re-indexes changed files. |
search_code | Natural language search against the indexed codebase. Returns ranked results with file paths, line ranges, and relevance scores. |
clear_index | Wipe the index for a specific codebase. Useful when you've restructured heavily. |
get_indexing_status | Check progress — shows percentage for active indexing, completion status for indexed codebases. |
Four tools. That's the entire API surface. Compare this to the multi-round file-reading dance you'd normally do. The simplicity is the point.
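Under the hood, a client invokes these tools via MCP's standard JSON-RPC tools/call request. The envelope below follows the MCP spec; the argument names ("path", "query") are illustrative assumptions, not Claude Context's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_code",
    "arguments": {
      "path": "/home/you/project",
      "query": "functions that handle user authentication"
    }
  }
}
```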
The Numbers: 40% Token Reduction
Zilliz ran a controlled evaluation comparing three approaches:
- No context — agent works blind (baseline)
- Full loading — dump everything into context (expensive, but high quality)
- Claude Context — semantic retrieval (target: match full loading quality at lower cost)
The result: Claude Context achieved equivalent retrieval quality to full loading while using ~40% fewer tokens. That's not a marginal improvement — that's cutting your API bill nearly in half while getting the same (or better) answers.
Why better? Because under the constraint of a limited context window, loading everything means relevant code gets pushed out by irrelevant code. Claude Context only loads what matters, so the agent has more room for your actual conversation and the code it needs.
What Makes This Different
The code search space is not empty. Here's how Claude Context compares:
vs. Grep / ripgrep
Grep is exact keyword matching. It's fast, free, and built into everything. But it can't do semantic search. "Find the function that validates JWT tokens" — grep looks for those exact words. Claude Context understands what validation means and finds the code even if it says verifyAccessToken().
vs. grep-like MCP tools (e.g., search_files)
Most Claude Code setups include a basic file search tool. These are regex-based — useful for finding strings, useless for understanding intent. Claude Context adds the semantic layer on top.
vs. acontext / codebase-context
Several community tools index codebases for Claude. Most use simple embeddings without hybrid search, naive line-based chunking, or require running a local vector database. Claude Context's advantages: AST-based chunking, BM25 + vector hybrid, Merkle incremental indexing, and a managed vector DB (no local infra).
vs. loading everything with /compact
Some workflows involve loading the entire codebase and relying on Claude's context management. This works for small projects (< 50 files). Beyond that, you're burning tokens on code the agent will never reference, and pushing relevant code out of the window.
vs. manual file discovery
The default Claude Code experience: you tell it what to read, it reads it, it forgets half of it. Works for small tasks, falls apart on cross-cutting changes that touch 10+ files across different directories.
Supported Languages
Claude Context's AST splitter supports 16 languages:
| Category | Languages |
|---|---|
| Web | TypeScript, JavaScript, PHP, Ruby |
| Systems | Rust, C++, C, Go |
| JVM | Java, Kotlin, Scala |
| Data | Python |
| Docs | Markdown |
| Mobile | Swift, Kotlin |
For unsupported languages, it falls back to LangChain's character-based splitter — not as smart as AST chunking, but it works.
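The fallback idea is simple: fixed-size character windows with overlap, so context isn't lost at chunk edges. A minimal sketch in that spirit (not LangChain's actual implementation):

```python
def char_chunks(text: str, size: int = 40, overlap: int = 10):
    """Split text into `size`-char windows, each overlapping the last by `overlap`."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

code = "x" * 100
chunks = char_chunks(code)
# Three 40-char windows; each shares its first 10 chars with the previous one.
```

Overlap is the key trade-off: it costs duplicated tokens but keeps a statement that straddles a boundary visible in at least one chunk — the problem AST chunking avoids entirely.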
Embedding Provider Flexibility
You're not locked into OpenAI for embeddings. Claude Context supports four providers:
| Provider | Model | Cost | Notes |
|---|---|---|---|
| OpenAI | text-embedding-3-small / large | Pay per token | Default. Best quality-to-speed ratio. |
| VoyageAI | voyage-code-3 | Pay per token | Specifically trained for code. Excellent results. |
| Gemini | text-embedding-004 | Pay per token | Google's embedding model. Good multilingual support. |
| Ollama | nomic-embed-text (local) | Free | Run locally. No API key needed. Fully private. |
The Ollama option is notable — you can run the entire pipeline locally with zero API costs. Index locally, embed locally, search locally. The only external dependency is the vector database (Zilliz Cloud free tier or self-hosted Milvus).
⚡ The Verdict
Claude Context is the most practical solution to the codebase context problem we've seen. It doesn't try to be clever — it takes a well-understood approach (hybrid vector search) and applies it correctly to a real workflow pain point.
The 40% token reduction claim is backed by evaluation, not marketing. The AST-based chunking is the right approach (not line-count hackery). The Merkle tree incremental indexing means it doesn't punish you for re-indexing after small changes. And the free Zilliz Cloud tier removes the infrastructure barrier entirely.
9.3k stars, 713 forks, 174 commits, active development by a team that builds vector databases for a living. MIT licensed. One claude mcp add command and you're live. If your codebase doesn't fit in context, install this.
✅ Pros
- Hybrid search (BM25 + vector) — best of both worlds
- AST-based chunking — logical code units, not random slices
- Merkle incremental indexing — only re-indexes changes
- 15+ IDE/agent integrations via MCP
- Free tier (Zilliz Cloud + Ollama embeddings)
- Built by the Milvus team — they know vector search
- 40% token reduction with equivalent quality
- MIT licensed, 9.3k stars
- Four embedding providers including local Ollama
- VS Code native extension available
⚠️ Cons
- Requires Node.js 20-23 (no 24 support yet)
- Needs external vector DB (Zilliz Cloud or Milvus)
- OpenAI API key required for default embeddings
- No self-contained local mode without Milvus
- Indexing large codebases takes time on first run
- AST splitter limited to 16 languages
- Core team is Zilliz (vendor lock-in risk for vector DB)
Get Started
```bash
# 1. Get free Zilliz Cloud account → zilliz.com/cloud
# 2. Get embedding API key (OpenAI, VoyageAI, Gemini, or Ollama)

# 3. Add MCP server:
claude mcp add claude-context \
  -e OPENAI_API_KEY=sk-your-key \
  -e MILVUS_ADDRESS=your-endpoint \
  -e MILVUS_TOKEN=your-token \
  -- npx @zilliz/claude-context-mcp@latest

# 4. Open Claude Code, index, search:
claude
> Index this codebase
> Find functions that handle user authentication
```
Repository: github.com/zilliztech/claude-context
VS Code Extension: Semantic Code Search on Marketplace
Docs: DeepWiki AI Documentation
Discord: Zilliz Community