Claude Context: Your Entire Codebase as Context
The biggest bottleneck in AI-assisted coding isn't model intelligence — it's context. Your Claude Code session starts blank every time. You paste files, point at directories, beg the agent to understand your codebase. It forgets half of it by turn three. Claude Context fixes this permanently.
Built by Zilliz (the team behind Milvus — the most widely deployed open-source vector database), Claude Context is an MCP plugin that indexes your entire codebase into a vector store and gives any AI coding agent semantic search across all of it. Not grep. Not file globbing. Understanding your code.
The Problem It Solves
Consider a typical workflow with Claude Code on a mid-size project:
- "Read src/auth.ts" — agent reads one file
- "Now read lib/database.ts" — agent reads another file
- "Check the middleware" — agent reads three more files
- Half your context window is gone before you've even asked your actual question
- Agent gives an answer that misses the edge case in src/utils/validation.ts because you never asked it to read that file
This is the discovery problem. On large codebases (100K+ lines), you can't manually tell the agent what to read. You don't know what's relevant yourself. The agent needs to find the relevant code — and that's exactly what Claude Context does.
Instead of loading entire directories (expensive, wasteful, context-heavy), Claude Context stores your codebase in a vector database and only retrieves the code that actually matters for your query.
How It Works: Hybrid Search
Claude Context doesn't rely on a single search strategy. It uses hybrid search — combining two complementary approaches:
BM25 (Keyword Search)
The classic full-text search algorithm. Fast, exact, great for finding specific function names, variable references, and string literals. If you search for "handleJWT," BM25 will find every file that literally contains those characters.
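To make the mechanics concrete, here is a toy BM25 scorer (an illustrative sketch, not Claude Context's actual implementation): exact term matches, weighted by how rare the term is across documents and normalized by document length.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document for the query with the classic BM25 formula."""
    toks = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    scores = []
    for doc_toks in toks:
        tf = Counter(doc_toks)
        score = 0.0
        for term in tokenize(query):
            # Document frequency: in how many docs does this term appear?
            df = sum(1 for t in toks if term in t)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            freq = tf[term]
            denom = freq + k1 * (1 - b + b * len(doc_toks) / avg_len)
            score += idf * freq * (k1 + 1) / denom
        scores.append(score)
    return scores

docs = [
    "function handleJWT(token) { return verify(token) }",
    "function renderChart(data) { return draw(data) }",
]
scores = bm25_scores("handleJWT", docs)
best = scores.index(max(scores))  # the file that literally contains "handleJWT"
```

The point of the sketch: a document with zero literal matches scores exactly zero, which is precisely why BM25 alone cannot find semantically related code.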
Dense Vector Search (Semantic Search)
Embeds your code chunks into high-dimensional vectors using an embedding model (OpenAI, VoyageAI, Ollama, or Gemini). When you search "find functions that handle user authentication," it understands the meaning — and returns code that does auth, even if it never uses the word "authentication."
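The retrieval step itself is just nearest-neighbor search over those vectors. A minimal sketch with hand-made toy vectors (in the real system the embeddings come from the provider and the search runs inside Milvus):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend embeddings: nearby meanings -> nearby vectors.
chunks = {
    "verifyAccessToken()": [0.9, 0.1, 0.0],  # auth-related code
    "renderChart()":       [0.0, 0.2, 0.9],  # unrelated UI code
}
query_vec = [0.8, 0.2, 0.1]  # "functions that handle user authentication"

ranked = sorted(chunks, key=lambda c: cosine(query_vec, chunks[c]), reverse=True)
# verifyAccessToken() ranks first even though it never says "authentication".
```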
Why Hybrid Matters
BM25 alone misses semantic matches. Vector search alone misses exact keyword matches. Together, they cover both cases. Zilliz's evaluation shows the hybrid approach matches the retrieval quality of loading everything while using ~40% fewer tokens.
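One standard way to merge the two rankings is Reciprocal Rank Fusion (RRF). This is a sketch of the general technique; the source doesn't spell out Claude Context's exact fusion formula, and the file names below are made up:

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists: each list contributes 1/(k + rank + 1) per doc."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking   = ["auth.ts", "jwt.ts", "db.ts"]       # exact keyword hits
vector_ranking = ["session.ts", "auth.ts", "jwt.ts"]  # semantic hits
fused = rrf([bm25_ranking, vector_ranking])
# "auth.ts" wins: it appears near the top of BOTH rankings.
```

A document that only one strategy finds still makes the fused list, so neither failure mode (no keyword overlap, no semantic overlap) drops relevant code entirely.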
The Architecture
Claude Context is a monorepo with three core packages:
| Package | Purpose |
|---|---|
| @zilliz/claude-context-core | Indexing engine — embeddings, chunking, vector DB integration |
| @zilliz/claude-context-mcp | MCP server — connects to Claude Code, Cursor, and 15+ clients |
| VS Code Extension | Native IDE integration with GUI for search and navigation |
The key technical decisions that make this work:
AST-Based Code Chunking
Most naive approaches split code by line count or character count. Claude Context parses your code into an Abstract Syntax Tree first, then chunks along logical boundaries — functions, classes, methods. This means each chunk is a coherent unit of code, not a random slice that cuts a function in half.
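You can see the idea using Python's own ast module as a stand-in for Claude Context's parser (which handles many languages): split a source file along top-level function and class boundaries instead of arbitrary line counts.

```python
import ast

def ast_chunks(source: str):
    """Return each top-level function/class in `source` as one chunk."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(ast.get_source_segment(source, node))
    return chunks

code = """
def login(user):
    return check_password(user)

class Session:
    def refresh(self):
        pass
"""
chunks = ast_chunks(code)
# Each chunk is a whole function or class, never a slice that cuts one in half.
```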
Merkle Tree Incremental Indexing
Re-indexing a million-line codebase from scratch every time you change a file would be insane. Claude Context uses Merkle trees to track file states — when you re-index, it only processes changed files. If you touch one file in a 500-file project, only that file gets re-embedded and updated.
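The mechanism is easy to sketch (a simplified sketch of the idea, not Claude Context's implementation): hash every file, combine the hashes into a root, and on re-index compare per-file hashes to find exactly what changed.

```python
import hashlib

def snapshot(files: dict) -> dict:
    """files: {path: content}. Returns {path: sha256(content)}."""
    return {p: hashlib.sha256(c.encode()).hexdigest() for p, c in files.items()}

def root_hash(snap: dict) -> str:
    """Combine sorted (path, hash) pairs into a single root hash."""
    joined = "".join(f"{p}:{h}" for p, h in sorted(snap.items()))
    return hashlib.sha256(joined.encode()).hexdigest()

def changed_files(old: dict, new: dict):
    """Paths whose hash is new or different — the only files to re-embed."""
    return sorted(p for p in new if old.get(p) != new[p])

before = snapshot({"a.ts": "export const x = 1", "b.ts": "export const y = 2"})
after  = snapshot({"a.ts": "export const x = 1", "b.ts": "export const y = 3"})

to_reindex = changed_files(before, after)  # only b.ts changed
```

Comparing root hashes answers "did anything change?" in one check; walking down to per-file hashes answers "what exactly?" without re-reading unchanged content.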
Zilliz Cloud Vector Database
The vector store is Milvus (via Zilliz Cloud). This isn't a toy SQLite-backed vector search — it's the same engine that powers production search at scale. Free tier available, which is more than enough for most projects. The alternative is self-hosted Milvus if you need full control.
Setup: 60 Seconds to Running
The setup is genuinely simple. Three prerequisites, one command:
Prerequisites
- Node.js 20-23 (not 24 — they're explicit about this)
- OpenAI API key — for the embedding model (text-embedding-3-small by default)
- Zilliz Cloud account — free tier, takes 30 seconds to set up
Install
```bash
claude mcp add claude-context \
  -e OPENAI_API_KEY=sk-your-openai-api-key \
  -e MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint \
  -e MILVUS_TOKEN=your-zilliz-cloud-api-key \
  -- npx @zilliz/claude-context-mcp@latest
```
Use
```bash
# Index your codebase
claude
> Index this codebase

# Check progress
> Check the indexing status

# Search with natural language
> Find functions that handle user authentication
> Where is the database connection pool configured?
> Show me the error handling middleware
```
That's it. No build steps, no config files to hand-edit (unless you want to), no database servers to run locally. The free Zilliz Cloud tier handles everything.
The 15+ Integrations
Claude Context isn't locked to Claude Code. It's an MCP server — which means it works with anything that speaks MCP. The officially documented integrations:
| Client | Type | Config Format |
|---|---|---|
| Claude Code | CLI Agent | Claude MCP CLI |
| OpenAI Codex CLI | CLI Agent | TOML |
| Gemini CLI | CLI Agent | JSON |
| Qwen Code | CLI Agent | JSON |
| Cursor | IDE | MCP JSON |
| Windsurf | IDE | MCP JSON |
| VS Code | IDE | Extension + MCP |
| Cline | IDE Extension | MCP JSON |
| Augment | IDE Extension | UI + JSON |
| Roo Code | IDE Extension | MCP JSON |
| Zencoder | IDE Extension | UI Config |
| Cherry Studio | Desktop App | GUI |
| Void | Desktop App | Settings |
| Claude Desktop | Desktop App | MCP JSON |
| LangChain/LangGraph | Framework | Python SDK |
Every single one uses the same underlying MCP server. The config format changes per client, but the server is always npx @zilliz/claude-context-mcp@latest.
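For the clients listed as "MCP JSON," the entry typically follows the common mcpServers convention. A sketch (the file location and exact key names vary per client — check each client's own docs):

```json
{
  "mcpServers": {
    "claude-context": {
      "command": "npx",
      "args": ["-y", "@zilliz/claude-context-mcp@latest"],
      "env": {
        "OPENAI_API_KEY": "sk-your-openai-api-key",
        "MILVUS_ADDRESS": "your-zilliz-cloud-public-endpoint",
        "MILVUS_TOKEN": "your-zilliz-cloud-api-key"
      }
    }
  }
}
```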
The Four Tools
Claude Context exposes exactly four MCP tools — no bloat, no feature creep:
| Tool | What It Does |
|---|---|
index_codebase | Index a directory for hybrid search (BM25 + dense vector). Incremental by default — only re-indexes changed files. |
search_code | Natural language search against the indexed codebase. Returns ranked results with file paths, line ranges, and relevance scores. |
clear_index | Wipe the index for a specific codebase. Useful when you've restructured heavily. |
get_indexing_status | Check progress — shows percentage for active indexing, completion status for indexed codebases. |
Four tools. That's the entire API surface. Compare this to the multi-round file-reading dance you'd normally do. The simplicity is the point.
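Under the hood, a client invokes these tools via MCP's standard JSON-RPC tools/call request. The envelope below follows the MCP spec; the argument names ("path", "query") are illustrative assumptions, not Claude Context's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_code",
    "arguments": {
      "path": "/home/you/project",
      "query": "functions that handle user authentication"
    }
  }
}
```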
The Numbers: 40% Token Reduction
Zilliz ran a controlled evaluation comparing three approaches:
- No context — agent works blind (baseline)
- Full loading — dump everything into context (expensive, but high quality)
- Claude Context — semantic retrieval (target: match full loading quality at lower cost)
The result: Claude Context achieved equivalent retrieval quality to full loading while using ~40% fewer tokens. That's not a marginal improvement — that's cutting your API bill nearly in half while getting the same (or better) answers.
Why better? Because under the constraint of a limited context window, loading everything means relevant code gets pushed out by irrelevant code. Claude Context only loads what matters, so the agent has more room for your actual conversation and the code it needs.
What Makes This Different
The code search space is not empty. Here's how Claude Context compares:
vs. Grep / ripgrep
Grep is exact keyword matching. It's fast, free, and built into everything. But it can't do semantic search. "Find the function that validates JWT tokens" — grep looks for those exact words. Claude Context understands what validation means and finds the code even if it says verifyAccessToken().
vs. grep-like MCP tools (e.g., search_files)
Most Claude Code setups include a basic file search tool. These are regex-based — useful for finding strings, useless for understanding intent. Claude Context adds the semantic layer on top.
vs. acontext / codebase-context
Several community tools index codebases for Claude. Most use simple embeddings without hybrid search, naive line-based chunking, or require running a local vector database. Claude Context's advantages: AST-based chunking, BM25 + vector hybrid, Merkle incremental indexing, and a managed vector DB (no local infra).
vs. loading everything with /compact
Some workflows involve loading the entire codebase and relying on Claude's context management. This works for small projects (< 50 files). Beyond that, you're burning tokens on code the agent will never reference, and pushing relevant code out of the window.
vs. manual file discovery
The default Claude Code experience: you tell it what to read, it reads it, it forgets half of it. Works for small tasks, falls apart on cross-cutting changes that touch 10+ files across different directories.
Supported Languages
Claude Context's AST splitter supports 16 languages:
| Category | Languages |
|---|---|
| Web | TypeScript, JavaScript, PHP, Ruby |
| Systems | Rust, C++, C, Go |
| JVM | Java, Kotlin, Scala |
| Data | Python |
| Docs | Markdown |
| Mobile | Swift, Kotlin |
For unsupported languages, it falls back to LangChain's character-based splitter — not as smart as AST chunking, but it works.
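The fallback idea is simple: fixed-size character windows with overlap, so context isn't lost at chunk edges. A minimal sketch in that spirit (not LangChain's actual implementation):

```python
def char_chunks(text: str, size: int = 40, overlap: int = 10):
    """Split text into `size`-char windows, each overlapping the last by `overlap`."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

code = "x" * 100
chunks = char_chunks(code)
# Three 40-char windows; each shares its first 10 chars with the previous one.
```

Overlap is the key trade-off: it costs duplicated tokens but keeps a statement that straddles a boundary visible in at least one chunk — the problem AST chunking avoids entirely.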
Embedding Provider Flexibility
You're not locked into OpenAI for embeddings. Claude Context supports four providers:
| Provider | Model | Cost | Notes |
|---|---|---|---|
| OpenAI | text-embedding-3-small / large | Pay per token | Default. Best quality-to-speed ratio. |
| VoyageAI | voyage-code-3 | Pay per token | Specifically trained for code. Excellent results. |
| Gemini | text-embedding-004 | Pay per token | Google's embedding model. Good multilingual support. |
| Ollama | nomic-embed-text (local) | Free | Run locally. No API key needed. Fully private. |
The Ollama option is notable — you can run the entire pipeline locally with zero API costs. Index locally, embed locally, search locally. The only external dependency is the vector database (Zilliz Cloud free tier or self-hosted Milvus).
⚡ The Verdict
Claude Context is the most practical solution to the codebase context problem we've seen. It doesn't try to be clever — it takes a well-understood approach (hybrid vector search) and applies it correctly to a real workflow pain point.
The 40% token reduction claim is backed by evaluation, not marketing. The AST-based chunking is the right approach (not line-count hackery). The Merkle tree incremental indexing means it doesn't punish you for re-indexing after small changes. And the free Zilliz Cloud tier removes the infrastructure barrier entirely.
9.3k stars, 713 forks, 174 commits, active development by a team that builds vector databases for a living. MIT licensed. One claude mcp add command and you're live. If your codebase doesn't fit in context, install this.
✅ Pros
- Hybrid search (BM25 + vector) — best of both worlds
- AST-based chunking — logical code units, not random slices
- Merkle incremental indexing — only re-indexes changes
- 15+ IDE/agent integrations via MCP
- Free tier (Zilliz Cloud + Ollama embeddings)
- Built by the Milvus team — they know vector search
- 40% token reduction with equivalent quality
- MIT licensed, 9.3k stars
- Four embedding providers including local Ollama
- VS Code native extension available
⚠️ Cons
- Requires Node.js 20-23 (no 24 support yet)
- Needs external vector DB (Zilliz Cloud or Milvus)
- OpenAI API key required for default embeddings
- No self-contained local mode without Milvus
- Indexing large codebases takes time on first run
- AST splitter limited to 16 languages
- Core team is Zilliz (vendor lock-in risk for vector DB)
Get Started
```bash
# 1. Get free Zilliz Cloud account → zilliz.com/cloud
# 2. Get embedding API key (OpenAI, VoyageAI, Gemini, or Ollama)

# 3. Add MCP server:
claude mcp add claude-context \
  -e OPENAI_API_KEY=sk-your-key \
  -e MILVUS_ADDRESS=your-endpoint \
  -e MILVUS_TOKEN=your-token \
  -- npx @zilliz/claude-context-mcp@latest

# 4. Open Claude Code, index, search:
claude
> Index this codebase
> Find functions that handle user authentication
```
Repository: github.com/zilliztech/claude-context
VS Code Extension: Semantic Code Search on Marketplace
Docs: DeepWiki AI Documentation
Discord: Zilliz Community