Multi-index code search that combines semantic understanding, keyword matching, and structural awareness — at enterprise scale.
Keyword search finds text, not meaning. “How does authentication work?” returns nothing useful — the code says r.URL.Query(), not “authentication.”
Pure vector search captures intent but fumbles exact lookups. “Find semverParse” should be a precise hit, not a fuzzy semantic match.
Enterprise codebases have 100K+ files. IDE search is local. GitHub search is rate-limited. No tool combines understanding with scale.
Developers spend 58% of their time understanding code, not writing it. The retrieval problem is the bottleneck.
OCTO Platform fuses dense vector embeddings, BM25 keyword search, and LLM-generated summaries into a single retrieval pipeline. Ask anything — symbol lookups, architectural questions, concept searches.
A deterministic query router uses a 5-pass heuristic to detect symbol names, keyword prefixes, CamelCase identifiers, and comprehension queries — no LLM needed at query time.
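A deterministic router of this kind can be sketched with a few regular expressions. This is only an illustration under stated assumptions: the patterns, the `Route` type, and the `route` function are hypothetical, and the sketch uses fewer passes than the platform's five, which the deck does not spell out.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns: the deck names the signal types, not the actual regexes.
CAMEL_CASE = re.compile(r"\b[a-z]+[A-Z]\w*\b|\b(?:[A-Z][a-z0-9]+){2,}\b")
SNAKE_CASE = re.compile(r"\b[a-z0-9]+(?:_[a-z0-9]+)+\b")
CALL_OR_DOTTED = re.compile(r"\b\w+(?:\.\w+)+(?:\(\))?|\b\w+\(\)")
KEYWORD_PREFIX = re.compile(r"^(find|locate|where is)\b", re.IGNORECASE)


@dataclass
class Route:
    mode: str                      # "hybrid" or "semantic"
    symbols: list = field(default_factory=list)


def route(query: str) -> Route:
    """Classify a query without calling an LLM."""
    q = query.strip()
    symbols = []
    # Symbol passes: collect anything that looks like an identifier.
    for pattern in (CAMEL_CASE, SNAKE_CASE, CALL_OR_DOTTED):
        for match in pattern.finditer(q):
            if match.group(0) not in symbols:
                symbols.append(match.group(0))
    if symbols:
        return Route("hybrid", symbols)
    # Keyword-prefix pass: "find parse" treats the single trailing word as a symbol.
    prefix = KEYWORD_PREFIX.match(q)
    if prefix:
        rest = q[prefix.end():].strip(" ?.")
        if rest and " " not in rest:
            return Route("hybrid", [rest])
    # Everything else is treated as a comprehension query.
    return Route("semantic")
```

With these assumed patterns, `route("Find semverParse")` selects hybrid mode with `semverParse` as the extracted symbol, while `route("How does authentication work?")` falls through to semantic mode.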
Three plugins fire simultaneously: QdrantCode (chunk embeddings), QdrantSummary (LLM file summaries), and ConceptCluster (topic groupings). Results fuse via Reciprocal Rank Fusion.
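Reciprocal Rank Fusion scores each result by summing 1 / (k + rank) over every ranked list in which it appears, so it needs only ranks, not comparable scores. A minimal sketch follows; the function name, the input shape, and the constant k = 60 (a commonly used default) are assumptions, not values taken from the OCTO source.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list of (id, score) pairs.

    rankings maps a plugin name to its ranked result ids, best first.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative input only; the file names are made up.
fused = reciprocal_rank_fusion({
    "QdrantCode": ["a.go", "b.go", "c.go"],
    "QdrantSummary": ["b.go", "d.go"],
    "ConceptCluster": ["b.go", "a.go"],
})
```

In this example `b.go` ranks first because all three lists contain it, even though it is not the top hit in the code index.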
Post-processing applies dedup, score normalization, optional cross-encoder reranking (ms-marco-MiniLM), and top-K filtering, so the best results surface regardless of which index found them.
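The post-processing stages can be sketched as a chain of small functions. The `Hit` type, its fields, the min-max normalization, and the `reranker` callable are all assumptions made for illustration; the deck does not say how the platform normalizes scores or invokes its cross-encoder.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Hit:
    path: str
    chunk_id: str
    score: float
    text: str = ""


def dedup(hits: Sequence[Hit]) -> list:
    """Keep the highest-scoring copy of each (path, chunk_id) pair."""
    best = {}
    for hit in hits:
        key = (hit.path, hit.chunk_id)
        if key not in best or hit.score > best[key].score:
            best[key] = hit
    return list(best.values())


def normalize(hits: Sequence[Hit]) -> list:
    """Min-max scale scores into [0, 1]; identical scores all map to 1.0."""
    if not hits:
        return []
    low = min(h.score for h in hits)
    span = max(h.score for h in hits) - low
    return [replace(h, score=(h.score - low) / span if span else 1.0) for h in hits]


def postprocess(query: str, hits: Sequence[Hit], top_k: int = 10,
                reranker: Optional[Callable] = None) -> list:
    """Dedup, normalize, optionally rerank, then keep the top_k hits."""
    result = normalize(dedup(hits))
    if reranker is not None:
        # reranker(query, texts) returns one relevance score per text.
        new_scores = reranker(query, [h.text for h in result])
        result = [replace(h, score=float(s)) for h, s in zip(result, new_scores)]
    return sorted(result, key=lambda h: h.score, reverse=True)[:top_k]
```

Keeping the reranker as an injected callable makes the cross-encoder step optional, matching the description above: a model's scoring function can be wrapped to fit this signature, and passing `None` skips the stage.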
5-pass symbol extraction — classifies queries into hybrid or semantic search mode
Runs all SearchIndexPlugins concurrently with timeout budgets
Dedup → ScoreNorm → CrossEncoder → TopK filtering pipeline
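Running every plugin concurrently under a timeout budget can be sketched with `asyncio`. Whether the platform uses asyncio, threads, or something else is not stated in the deck; `run_plugins`, its parameters, and the choice to treat a timed-out plugin as returning no results are assumptions.

```python
import asyncio


async def run_plugins(plugins, query, timeout_s=2.0):
    """Run every plugin concurrently; a plugin that exceeds its budget yields [].

    plugins maps a plugin name to an async callable taking the query string.
    """
    async def guarded(name, search):
        try:
            return name, await asyncio.wait_for(search(query), timeout=timeout_s)
        except asyncio.TimeoutError:
            # A slow index should not block results from the other indexes.
            return name, []

    pairs = await asyncio.gather(*(guarded(n, s) for n, s in plugins.items()))
    return dict(pairs)


# Stand-in plugins for demonstration only.
async def fast_plugin(query):
    return ["a.go"]


async def slow_plugin(query):
    await asyncio.sleep(1.0)
    return ["b.go"]


results = asyncio.run(
    run_plugins({"fast": fast_plugin, "slow": slow_plugin}, "url handling", timeout_s=0.05)
)
```

The per-plugin result lists returned here are the kind of input a rank fusion step would then combine.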
QdrantCode: Dense vector search over code chunks. 768-dim embeddings via nomic-embed-text. Text indexes for symbol-level BM25 filtering.
QdrantSummary: Searches LLM-generated file descriptions. Bridges the gap between code tokens and conceptual queries like “How does URL handling work?”
ConceptCluster: Topic-level groupings for cross-file concept retrieval. Surfaces related files that individual chunk search would miss.
Raw code embeddings capture tokens but not purpose. Summary embeddings solve this by having an LLM describe what each file does at index time, then embedding those descriptions.
“How does the code handle URL parameters?” lands in a different vector neighborhood than r.URL.Query() in macaron.go. Pure code embeddings can't bridge this semantic gap.
At index time, gemma2:2b generates a plain-English summary of each file. At query time, comprehension queries match summaries and the correct files surface.
Measured impact: +5 comprehension hits (+12 percentage points) from summary search alone. Combined with cross-encoder reranking: +7 hits total, reaching 81% comprehension accuracy.
Function lookups: correct file is usually the #1 result.
Features measured to hurt (query expansion: −4 hits; chunk enrichment: −3 hits) were disabled. Only proven improvements shipped.
Primary metric includes questions whose expected files weren't indexed — no cherry-picking.
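Hit rate and MRR over a golden set can be computed as in the sketch below, where a question whose expected file was never indexed simply counts as a miss instead of being dropped. The record fields (`expected_files`, `results`) and the cutoff `k` are hypothetical; the deck does not describe the evaluation engine's actual schema.

```python
def evaluate(questions, k=10):
    """Return hit rate and mean reciprocal rank over all questions.

    Each question is a dict with "expected_files" (acceptable answers)
    and "results" (ranked file paths returned by search).
    """
    hits = 0
    reciprocal_rank_sum = 0.0
    for question in questions:
        expected = set(question["expected_files"])
        # Rank of the first expected file within the top k, or None for a miss.
        rank = next(
            (i for i, path in enumerate(question["results"][:k], start=1)
             if path in expected),
            None,
        )
        if rank is not None:
            hits += 1
            reciprocal_rank_sum += 1.0 / rank
    total = len(questions)  # misses stay in the denominator
    if total == 0:
        return {"hit_rate": 0.0, "mrr": 0.0}
    return {"hit_rate": hits / total, "mrr": reciprocal_rank_sum / total}


# Made-up golden set entries for illustration.
metrics = evaluate([
    {"expected_files": ["semver.go"], "results": ["semver.go", "util.go"]},
    {"expected_files": ["jwt.go"], "results": ["auth.go", "jwt.go"]},
    {"expected_files": ["missing.go"], "results": ["a.go", "b.go"]},
])
```

Here the third question can never be answered because its expected file is absent from the results, yet it still lowers both metrics: two hits out of three, with reciprocal ranks of 1, 1/2, and 0.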
nomic-embed-text embeddings (768-dim).
1,566-line evaluation engine. Golden set format with function_lookup, class_lookup, and comprehension question types.
Query classification is fully deterministic — 5-pass heuristic regex. LLMs are only used at index time for summaries.
--append flag resumes interrupted runs. --file-list for targeted re-indexing. Summary generation: ~1.5 hours for 11K files.
OCTO Platform powers the Amplifier octo-search tool — giving AI agents the ability to search any indexed codebase with natural language, find exact symbols, and understand architecture.
JSON output mode, REST API, and an Amplifier tool module. Agents don't grep blindly — they search with understanding.
Index once, search forever. Multiple collections for different repos. Concept clusters reveal cross-file relationships. The Taste Library lets teams encode their own conventions into search ranking.
From “Where is semverParse defined?” to “How does JWT signing work?” — one system, one query, the right files.
Primary source: OCTOPlatform/ repository on disk at /home/samschillace/dev/ANext/OCTOPlatform.
index-repo.py (827 lines), server.py (725 lines), search.py (542 lines), generate-summaries.py, run-eval.py, check-env.py: ~3,215 lines total.
Git history: 37 commits, sole contributor samschillace. Commit range: April 17–21, 2026.
Eval metrics: Directly from README.md — 542-question golden set against Grafana (11K files, ~110K chunks). Hit rates, MRR, and feature flag impacts are from the documented eval results history table.
Line counts: Computed via find + wc -l on source files, excluding __pycache__, .worktrees, and node_modules.
Team knowledge: amplifier-module-tool-octo-search capability confirmed via team knowledge search.
Deck generated: May 2026. All numbers are from the repository as of the latest commit (af4cbaa).