# e5-base-v2-code-search (v9-200k)
A fine-tuned code search embedding model based on intfloat/e5-base-v2 (110M parameters), trained with call-graph false-negative filtering on 200K balanced pairs across 9 programming languages.

Built for cqs, a code intelligence and RAG tool for AI agents.
## Key Results
| Eval | Metric | Score |
|---|---|---|
| Pipeline (55 confusable functions, enriched) | R@1 | 94.5% |
| Pipeline | MRR | 0.966 |
| Raw code embedding (no enrichment) | R@1 | 70.9% |
| CodeSearchNet (6 languages) | NDCG@10 | 0.615 |
The 94.5% pipeline score ties BGE-large (335M) at 1/3 the parameter count. The 70.9% raw R@1 exceeds BGE-large (61.8%).
## Training
- Base model: intfloat/e5-base-v2 (110M params, 768 dimensions)
- Data: 200K balanced pairs (22,222 per language × 9 languages) from cqs-indexed Stack repos
- Key technique: Call-graph false-negative filtering — uses code structure (caller/callee relationships) to exclude structurally related functions from contrastive negatives. Zero API cost (SQLite lookup).
- Loss: CachedGISTEmbedLoss + MatryoshkaLoss (768/384/192/128 dims)
- LoRA: rank 16, alpha 32, targets: query, key, value, dense
- Epochs: 1 (additional epochs degrade enrichment compatibility)
- Dataset: jamie8johnson/cqs-code-search-200k
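The call-graph filter above requires no model calls: it is a plain lookup against the indexed call graph. A minimal sketch of the idea, assuming a hypothetical SQLite `calls(caller, callee)` table (the actual cqs schema may differ):

```python
import sqlite3

# Hypothetical call-graph table; cqs builds this during `cqs index`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
conn.executemany("INSERT INTO calls VALUES (?, ?)", [
    ("validate_email", "is_ascii"),
    ("send_welcome", "validate_email"),
])

def related(fn):
    """Functions structurally related to `fn` via a direct caller/callee edge."""
    rows = conn.execute(
        "SELECT callee FROM calls WHERE caller = ? "
        "UNION SELECT caller FROM calls WHERE callee = ?",
        (fn, fn),
    )
    return {r[0] for r in rows}

def filter_negatives(anchor, candidates):
    """Exclude candidates that share a call edge with the anchor,
    so they are never used as contrastive negatives."""
    banned = related(anchor)
    return [c for c in candidates if c not in banned]

# is_ascii (callee) and send_welcome (caller) are filtered out; parse_url survives.
print(filter_negatives("validate_email", ["is_ascii", "parse_url", "send_welcome"]))
# → ['parse_url']
```

Because this is a local SQLite lookup, the filtering step adds zero API cost, as noted above.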
## The 89.1% Basin
Six independent perturbations of this configuration (among them: more data, less data, FAISS-mined hard negatives, more epochs, and contrastive query augmentation) all produce exactly the same -5.4 pp pipeline regression, to 89.1%. The 94.5% result appears to occupy a narrow peak in the loss landscape around ~22K examples per language with call-graph filtering.
## Usage with cqs

```bash
# Default model in cqs v1.9.0+
cqs init && cqs index

# Or specify explicitly
export CQS_EMBEDDING_MODEL=e5-base
cqs index
```
## Usage with sentence-transformers

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jamie8johnson/e5-base-v2-code-search")

# e5 models expect "query: " / "passage: " prefixes
query_emb = model.encode("query: find functions that validate email addresses")
code_emb = model.encode("passage: def validate_email(addr): ...")
```
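Because training included MatryoshkaLoss at 768/384/192/128 dims, embeddings can be truncated to a smaller prefix and re-normalized before cosine scoring, trading a little accuracy for a smaller index. A sketch with synthetic vectors standing in for `model.encode` output (the `truncate` helper is illustrative, not part of the library):

```python
import numpy as np

def truncate(emb, dim):
    """Keep the first `dim` Matryoshka dimensions and re-normalize to unit length."""
    v = np.asarray(emb, dtype=np.float32)[..., :dim]
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
query_emb = rng.normal(size=768)       # stand-in for model.encode("query: ...")
code_embs = rng.normal(size=(3, 768))  # stand-ins for encoded code passages

q128 = truncate(query_emb, 128)
c128 = truncate(code_embs, 128)

# Dot product of unit vectors == cosine similarity; rank passages by score.
scores = c128 @ q128
best = int(np.argmax(scores))
print(best, scores.shape)
```

The same truncation works at 384 or 192 dims; 768 is the full (untruncated) embedding.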
## Languages
Go, Java, JavaScript, PHP, Python, Ruby, Rust, TypeScript, C++
## ONNX
Includes model.onnx for inference with ONNX Runtime (used by cqs for local GPU/CPU inference).
## Citation
Paper in preparation. See research log for methodology.