---
language: en
library_name: hyperspace-matrix-v6
tags:
- retrieval
- capability-routing
- agent-skills
license: apache-2.0
---
# Matrix v6: Hyperspace capability retrieval index
Retrieval index for **446,897 agent capabilities** (tools, skills, agents) drawn
from the skills.sh ecosystem (387 K SKILL.md files across 7,029 GitHub repos)
plus Hyperspace's curated tool/agent catalog.
The retriever is `Qwen/Qwen3-Embedding-0.6B`, used **pretrained, with no
fine-tuning**. This is a deliberate choice documented in
[MATRIX_V6_ARCHITECTURE.md](https://github.com/hyperspaceai/agentic-os-prod/blob/main/docs/MATRIX_V6_ARCHITECTURE.md).
## Contents
| File | What |
|---|---|
| `capability_embeddings.fp16.npy` | `[446897, 1024]` fp16 embeddings, L2-normalized |
| `capability_names.json` | Per-row metadata (cap_id, name, kind, repo) |
| `capability_clusters.json` | cap_id → cluster_id (190,085 intent clusters) |
| `capabilities.jsonl` | Full source rows (name, description, snippet) |
| `config.json` | Index metadata: retriever base, dim, query prompt, pooling |
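
A quick consistency check after download can catch a truncated or mismatched file. The sketch below relies only on what the table states (a 2-D fp16 array with L2-normalized rows, and one metadata entry per row). `check_index` is a hypothetical helper, not part of any Hyperspace library, and both the tolerance and the sample size are assumptions chosen to allow for fp16 rounding and to keep memory use low.

```python
import numpy as np

def check_index(emb, names, dim=1024, atol=1e-2, sample=10_000):
    """Return a list of problems found; an empty list means the index looks consistent."""
    if emb.ndim != 2:
        return [f"expected a 2-D array, got shape {emb.shape}"]
    problems = []
    if emb.shape[1] != dim:
        problems.append(f"unexpected embedding width {emb.shape[1]}")
    if emb.dtype != np.float16:
        problems.append(f"unexpected dtype {emb.dtype}")
    if len(names) != emb.shape[0]:
        problems.append(f"{len(names)} metadata rows vs {emb.shape[0]} embeddings")
    # Norms are computed in float32 on the first `sample` rows only;
    # fp16 rounding makes them approximately, not exactly, 1.
    norms = np.linalg.norm(emb[:sample].astype(np.float32), axis=1)
    if not np.allclose(norms, 1.0, atol=atol):
        problems.append("rows are not L2-normalized")
    return problems
```

With the files loaded as in the Usage section, `check_index(emb, names)` should return an empty list if the download matches the table above.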
## Quick eval (5 K held-out test queries, brute-force cosine)
| Metric | v6 | v5 backbone (same setup) |
|---|---|---|
| ret@1 | 16.6 % | 6.0 % |
| ret@5 | 47.3 % | 15.6 % |
| ret@10 | 57.4 % | 18.8 % |
| cluster@1 | 51.6 % | 16.0 % |
| MRR@20 (cluster) | 0.56 | 0.18 |
**Roughly 3× on every metric, with a backbone about 0.4× the size** (596 M vs 1.5 B parameters).
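
This README does not spell out how the metrics are computed, so the following is a sketch under assumed definitions: ret@k counts a query as a hit when its gold capability is among the top-k rows, and MRR@k (cluster) averages the reciprocal rank of the first top-k row that shares the gold capability's cluster. Under those assumptions cluster@1 is the same quantity with k = 1. All function names, and the `gold` and `cluster_of` arrays, are hypothetical.

```python
import numpy as np

def topk_indices(sims, k):
    """Indices of the k highest-scoring rows per query, best first."""
    return np.argsort(-sims, axis=1)[:, :k]

def ret_at_k(sims, gold, k):
    """Fraction of queries whose gold row index appears in the top k."""
    top = topk_indices(sims, k)
    return float(np.mean((top == gold[:, None]).any(axis=1)))

def cluster_mrr_at_k(sims, gold, cluster_of, k):
    """Mean reciprocal rank of the first top-k row in the gold row's cluster."""
    top = topk_indices(sims, k)
    match = cluster_of[top] == cluster_of[gold][:, None]
    first = match.argmax(axis=1)  # position of the first match (0 when there is none)
    rr = np.where(match.any(axis=1), 1.0 / (first + 1), 0.0)
    return float(rr.mean())
```

Here `sims` is a `[num_queries, num_capabilities]` similarity matrix, `gold` holds one gold row index per query, and `cluster_of` maps each row index to an integer cluster id.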
## Usage
```python
from huggingface_hub import snapshot_download
import numpy as np, json, torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel
# 1. pull the index
path = snapshot_download("hyperspaceai/matrix-v6")
emb = np.load(f"{path}/capability_embeddings.fp16.npy")
names = json.load(open(f"{path}/capability_names.json"))
# 2. load the (pretrained) retriever
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
tok.padding_side = "left"
if tok.pad_token is None: tok.pad_token = tok.eos_token
model = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-0.6B",
                                  torch_dtype=torch.bfloat16).cuda().eval()
# 3. embed query
QP = ("Instruct: Given a user request, retrieve relevant agent "
"capabilities (tools / skills / agents) from the catalog.\nQuery: ")
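def encode(text):
    # Append EOS explicitly; with left padding, position -1 is the last real token.
    enc = tok(QP + text + tok.eos_token, return_tensors="pt",
              truncation=True, max_length=192).to("cuda")
    with torch.no_grad():
        h = model(**enc).last_hidden_state
    # Last-token pooling, then L2-normalize so the dot product is cosine similarity.
    return F.normalize(h[:, -1, :].float(), dim=-1).cpu().numpy()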
# 4. retrieve top-10
q = encode("write a SQL query that finds the top 10 customers by revenue")
sims = q @ emb.astype(np.float32).T
top10 = sims[0].argsort()[::-1][:10]
for i in top10:
    print(f"{sims[0, i]:.3f} {names[i]['name']} ({names[i]['repo']})")
```
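
Step 4 above casts the whole matrix to float32, which for a `[446897, 1024]` array needs roughly 1.8 GB on top of the fp16 copy. One possible alternative, sketched here as an assumption rather than the project's own method, scores the index in chunks and keeps a running top-k. `chunked_topk` and its default `chunk` size are hypothetical.

```python
import numpy as np

def chunked_topk(q, emb, k=10, chunk=65536):
    """Top-k (indices, scores) for one query vector q of shape [dim], best first."""
    q = np.asarray(q, dtype=np.float32).reshape(-1)
    best_idx = np.empty(0, dtype=np.int64)
    best_sim = np.empty(0, dtype=np.float32)
    for start in range(0, emb.shape[0], chunk):
        # Only this slice is upcast, so peak extra memory is chunk * dim * 4 bytes.
        sims = emb[start:start + chunk].astype(np.float32) @ q
        idx = np.arange(start, start + sims.shape[0])
        # Merge with the running candidates and keep the k largest.
        best_idx = np.concatenate([best_idx, idx])
        best_sim = np.concatenate([best_sim, sims])
        if best_sim.shape[0] > k:
            keep = np.argpartition(-best_sim, k - 1)[:k]
            best_idx, best_sim = best_idx[keep], best_sim[keep]
    order = np.argsort(-best_sim)
    return best_idx[order], best_sim[order]
```

With `q` and `emb` as in the snippet above, `idx, scores = chunked_topk(q[0], emb)` should return the same ten rows as the brute-force version. Loading the array with `np.load(..., mmap_mode="r")` may reduce resident memory further, since each chunk is then read from disk on demand.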
## License
Index/embeddings: Apache-2.0. Each SKILL.md remains under its source repository's
license; see the `repo` field in `capability_names.json`.
The Matrix v6 corpus + scripts are in the Hyperspace agentic-os-prod monorepo;
the harvester is `thor-services/skills-sh-harvest.js`.