	---
	language: en
	library_name: hyperspace-matrix-v6
	tags:
	- retrieval
	- capability-routing
	- agent-skills
	license: apache-2.0
	---

	# Matrix v6 — Hyperspace capability retrieval index

	Retrieval index for 446,897 agent capabilities (tools, skills, agents) drawn
	from the skills.sh ecosystem (387 K SKILL.md files across 7,029 GitHub repos)
	plus Hyperspace's curated tool/agent catalog.

	The retriever itself is `Qwen/Qwen3-Embedding-0.6B` used **pretrained, no
	fine-tune** — a deliberate choice documented in
	[MATRIX_V6_ARCHITECTURE.md](https://github.com/hyperspaceai/agentic-os-prod/blob/main/docs/MATRIX_V6_ARCHITECTURE.md).

	## Contents

	| File | What |
	|---|---|
	| `capability_embeddings.fp16.npy` | `[446897, 1024]` fp16 embeddings, L2-normalized |
	| `capability_names.json` | Per-row metadata (cap_id, name, kind, repo) |
	| `capability_clusters.json` | cap_id → cluster_id (190,085 intent clusters) |
	| `capabilities.jsonl` | Full source rows (name, description, snippet) |
	| `config.json` | Index metadata: retriever base, dim, query prompt, pooling |
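Before running queries, it can help to confirm that the embedding matrix and the per-row metadata agree. The sketch below is illustrative only: `check_index` is a hypothetical helper, not part of the index, and it assumes `capability_names.json` decodes to a list with one entry per embedding row (as the per-row indexing in the usage snippet suggests). It runs on a small synthetic stand-in rather than the real files.

```python
import numpy as np

def check_index(emb, names, atol=1e-2):
    """Sanity-check an embedding matrix against its per-row metadata (hypothetical helper)."""
    if emb.ndim != 2:
        raise ValueError(f"expected a 2-D matrix, got shape {emb.shape}")
    if emb.shape[0] != len(names):
        raise ValueError(f"{emb.shape[0]} embedding rows vs {len(names)} metadata rows")
    # fp16 storage loses precision, so compute norms in fp32 with a loose tolerance.
    norms = np.linalg.norm(emb.astype(np.float32), axis=1)
    max_dev = float(np.abs(norms - 1.0).max())
    return {
        "rows": emb.shape[0],
        "dim": emb.shape[1],
        "max_norm_deviation": max_dev,
        "normalized": max_dev <= atol,
    }

# Synthetic stand-in for the real files: 8 unit vectors of dimension 16, stored as fp16.
rng = np.random.default_rng(0)
demo = rng.normal(size=(8, 16)).astype(np.float32)
demo /= np.linalg.norm(demo, axis=1, keepdims=True)
demo_fp16 = demo.astype(np.float16)
demo_names = [{"cap_id": i, "name": f"cap-{i}"} for i in range(8)]

report = check_index(demo_fp16, demo_names)
print(report)
```

On the real index the same call would take the arrays loaded from `capability_embeddings.fp16.npy` and `capability_names.json`; note that the fp32 cast temporarily needs about 1.8 GB for the full `[446897, 1024]` matrix.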

	## Quick eval (5 K held-out test queries, brute-force cosine)

	| Metric | v6 | v5 backbone (same setup) |
	|---|---|---|
	| ret@1 | 16.6 % | 6.0 % |
	| ret@5 | 47.3 % | 15.6 % |
	| ret@10 | 57.4 % | 18.8 % |
	| cluster@1 | 51.6 % | 16.0 % |
	| MRR@20 (cluster) | 0.56 | 0.18 |

	v6 scores roughly 3 × higher than the v5 backbone on every metric (2.8 × to 3.2 ×), with a backbone about 0.4 × the size (596 M vs 1.5 B).
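The card does not spell out how these metrics are computed, so the following is a sketch under assumed definitions, not the evaluation script itself: ret@k counts a query as a hit when its gold capability appears in the top k, cluster@1 when the top result shares the gold capability's cluster, and MRR@20 (cluster) averages the reciprocal rank of the first top-20 result in the gold cluster. The function name `retrieval_metrics` and the toy data are hypothetical.

```python
import numpy as np

def retrieval_metrics(sims, gold, cluster_of, ks=(1, 5, 10), mrr_k=20):
    """Assumed metric definitions.

    sims:       [n_queries, n_caps] similarity scores
    gold:       [n_queries] row index of the gold capability per query
    cluster_of: [n_caps] cluster id of each capability row
    """
    sims = np.asarray(sims, dtype=np.float32)
    gold = np.asarray(gold)
    cluster_of = np.asarray(cluster_of)
    depth = min(max(max(ks), mrr_k), sims.shape[1])
    # Rank rows by descending similarity; stable sort breaks ties by row order.
    ranked = np.argsort(-sims, axis=1, kind="stable")[:, :depth]

    out = {}
    for k in ks:
        out[f"ret@{k}"] = float((ranked[:, :k] == gold[:, None]).any(axis=1).mean())

    # True where a ranked result falls in the same cluster as the gold capability.
    same_cluster = cluster_of[ranked] == cluster_of[gold][:, None]
    out["cluster@1"] = float(same_cluster[:, 0].mean())

    top = same_cluster[:, :mrr_k]
    first = top.argmax(axis=1)   # position of the first cluster hit (0 if none)
    hit = top.any(axis=1)        # distinguishes "hit at rank 1" from "no hit"
    out[f"MRR@{mrr_k} (cluster)"] = float(np.where(hit, 1.0 / (first + 1), 0.0).mean())
    return out

# Toy data: 3 queries over 4 capabilities in clusters 0, 0, 1, 2.
toy_sims = [[0.9, 0.5, 0.1, 0.0],
            [0.8, 0.7, 0.1, 0.0],
            [0.9, 0.2, 0.1, 0.8]]
toy_gold = [0, 1, 2]
toy_clusters = [0, 0, 1, 2]
metrics = retrieval_metrics(toy_sims, toy_gold, toy_clusters, ks=(1, 2))
print(metrics)
```

For the real 5 K × 446,897 score matrix a full `argsort` would be expensive in time and memory, so the same logic would more plausibly run over batches of queries.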

	## Usage

	```python
	from huggingface_hub import snapshot_download
	import numpy as np, json, torch
	import torch.nn.functional as F
	from transformers import AutoTokenizer, AutoModel

	# 1. pull the index
	path = snapshot_download("hyperspaceai/matrix-v6")
	emb = np.load(f"{path}/capability_embeddings.fp16.npy")
	names = json.load(open(f"{path}/capability_names.json"))

	# 2. load the (pretrained) retriever
	tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
	tok.padding_side = "left"
	if tok.pad_token is None: tok.pad_token = tok.eos_token
	model = AutoModel.from_pretrained(
	    "Qwen/Qwen3-Embedding-0.6B", torch_dtype=torch.bfloat16
	).cuda().eval()

	# 3. embed query
	QP = ("Instruct: Given a user request, retrieve relevant agent "
	      "capabilities (tools / skills / agents) from the catalog.\nQuery: ")
	def encode(text):
	    enc = tok(QP + text + tok.eos_token, return_tensors="pt",
	              truncation=True, max_length=192).to("cuda")
	    with torch.no_grad():
	        h = model(**enc).last_hidden_state
	    # last-token pooling: take the final position, then L2-normalize
	    return F.normalize(h[:, -1, :].float(), dim=-1).cpu().numpy()

	# 4. retrieve top-10
	q = encode("write a SQL query that finds the top 10 customers by revenue")
	sims = q @ emb.astype(np.float32).T
	top10 = sims[0].argsort()[::-1][:10]
	for i in top10:
	    print(f"{sims[0, i]:.3f} {names[i]['name']} ({names[i]['repo']})")
	```
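The snippet above casts the whole fp16 matrix to fp32 and fully sorts all 446,897 scores for each query. One possible alternative, sketched here as an assumption rather than as part of the released tooling, is to score the matrix in slices and keep only a running top-k with `np.argpartition`. The helper name `topk_chunked` and the chunk size are hypothetical, and the demo uses random vectors in place of the real index.

```python
import numpy as np

def topk_chunked(q, emb, k=10, chunk=65536):
    """Top-k rows of `emb` by dot product with one query vector `q` (hypothetical helper)."""
    q = np.asarray(q, dtype=np.float32).reshape(-1)
    k = min(k, emb.shape[0])
    best_scores = np.empty(0, dtype=np.float32)
    best_ids = np.empty(0, dtype=np.int64)
    for start in range(0, emb.shape[0], chunk):
        # Cast one slice at a time so the fp32 copy stays small.
        block = np.asarray(emb[start:start + chunk], dtype=np.float32)
        scores = block @ q
        # Merge this slice with the running candidates, then keep the k best.
        cand_scores = np.concatenate([best_scores, scores])
        cand_ids = np.concatenate([best_ids, np.arange(start, start + len(scores))])
        if len(cand_scores) > k:
            keep = np.argpartition(-cand_scores, k - 1)[:k]
            cand_scores, cand_ids = cand_scores[keep], cand_ids[keep]
        best_scores, best_ids = cand_scores, cand_ids
    order = np.argsort(-best_scores)  # final sort of only k candidates
    return best_ids[order], best_scores[order]

# Demo on random unit vectors standing in for the real embeddings.
rng = np.random.default_rng(0)
demo_emb = rng.normal(size=(1000, 32)).astype(np.float32)
demo_emb /= np.linalg.norm(demo_emb, axis=1, keepdims=True)
demo_emb = demo_emb.astype(np.float16)
demo_q = demo_emb[123].astype(np.float32)  # query identical to row 123

ids, scores = topk_chunked(demo_q, demo_emb, k=10, chunk=128)
for i, s in zip(ids, scores):
    print(f"{s:.3f} row {i}")
```

With the real index, `q[0]` from `encode(...)` would serve as the query vector, and loading the embeddings with `np.load(..., mmap_mode="r")` would let each slice be read from disk on demand instead of holding the full matrix in memory.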

	## License

	Index/embeddings: Apache-2.0. Each SKILL.md remains under its source repository's
	license — see the `repo` field in `capability_names.json`.

	The Matrix v6 corpus + scripts are in the Hyperspace agentic-os-prod monorepo;
	the harvester is `thor-services/skills-sh-harvest.js`.