Cofos Code Model ({MODEL_VERSION}): SparseMind 500M

Cofos v2 is a 500M-parameter code model built on AMFORGE's SparseMind v15 architecture. It keeps the design of Cofos v1 (296M parameters, 34% real_syntax_valid) but is scaled up and trained with multilingual instructions and chain-of-thought.

Developed by {ORGANIZATION}.

Architecture (SparseMind v15)

Parameters

  • dim={cfg.dim} (v1: 768), n_layers={cfg.n_layers}, n_heads={cfg.n_heads} (head_dim={cfg.dim // cfg.n_heads}, same as v1)
  • max_seq_len={cfg.max_seq_len} (v1: 512), vocab_size={cfg.vocab_size}
  • channel_top_k={cfg.channel_top_k}, token_top_k={cfg.token_top_k} (same sparsity ratios as v1)
  • Total parameters: {model.n_params:,}
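The derived quantities in the list above can be sketched as follows. This is a minimal illustration, not the SparseMind source: the `Config` dataclass here is a stand-in, every value except v1's dim=768 is a placeholder, and the reading of "sparsity ratio" as top-k count divided by dim (channels) or max_seq_len (tokens) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Stand-in for the model's real config class; field names follow the card.
    dim: int
    n_layers: int
    n_heads: int
    max_seq_len: int
    vocab_size: int
    channel_top_k: int
    token_top_k: int

    @property
    def head_dim(self) -> int:
        # Same expression the card uses: cfg.dim // cfg.n_heads
        if self.dim % self.n_heads != 0:
            raise ValueError("dim must be divisible by n_heads")
        return self.dim // self.n_heads

    def sparsity_ratios(self) -> tuple[float, float]:
        # Assumption: ratios are the top-k counts relative to dim and max_seq_len.
        return (self.channel_top_k / self.dim,
                self.token_top_k / self.max_seq_len)


# Placeholder values for illustration only (not the released Cofos settings).
example = Config(dim=768, n_layers=12, n_heads=12, max_seq_len=512,
                 vocab_size=32000, channel_top_k=192, token_top_k=128)
print(example.head_dim)
print(example.sparsity_ratios())
```

Under this reading, keeping head_dim and both ratios fixed while raising dim and max_seq_len means n_heads, channel_top_k and token_top_k all grow in proportion.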

Training data (3-way mix)

  • 30% real HF Python (iamtarun/python_code_instructions_18k_alpaca)

Result

  • Best real_syntax_valid: {best_syntax:.1f}% on held-out real Python instructions
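The card does not define how real_syntax_valid is computed. One plausible reading, sketched here as an assumption, is the percentage of generated programs that parse as Python. The function names and the sample strings are hypothetical.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if the source parses as Python."""
    try:
        ast.parse(source)
        return True
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as null bytes on older Python versions.
        return False


def syntax_valid_percent(samples: list[str]) -> float:
    """Percentage of samples that parse; 0.0 for an empty list."""
    if not samples:
        return 0.0
    n_valid = sum(is_valid_python(s) for s in samples)
    return 100.0 * n_valid / len(samples)


# Hypothetical model outputs: two parse, two do not.
generated = [
    "def add(a, b):\n    return a + b\n",
    "for i in range(3) print(i)",  # missing colon
    "x = [1, 2, 3]",
    "def broken(:\n    pass",
]
print(f"{syntax_valid_percent(generated):.1f}%")
```

Parsing checks syntax only. A sample can pass this test and still fail at runtime or produce the wrong result.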

Tokenizer

How to use

import torch
import sentencepiece as spm

# Load checkpoint
ckpt = torch.load("cofos_best.pt", map_location="cpu")
cfg_dict = ckpt["config"]

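# Instantiate the model architecture. SparseMind and Config are the model's own
# classes and must be imported from its source code before uncommenting.
# model = SparseMind(Config(**cfg_dict))
# model.load_state_dict(ckpt["model"])
# model.eval()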