halluci-mate-v1a
Alpha release. A chess LLM trained from scratch on the Lichess dataset using the Qwen3-0.6B architecture and a custom UCI move tokenizer. Expect rough edges — move quality, strategy, and robustness are all unvalidated beyond basic smoke tests.
Source: https://github.com/jspaulsen/halluci-mate
Model details
- Architecture: Qwen3 (Qwen3ForCausalLM), ~0.6B parameters
  - 28 layers, hidden size 1024, 16 attention heads (8 KV heads), intermediate size 3072
  - bfloat16, tied word embeddings, RoPE θ = 1,000,000
- Vocabulary: 1,974 tokens: 6 special tokens (<PAD>, <UNK>, <EOS>, <WHITE>, <BLACK>, <DRAW>) + ~1,792 geometric UCI moves + 176 promotion moves
- Context: 32,768 tokens
- Checkpoint: runs-v1/marvelous-deer-608/checkpoint-9660
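The vocabulary counts above can be reproduced with a short enumeration. The sketch below is an illustration, not the code in chess_tokenizer.py. It assumes that "geometric" means every from/to pair a queen or a knight could cover on an empty board, and that promotion tokens cover every last-rank pawn advance or capture for both colours with a q/r/b/n suffix. The helper names (square, geometric_moves, promotion_moves) are hypothetical, and the real tokenizer may build or order its vocabulary differently.

```python
FILES = "abcdefgh"
SPECIAL_TOKENS = ["<PAD>", "<UNK>", "<EOS>", "<WHITE>", "<BLACK>", "<DRAW>"]


def square(file_idx, rank_idx):
    """Convert 0-based file/rank indices to algebraic notation, e.g. (4, 1) -> 'e2'."""
    return f"{FILES[file_idx]}{rank_idx + 1}"


def geometric_moves():
    """All from/to pairs reachable by a queen or a knight on an empty board."""
    moves = []
    for f in range(8):
        for r in range(8):
            for tf in range(8):
                for tr in range(8):
                    df, dr = abs(tf - f), abs(tr - r)
                    if df == 0 and dr == 0:
                        continue  # a move must change square
                    queen_like = df == 0 or dr == 0 or df == dr
                    knight_like = {df, dr} == {1, 2}
                    if queen_like or knight_like:
                        moves.append(square(f, r) + square(tf, tr))
    return moves


def promotion_moves():
    """Pawn moves onto the last rank (push or capture), one token per promotion piece."""
    moves = []
    for from_rank, to_rank in ((6, 7), (1, 0)):  # White 7th->8th, Black 2nd->1st
        for f in range(8):
            for tf in (f - 1, f, f + 1):
                if 0 <= tf < 8:
                    for piece in "qrbn":
                        moves.append(square(f, from_rank) + square(tf, to_rank) + piece)
    return moves


vocab = SPECIAL_TOKENS + geometric_moves() + promotion_moves()
print(len(geometric_moves()), len(promotion_moves()), len(vocab))
```

Under these assumptions the three printed counts are 1792, 176, and 1974, which matches the figures listed above.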
Tokenizer
The tokenizer is custom and is not loadable via AutoTokenizer.from_pretrained. It is defined in src/halluci_mate/chess_tokenizer.py in the source repo. Install the package and use ChessTokenizer() directly.
Inputs are conditioned on the game outcome: each game is prefixed with <WHITE> or <BLACK> for the winning side (or <DRAW> for a drawn game), followed by the sequence of UCI moves.
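As a rough illustration of this input format, the sketch below builds the token sequence for one game. The mapping from PGN result strings to conditioning tokens and the function name build_sequence are assumptions made for the example. ChessTokenizer performs the real encoding, and whether it appends <EOS> is not covered here.

```python
# Assumed mapping from PGN result strings to the conditioning tokens.
RESULT_TO_CONDITION = {
    "1-0": "<WHITE>",
    "0-1": "<BLACK>",
    "1/2-1/2": "<DRAW>",
}


def build_sequence(result, uci_moves):
    """Return the conditioning token followed by the game's UCI moves."""
    condition = RESULT_TO_CONDITION[result]  # raises KeyError for unfinished games ("*")
    return [condition, *uci_moves]


tokens = build_sequence("1-0", ["e2e4", "e7e5", "d1h5"])
print(" ".join(tokens))  # <WHITE> e2e4 e7e5 d1h5
```

At inference time the same prefix acts as a control: choosing <WHITE> asks the model to continue the game as one that White goes on to win.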
Usage
import chess
import torch
from transformers import AutoModelForCausalLM
from halluci_mate.chess_tokenizer import ChessTokenizer
from halluci_mate.game.game import Game
from halluci_mate.inference import ChessInferenceEngine
engine = ChessInferenceEngine.from_checkpoint(
    "jspaulsen/halluci-mate-v1a",
    constrained=True,  # mask logits to legal moves
    temperature=0.0,   # greedy
)
game = Game(board=chess.Board(), condition="<WHITE>")
move = engine.predict(game)
print(move.uci())
Constrained decoding masks the logits to the set of legal UCI moves in the current position, which eliminates illegal-move hallucinations at the cost of potentially hiding model weaknesses. Unconstrained sampling (constrained=False) will occasionally produce illegal tokens — this is expected for an alpha.
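The masking step can be sketched in plain Python. This is a toy illustration of the general technique, not the ChessInferenceEngine implementation: the function names and the four-token vocabulary are hypothetical, and a real implementation would apply the same mask to the model's logit tensor, with the legal set taken from the current board position.

```python
import math


def mask_to_legal(logits, vocab, legal_moves):
    """Replace the score of every token that is not a legal move with -inf."""
    legal = set(legal_moves)
    return [score if token in legal else -math.inf
            for token, score in zip(vocab, logits)]


def greedy_pick(logits, vocab, legal_moves):
    """Greedy (temperature 0) choice restricted to legal moves."""
    masked = mask_to_legal(logits, vocab, legal_moves)
    best = max(range(len(masked)), key=masked.__getitem__)
    if masked[best] == -math.inf:
        raise ValueError("no legal move is present in the vocabulary")
    return vocab[best]


# Toy example: the raw argmax is e2e5, which is not a legal pawn move.
vocab = ["<EOS>", "e2e4", "e2e5", "g1f3"]
logits = [0.1, 1.2, 3.4, 0.7]
legal = ["e2e4", "g1f3"]

print(greedy_pick(logits, vocab, legal))  # e2e4
```

Because the mask removes illegal tokens before the choice is made, the output never shows how much probability the raw model placed on them, which is the trade-off described above.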
Training
- Data: Lichess games, filtered to Normal termination, SAN parsed to UCI with python-chess
- Model initialized from config (no pretrained weights) via AutoModelForCausalLM.from_config
- Training script: scripts/train.py in the source repo
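The data preparation step can be sketched with python-chess. This is a minimal example that assumes the filter is applied to the PGN Termination header, as found in Lichess exports. The function name san_game_to_uci is hypothetical, and the actual pipeline in the source repo may differ.

```python
import io

import chess.pgn


def san_game_to_uci(pgn_text):
    """Parse one PGN game; return its moves in UCI, or None if it is filtered out."""
    game = chess.pgn.read_game(io.StringIO(pgn_text))
    if game is None:
        return None  # no game found in the text
    if game.headers.get("Termination") != "Normal":
        return None  # drop time forfeits, abandoned games, and so on
    # python-chess resolves SAN against the board state while parsing,
    # so each mainline move can be rendered directly as UCI.
    return [move.uci() for move in game.mainline_moves()]


pgn = """[Event "Example"]
[Result "1-0"]
[Termination "Normal"]

1. e4 e5 2. Qh5 Nc6 3. Bc4 Nf6 4. Qxf7# 1-0
"""

print(san_game_to_uci(pgn))
```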
Hyperparameters, dataset size, and eval metrics are not finalized — see TODO.md in the source repo.
Limitations
- Alpha quality; move strength has not been benchmarked against a rated engine
- Constrained decoding is recommended for any real use — the raw model may emit illegal move tokens
- Trained on human games, so idiosyncrasies and blunders at lower ratings are reflected in behavior
- No support for analyzing positions from arbitrary FENs beyond what Game constructs
License
MIT. See the source repo for details.