NeuralHorner (bitserial-modmul-v8)
A single modulus-conditioned recurrent cell that computes (a * b) mod p across primes it never trained on,
by running one learned per-step transition inside a fixed bit-serial Horner loop.
- One bidirectional two-layer GRU cell (about 471K parameters), conditioned on the modulus p.
- It learns only the per-step transition s' = (2s + d*x) mod p; the loop schedule (reduce a, reduce b, multiply the two residues) is fixed by hand. The claim is the learned per-step transition and its cross-prime transfer, not discovery of the loop.
- Dynamic-L inference sizes the per-step state width to each prime's bit-length (the dropped bits are always zero), which keeps every run under the time budget.
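For orientation, the integer algorithm that the learned cell stands in for can be sketched in plain Python. This is a minimal reference under stated assumptions, not the model's code: the function names (horner_mod, msb_first_bits, modmul_bitserial, state_width) are hypothetical, the reduce steps are assumed to use the same transition with x = 1, and bits are assumed to be consumed most significant first.

```python
def horner_mod(bits, x, p):
    """Fold MSB-first bits with s' = (2*s + d*x) mod p, starting from s = 0."""
    s = 0
    for d in bits:
        s = (2 * s + d * x) % p
    return s


def msb_first_bits(n):
    """Binary digits of a non-negative integer, most significant first."""
    return [int(c) for c in bin(n)[2:]]


def modmul_bitserial(a, b, p):
    """Fixed schedule: reduce a, reduce b, then multiply the two residues."""
    ra = horner_mod(msb_first_bits(a), 1, p)  # a mod p (assumed: x = 1)
    rb = horner_mod(msb_first_bits(b), 1, p)  # b mod p (assumed: x = 1)
    # Horner over the bits of ra with multiplicand rb gives (ra * rb) mod p.
    return horner_mod(msb_first_bits(ra), rb, p)


def state_width(p):
    """Bits needed to hold any residue in [0, p): the prime's bit-length."""
    return p.bit_length()


# Sanity check against Python's built-in big-integer arithmetic.
a, b, p = 123456789, 987654321, 1000000007
assert modmul_bitserial(a, b, p) == (a * b) % p
```

In this sketch every intermediate state is a residue below p, so it fits in state_width(p) bits, which is the reason a narrower state suffices for smaller primes.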
weights.pt md5: 8fc8ace7d74538b66ef5980b4e9cd013
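A downloaded checkpoint can be compared against the digest above before loading it. The following is a small sketch using only the Python standard library; the helper names md5_of_file and verify_weights are hypothetical and not part of model.py.

```python
import hashlib
from pathlib import Path

EXPECTED_MD5 = "8fc8ace7d74538b66ef5980b4e9cd013"  # digest published in this card


def md5_of_file(path, chunk_size=1 << 20):
    """Stream a file through MD5 so a large checkpoint need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path="weights.pt", expected=EXPECTED_MD5):
    """Return True when the file exists and its digest matches the expected value."""
    p = Path(path)
    return p.is_file() and md5_of_file(p) == expected
```

Calling verify_weights() in the directory that holds weights.pt returns False if the file is missing or differs from the published revision.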
Results (official open-source scorer, single rented H100 via RunPod)
- All ten scored tiers at exact-match 1.00 (highest_tier_above_90 = 10), reproduced across three scorer-operand seeds, deterministic. Each full run completes in 163 to 174 seconds against a 300 second budget.
- Cross-prime transfer: 480/480 exact on fresh primes across 64 to 2048-bit widths.
- Anti-cheat: randomizing the weights collapses every tier from 64/64 to 0/64, so the capability sits in the trained parameters, not a hand-coded circuit.
- bf16 decision-safety: 0 flipped answers versus fp32 (min |logit| = 3.017).
Scope and known limits
The model is not claimed to be exact on every input, and its known weaknesses are stated plainly here. A held-out adversarial battery
of six disjoint operand families (768 cases) scores 759/768; the failures concentrate at
power-of-two-adjacent (Fermat) operands, a single high-wrap transition. A Tier-0 pure-multiplication probe
(operands whose product is smaller than the modulus, so no reduction occurs) scores 40/100, so the claim is
scoped to modular multiplication on the scored distribution, not general large-integer multiplication. Full
ablations, the failure localization, and a machine-checked Lean proof of the integer algorithm are in the
paper and the code repository.
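To make the two probe types above concrete, here is a hedged sketch of how such operands could be generated. It is not the battery used in the paper: the function names are hypothetical, the sampling scheme is an assumption, and "power-of-two-adjacent" is read here as values of the form 2^k - 1, 2^k, and 2^k + 1.

```python
import math
import random


def tier0_cases(p, n, seed=0):
    """Draw n pairs (a, b) with a * b < p, so (a * b) mod p needs no reduction."""
    rng = random.Random(seed)
    # Both operands at most floor(sqrt(p - 1)) guarantees a * b <= p - 1.
    bound = math.isqrt(p - 1)
    return [(rng.randint(1, bound), rng.randint(1, bound)) for _ in range(n)]


def power_of_two_adjacent(max_bits):
    """Sorted operands 2**k - 1, 2**k, 2**k + 1 for k = 1..max_bits (assumed family)."""
    values = set()
    for k in range(1, max_bits + 1):
        values.update((2**k - 1, 2**k, 2**k + 1))
    return sorted(values)


# Every Tier-0 pair satisfies (a * b) mod p == a * b by construction.
p = 1000003
assert all((a * b) % p == a * b for a, b in tier0_cases(p, 100))
```

Comparing model output against (a * b) % p on such cases separates plain multiplication errors from reduction errors, which is the distinction the Tier-0 probe draws.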
Usage
import importlib.util

# Load model.py from the current directory as a module named "model".
spec = importlib.util.spec_from_file_location("model", "model.py")
m = importlib.util.module_from_spec(spec)
spec.loader.exec_module(m)

model = m.BitSerialReducer()
model.load(".")  # loads weights.pt from this directory
# inputs are (preprocess_a(a), preprocess_b(b), preprocess_p(p)); see model.py for the I/O contract
Links
- Code and paper: https://github.com/Robby955/neural-horner
Citation
@misc{robert_sneiderman_2026,
author = {Robert Sneiderman},
title = {bitserial-modmul-v8 (Revision b49812c)},
year = 2026,
url = {https://huggingface.co/TrickyRex/bitserial-modmul-v8},
doi = {10.57967/hf/9357},
publisher = {Hugging Face}
}
License: MIT, Copyright (c) 2026 Robert Sneiderman.