Harmonic Convergence: Mamba-3 PRIME Baremetal
This is a 300M-parameter Mamba-3 architecture trained exclusively with the discrete PRIME lattice optimizer (integer voting).
⚠️ CRITICAL WARNING: Do NOT attempt to load this model using transformers or AutoModelForCausalLM. This model stores custom discrete integer weights (uint16_t indices into a harmonic prime LUT) instead of standard FP32 weights. Standard PyTorch/HF loaders will crash or load random noise.
This repository is designed for baremetal execution. The model has been exported to a highly compressed monolithic .bin file, optimized for AVX-512 integer-indexing in pure C.
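To make the idea of integer-indexed weights concrete, here is a minimal scalar sketch of a matrix-vector product in which each weight is a uint16_t index into a float LUT. The function name lut_matvec and the row-major index layout are assumptions for illustration; this is not the code in prime_kernel.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for a LUT-indexed matrix-vector product.
 * Assumption: weights are stored row-major as uint16_t indices and the
 * real-valued weight is lut[index]. The shipped kernel may differ. */
void lut_matvec(const uint16_t *w_idx, const float *lut,
                const float *x, float *y, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; ++r) {
        const uint16_t *row = w_idx + r * cols;
        float acc = 0.0f;
        for (size_t c = 0; c < cols; ++c) {
            /* Look up the float value for this index, then multiply-add. */
            acc += lut[row[c]] * x[c];
        }
        y[r] = acc;
    }
}
```

A uint16_t can address exactly 65,536 entries, so with a full-size LUT every index is in range and no bounds check is needed in the inner loop. A vectorized version could plausibly replace the lookup with AVX-512 gather instructions, though that is a guess about the approach rather than a description of the included kernel.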
Files Included
- prime_mamba3_25000.bin: The monolithic, fully trained model weights (Step 25,000). Highly compressed (769MB) using uint16_t indices.
- prime_inference.c: The baremetal C inference wrapper that mmaps the .bin file.
- prime_kernel.c: The core AVX-512 C kernel for executing the PRIME discrete integer matrix multiplications.
- build_kernel.sh: Compilation instructions for the C environment.
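For readers unfamiliar with memory-mapping a weights file, the following is a minimal POSIX sketch of mapping a whole file read-only. The helper name prime_map_file is hypothetical and is not taken from prime_inference.c; the wrapper's real code may handle errors and flags differently.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map an entire file read-only.
 * Returns the mapping and stores its size in *len_out, or returns NULL
 * on failure. prime_map_file is a hypothetical name for illustration. */
const void *prime_map_file(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}
```

Mapping the file instead of reading it into a buffer lets the OS page weights in on demand, which matters for a file of several hundred megabytes. The caller releases the mapping with munmap when it is done.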
Baremetal Execution
To run the model natively on a CPU using the included AVX-512 kernel:
# 1. Compile the baremetal C engine
gcc -O3 -march=native -mavx512f -mavx512bw -mavx512dq -fopenmp -ffast-math prime_kernel.c prime_inference.c -o prime_inference -lm
# 2. Execute against the monolithic binary
./prime_inference prime_mamba3_25000.bin
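Because the build enables AVX-512 instructions, the binary will fault with an illegal-instruction error on CPUs that lack them. As a hedged sketch, a small pre-flight check can be written with the GCC/Clang x86 builtins; the function name prime_cpu_has_avx512 is hypothetical and is not part of this repository.

```c
/* Returns 1 when the CPU reports AVX-512 F, BW and DQ support, else 0.
 * Relies on GCC/Clang builtins; on non-x86 targets it always returns 0.
 * prime_cpu_has_avx512 is a hypothetical helper for illustration. */
int prime_cpu_has_avx512(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f") &&
           __builtin_cpu_supports("avx512bw") &&
           __builtin_cpu_supports("avx512dq");
#else
    return 0;
#endif
}
```

A check like this is only dependable when the file containing it is compiled without the AVX-512 flags, since the compiler is otherwise free to emit AVX-512 instructions before the check runs.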
Binary Layout Structure
For developers building custom bootloaders or OS kernels (e.g., llm-baremetal-interactive.img), the prime_mamba3_25000.bin file follows this contiguous memory layout:
- Header (256 bytes): Contains the 0x5052494D ("PRIM") magic number and the Config struct (d_model, n_layers, vocab_size, lut_size).
- LUT: 65,536 float32 prime harmonic points.
- Embeddings: vocab_size * d_model standard float32.
- Layers 0-27: Interleaved standard weights (float32) and compressed discrete weights (uint16_t for in_proj and out_proj).
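The layout above can be turned into byte offsets with a few lines of C. The sketch below validates the magic number and computes where the LUT, embeddings, and layer data begin. Several details are assumptions not stated in the layout: that the magic is a 32-bit value at offset 0 in host byte order, that the four Config fields follow immediately as 32-bit unsigned integers in the listed order, and all type and function names (PrimeConfig, PrimeOffsets, prime_parse_header).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PRIME_MAGIC       0x5052494Du /* "PRIM", from the layout above */
#define PRIME_HEADER_SIZE 256u

/* Assumption: four 32-bit fields in this order, right after the magic. */
typedef struct {
    uint32_t d_model;
    uint32_t n_layers;
    uint32_t vocab_size;
    uint32_t lut_size;
} PrimeConfig;

/* Byte offsets of each section from the start of the file. */
typedef struct {
    size_t lut;
    size_t embeddings;
    size_t layers;
} PrimeOffsets;

/* Returns 0 on success, -1 if the buffer is too short or the magic
 * does not match. Reads values in host byte order (an assumption). */
int prime_parse_header(const uint8_t *buf, size_t len,
                       PrimeConfig *cfg, PrimeOffsets *off)
{
    uint32_t magic;

    if (len < PRIME_HEADER_SIZE)
        return -1;
    memcpy(&magic, buf, sizeof magic);
    if (magic != PRIME_MAGIC)
        return -1;
    memcpy(cfg, buf + sizeof magic, sizeof *cfg);

    off->lut = PRIME_HEADER_SIZE;
    off->embeddings = off->lut + (size_t)cfg->lut_size * sizeof(float);
    off->layers = off->embeddings +
                  (size_t)cfg->vocab_size * cfg->d_model * sizeof(float);
    return 0;
}
```

With lut_size equal to 65,536, the embeddings would start at byte 256 + 65,536 * 4 = 262,400. The per-layer interleaving of float32 and uint16_t tensors is not specified in enough detail here to compute offsets inside the layer region, so the sketch stops at its start.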
Training Context
This model was trained to syntactically lock onto C/C++ architecture for Operating System Homeostasis generation. It successfully leverages discrete integer updates (SUPERMAJORITY voting) to prevent vanishing gradients over 25,000 steps.