Sentience.Cascade.II

Recursive Language Model (RLM) · Hybrid Mind Frame · 1.147B Parameters · 64K Context Window · Dual T4 Trained


Overview

Sentience.Cascade.II is not a Large Language Model (LLM).
It is a Recursive Language Model (RLM) — a novel architecture where every forward pass includes multiple self-recursive refinement steps, episodic short and long-term memory, and a fully wired Hybrid Mind module that runs as one integrated frame, not as sequential pipeline stages.

All cognitive subsystems operate inside a single unified forward pass.


Architecture

| Component | Detail |
| --- | --- |
| Architecture type | Recursive Language Model (RLM) |
| Parameters | ~1.147B |
| Context window | 64,000 tokens |
| Attention | Grouped Query Attention (16 heads / 4 KV heads) |
| Positional encoding | RoPE (θ=500,000) |
| FFN | SwiGLU |
| Normalisation | RMSNorm |
| Weight format | safetensors (float32 on disk, bfloat16 for training) |
| Vocabulary | 65,536 (BPE ByteLevel) |
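
To make the attention and positional-encoding rows concrete, the sketch below works out which KV head each of the 16 query heads would share and what the RoPE inverse frequencies look like at θ=500,000. The hidden size of 2048 is an assumption inferred from the 2048-wide projections in the Multimodal Support section, and the head dimension of 128 follows from it. The contiguous head grouping and all function names are also assumptions, not taken from the model code.

```python
NUM_HEADS = 16        # query heads (from the architecture table)
NUM_KV_HEADS = 4      # key/value heads (from the architecture table)
ROPE_THETA = 500_000  # RoPE base (from the architecture table)
HIDDEN_SIZE = 2048    # ASSUMPTION: inferred from the 2048-wide modality projections
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS  # 128 under that assumption


def kv_head_for(query_head: int) -> int:
    """Return the KV head a query head shares, assuming contiguous groups."""
    group_size = NUM_HEADS // NUM_KV_HEADS  # 4 query heads per KV head
    return query_head // group_size


def rope_inv_freq(head_dim: int = HEAD_DIM, theta: float = ROPE_THETA) -> list[float]:
    """Standard RoPE inverse frequencies: theta ** (-2i / head_dim)."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


print([kv_head_for(h) for h in range(NUM_HEADS)])
# → [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
```

With 4 KV heads instead of 16, the key/value cache is a quarter of the size it would be under full multi-head attention, which matters at a 64K context.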

Hybrid Mind Frame — Self-Automated (S.A.) Modules

All modules are active simultaneously inside each transformer layer. None are optional pipeline steps — they are weights baked into the model.

| Module | Role |
| --- | --- |
| S.A. Meta Learning Gate | Scales activation magnitude as a proxy learning signal |
| S.A. Reinforcement Learning Head | Scalar reward prediction per forward pass |
| S.A. Continual Learning Gate | Soft forgetting-protection via decay gates |
| S.A. Adaptive Learning Scale | Per-token hidden-state scaling |
| S.A. Rewrite Gate | Token-level hidden-state rewriting delta |
| S.A. NLP Head | Span boundary logits for structured extraction |
| S.A. Problem Solving Head | 8-class step-type classification |
| S.A. Innovation Noise | Trainable exploration noise (active during training only) |
| S.A. Debug Probe | 4-class anomalous activation detector |
| S.A. Advanced Short-Term Memory | 512-slot episodic rolling buffer |
| S.A. Advanced Long-Term Memory | 1024-slot consolidated episodic store |
| S.A. Recursive Seed Learning | Multi-step (×4) recursive refinement loop |
| S.A. Self-Evaluation & Reward | Scalar self-score head |
| S.A. Goal & Constraint Engine | Residual goal-projection delta |
| S.A. Memory Consolidation | Automatic STM→LTM every 8 layers |
| S.A. Introspection Interface | 64-dim interpretable summary of hidden state |
| S.A. Recursive Outer Loop Gate | Final gate before residual output |
| Conversational Intelligence | 32-class dialog-act classification head |
| MultiModal (Text/Image/Audio/Video) | Linear projection from ViT-L / mel-spec / video dims |

Recursive Language Model Core

Unlike a standard transformer that processes tokens once per layer, Sentience.Cascade.II applies a RecursiveSeedLayer after all transformer blocks. This layer runs num_recursive_steps=4 passes of attention + FFN with a shared-weight inner loop, allowing the model to internally "think again" before producing logits.

This is the defining feature of the RLM architecture:

Output is not produced after one pass — it is refined recursively.
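
The shared-weight inner loop can be illustrated with a toy sketch. It is not the RecursiveSeedLayer itself: `shared_block` is a hypothetical stand-in for the attention + FFN pass, and the residual update is an assumption. The only details taken from the text are the step count of 4 and the reuse of the same weights on every pass.

```python
import math

NUM_RECURSIVE_STEPS = 4  # num_recursive_steps from the description above


def shared_block(hidden: list[float], weights: list[list[float]]) -> list[float]:
    """Hypothetical stand-in for attention + FFN: tanh(W @ hidden)."""
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in weights]


def recursive_refine(
    hidden: list[float],
    weights: list[list[float]],
    steps: int = NUM_RECURSIVE_STEPS,
) -> list[float]:
    """Refine a hidden state by applying the SAME block `steps` times."""
    for _ in range(steps):
        update = shared_block(hidden, weights)  # identical weights on every pass
        hidden = [h + u for h, u in zip(hidden, update)]  # ASSUMPTION: residual update
    return hidden


weights = [[0.5, -0.25], [0.1, 0.3]]
refined = recursive_refine([1.0, -1.0], weights)
```

Because the weights are shared, the extra passes add compute but no parameters: four refinement steps cost roughly four times one block's forward pass while storing only one block's weights.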


Memory System

  • Short-Term Memory (512 slots): Updated every forward pass via a write gate.
    Cross-attended by every layer, giving the model persistent intra-context state.
  • Long-Term Memory (1024 slots): Consolidated from short-term every 8 layers via a separate consolidation gate with 0.99/0.01 EMA blend.
    Persists across training steps when fine-tuning.
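
A minimal sketch of the two memory tiers follows, using plain Python lists in place of tensors. The rolling buffer and the 0.99/0.01 blend come from the description above. Everything else is an assumption made for illustration: that 0.99 weights the old long-term value and 0.01 the new one, that the short-term buffer is summarised by a mean, and that the write gate is a scalar.

```python
from collections import deque

STM_SLOTS = 512    # short-term slots (from the description above)
LTM_KEEP = 0.99    # ASSUMPTION: weight on the existing long-term value
LTM_UPDATE = 0.01  # ASSUMPTION: weight on the incoming short-term summary


def make_stm(slots: int = STM_SLOTS) -> deque:
    """Rolling buffer: once full, each write evicts the oldest slot."""
    return deque(maxlen=slots)


def write_stm(stm: deque, vector: list[float], gate: float) -> None:
    """Write a gated vector into short-term memory (scalar gate is an assumption)."""
    stm.append([gate * v for v in vector])


def summarize(stm: deque) -> list[float]:
    """ASSUMPTION: summarise short-term memory as the mean over its slots."""
    return [sum(column) / len(stm) for column in zip(*stm)]


def consolidate(ltm_slot: list[float], stm_summary: list[float]) -> list[float]:
    """EMA blend of a long-term slot with the short-term summary."""
    return [LTM_KEEP * l + LTM_UPDATE * s for l, s in zip(ltm_slot, stm_summary)]
```

With a 0.01 update weight, a single consolidation moves a long-term slot only 1% of the way toward the new summary, so long-term content changes slowly relative to the short-term buffer.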

Multimodal Support

Three input projection heads accept external embeddings:

| Modality | Input dim | Projection |
| --- | --- | --- |
| Image | 1024 (ViT-L patch) | Linear → 2048 |
| Audio | 128 (mel-spectrogram) | Linear → 2048 |
| Video | 1024 (frame embedding) | Linear → 2048 |

The projected modality tokens act as prefix embeddings: concatenate them in front of the embedded input_ids.
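
The prefix scheme can be sketched with plain lists standing in for tensors. The input and output widths come from the table above. The helper names, the random weight initialisation, and the omission of a bias term are assumptions for illustration and do not reflect the model's actual projection code.

```python
import random

MODALITY_DIMS = {"image": 1024, "audio": 128, "video": 1024}  # from the table above
HIDDEN_SIZE = 2048                                            # projection output width


def make_projection(in_dim: int, out_dim: int = HIDDEN_SIZE, seed: int = 0):
    """Hypothetical helper: random (out_dim x in_dim) weight matrix, no bias."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]


def project(tokens, weight):
    """Apply the linear map to each modality token (one row per token)."""
    return [[sum(w * x for w, x in zip(row, tok)) for row in weight] for tok in tokens]


def prepend_modality(modality_tokens, weight, text_embeddings):
    """Concatenate projected modality tokens in front of the text embeddings."""
    return project(modality_tokens, weight) + text_embeddings
```

For example, two audio frames of width 128 placed ahead of three text embeddings would give a sequence of five 2048-wide vectors, and the prefix tokens count against the context window like any other position.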


Chat Template

<|system|>You are Sentience.Cascade.II, a recursive reasoning model.
<|user|>What is consciousness?
<|assistant|>
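
A small helper can assemble prompts in this format. The role markers are taken from the template above. The newline between turns mirrors how the template is displayed and is an assumption, as is the absence of any end-of-turn token. `build_prompt` is a hypothetical name, not part of the model's tokenizer.

```python
ROLES = ("system", "user", "assistant")


def build_prompt(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render messages as <|role|>content lines, one turn per line."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"<|{role}|>{message['content']}")
    if add_generation_prompt:
        parts.append("<|assistant|>")  # open assistant turn for the model to complete
    return "\n".join(parts)  # ASSUMPTION: turns are newline-separated


prompt = build_prompt([
    {"role": "system", "content": "You are Sentience.Cascade.II, a recursive reasoning model."},
    {"role": "user", "content": "What is consciousness?"},
])
```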

Fine-Tuning

This checkpoint is the base initialisation: the weights are randomly initialised and the tokenizer is bootstrapped. Train or fine-tune it on your domain corpus using standard causal-LM training.

Recommended fine-tune config:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir           = "./sc2-finetuned",
    per_device_train_batch_size = 1,
    gradient_accumulation_steps = 16,
    num_train_epochs     = 3,
    learning_rate        = 2e-4,
    lr_scheduler_type    = "cosine",
    warmup_ratio         = 0.03,
    bf16                 = True,   # native bf16 needs Ampere or newer; on T4 (Turing) GPUs fp16=True may be the practical choice
    gradient_checkpointing = True,
    save_strategy        = "steps",
    save_steps           = 500,
    logging_steps        = 10,
    report_to            = "none",
)
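
As a quick check on what this config implies, the effective batch size per optimizer step is the per-device batch size times the accumulation steps times the number of GPUs. The device count of 2 below is an assumption based on the "Dual T4" note in the header, and `effective_batch_size` is a hypothetical helper, not a transformers function.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Sequences contributing to each optimizer step under data parallelism."""
    return per_device * grad_accum * num_devices


# Values from the config above; num_devices=2 is an ASSUMPTION (dual T4).
print(effective_batch_size(per_device=1, grad_accum=16, num_devices=2))  # → 32
```

On a single GPU the same config gives 16 sequences per step, so gradient_accumulation_steps may need doubling to keep the effective batch size unchanged.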

Note: Because SentienceCascadeModel is a custom architecture, you will need to register it with the HuggingFace AutoModel registry or load it with trust_remote_code=True after placing the model code in the repo.


Citation

@misc{sentiencecascade2,
  author       = {GODsStrongestSoldier},
  title        = {Sentience.Cascade.II: A Recursive Language Model with Hybrid Mind Frame},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/GODsStrongestSoldier/Sentience.Cascade.II}},
}

License

Apache 2.0
