---
title: Continuous Thought Machine - Energy-Based Halting
emoji: 🛰️
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: "20.10.21"
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🛰️ The Continuous Thought Machine
📄 [PAPER: Technical Report](https://arxiv.org/abs/2505.05522) | 📝 [Blog](https://sakana.ai/ctm/) | 🕹️ [Interactive Website](https://pub.sakana.ai/ctm) | ✏️ [Tutorial](examples/01_mnist.ipynb)
## Overview
The **Continuous Thought Machine (CTM)** is a novel neural architecture designed to unfold and leverage neural activity as the underlying mechanism for observation and action. By introducing an internal temporal axis decoupled from input data, CTM enables neurons to process information over time with fine-grained temporal dynamics.
### Key Contributions
1. **Internal Temporal Axis**: Decoupled from input data, allowing neuron activity to unfold independently
2. **Neuron-Level Temporal Processing**: Each neuron uses unique weight parameters to process a history of incoming signals
3. **Neural Synchronisation**: Direct latent representation for modulating data and producing outputs, encoding information in the timing of neural activity
The CTM demonstrates strong performance across diverse tasks including ImageNet classification, 2D maze solving, sorting, parity computation, question-answering, and reinforcement learning.
---
## 🔬 Energy-Based Halting Experiment
This repository includes an implementation of **Energy-Based Halting**, a mechanism that frames "thinking" as an optimization process where the model dynamically adjusts its internal thought process duration based on sample difficulty.
### Concept
Instead of relying on heuristic certainty thresholds, we train a scalar energy function that:
- **Minimizes energy** for correct predictions (pushing the system to low-energy equilibrium)
- **Maximizes energy** for incorrect predictions (pushing away from stable states)
- **Enables adaptive halting** based on energy thresholds or convergence
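The halting rule sketched in the bullets above can be written as a small decision loop. This is an illustrative sketch only, not the repository's actual API (which lives in `inference_energy.py`); the function name, thresholds, and the `patience`-window convergence test are all assumptions chosen for clarity.

```python
def energy_halt(energy_trace, threshold=0.1, patience=3, tol=1e-3):
    """Decide how many internal ticks to run, given per-tick energies.

    Halts when the energy falls below `threshold`, or when the last
    `patience` energies vary by less than `tol` (i.e. the energy has
    stabilized). All names and default values here are illustrative.
    """
    history = []
    for tick, energy in enumerate(energy_trace, start=1):
        history.append(energy)
        if energy < threshold:
            return tick, "below_threshold"
        window = history[-patience:]
        if len(window) == patience and max(window) - min(window) < tol:
            return tick, "converged"
    return len(history), "exhausted"
```

An easy sample whose energy decays quickly halts early via the threshold branch; a sample whose energy plateaus halts via the convergence branch; anything else runs to the end of the tick budget.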
### Implementation
**Modified Components:**
- `models/ctm.py`: Added energy projection head that maps synchronization states to scalar energy values
- `utils/losses.py`: Implemented `EnergyContrastiveLoss` for training the energy function
- `tasks/image_classification/train_energy.py`: Training script with energy halting
- `inference_energy.py`: Adaptive inference that halts when energy drops below threshold or stabilizes
- `configs/energy_experiment.yaml`: Configuration for energy experiments
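To make the contrastive objective concrete, here is a toy NumPy version of the idea behind `EnergyContrastiveLoss`: pull the energies of correct predictions toward zero, and hinge-push the energies of incorrect predictions above a margin. The real implementation in `utils/losses.py` is the authoritative one; the function signature and `margin` default below are assumptions.

```python
import numpy as np

def energy_contrastive_loss(energies, is_correct, margin=1.0):
    """Toy contrastive objective on scalar energies (illustrative only).

    Correct samples contribute their raw energy (minimized toward zero);
    incorrect samples contribute a hinge penalty when their energy sits
    below `margin` (pushed up, away from the low-energy equilibrium).
    """
    energies = np.asarray(energies, dtype=float)
    is_correct = np.asarray(is_correct, dtype=bool)
    pull = energies[is_correct].mean() if is_correct.any() else 0.0
    hinge = np.maximum(0.0, margin - energies[~is_correct])
    push = hinge.mean() if (~is_correct).any() else 0.0
    return pull + push
```

With energies `[0.2, 0.9, 0.1, 1.5]` and correctness `[True, False, True, False]`, the correct pair contributes `0.15` and only the first incorrect sample (energy below the margin) is penalized.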
**Training:**
```bash
# Local training
pixi run accelerate launch tasks/image_classification/train_energy.py \
--energy_head_enabled \
--loss_type energy_contrastive \
--dataset cifar10
# Or with traditional python
pixi run python tasks/image_classification/train_energy.py \
--energy_head_enabled \
--loss_type energy_contrastive
```
**Deployment to Hugging Face:**
See [GUIDE_HF.md](GUIDE_HF.md) for instructions on deploying the training job to Hugging Face Spaces with GPU support.
---
## 🚀 Quick Start
### Setup with Pixi (Recommended)
We use [Pixi](https://pixi.sh) for dependency management, which handles both Python packages and system dependencies like `ffmpeg`.
```bash
# Install dependencies
pixi install
# Run training
pixi run python tasks/image_classification/train.py
```
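For reference, a Pixi manifest covering Python plus a system dependency like `ffmpeg` looks roughly like the sketch below. This is not the repository's actual `pixi.toml` (that file is authoritative); the task name and version pins are assumptions.

```toml
# Illustrative pixi.toml sketch -- the repository's own file is authoritative.
[project]
name = "ctm"
channels = ["conda-forge"]
platforms = ["linux-64"]

[dependencies]
python = "3.12.*"
ffmpeg = "*"          # system dependency, installed from conda-forge

[tasks]
train = "python tasks/image_classification/train.py"
```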
### Alternative: Conda Setup
```bash
conda create --name=ctm python=3.12
conda activate ctm
pip install -r requirements.txt
conda install -c conda-forge ffmpeg
```
If there are PyTorch version issues:
```bash
pip uninstall torch
pip install torch --index-url https://download.pytorch.org/whl/cu121
```
---
## 📁 Repository Structure
```
├── tasks/
│   ├── image_classification/
│   │   ├── train.py                 # Standard training
│   │   ├── train_energy.py          # Energy halting training
│   │   ├── analysis/run_imagenet_analysis.py
│   │   └── plotting.py
│   ├── mazes/
│   │   ├── train.py
│   │   └── analysis/
│   ├── sort/
│   ├── parity/
│   ├── qamnist/
│   └── rl/
├── models/
│   ├── ctm.py          # Main CTM model (with energy head support)
│   ├── modules.py      # Neuron-level models, Synapse UNET
│   ├── ff.py           # Feed-forward baseline
│   └── lstm.py         # LSTM baseline
├── utils/
│   ├── losses.py       # Loss functions (includes EnergyContrastiveLoss)
│   ├── schedulers.py
│   └── housekeeping.py
├── data/
│   └── custom_datasets.py
├── configs/
│   └── energy_experiment.yaml   # Energy halting hyperparameters
├── inference_energy.py   # Adaptive energy-based inference
├── Dockerfile            # For HF Spaces deployment
├── GUIDE_HF.md           # Hugging Face deployment guide
└── checkpoints/          # Model checkpoints
```
---
## 🎯 Model Training
Each task has dedicated training code designed for ease-of-use and collaboration. Training scripts include reasonable defaults, with paper-replicating configurations in accompanying script folders.
### Image Classification Example
```bash
# Standard CTM training
python -m tasks.image_classification.train
# Energy halting training
python -m tasks.image_classification.train_energy \
--energy_head_enabled \
--loss_type energy_contrastive
```
### VSCode Debug Configuration
```json
{
"name": "Debug: train image classifier",
"type": "debugpy",
"request": "launch",
"module": "tasks.image_classification.train",
"console": "integratedTerminal",
"justMyCode": false
}
```
---
## 📊 Analysis & Visualization
Analysis and plotting code to replicate paper figures is provided in `tasks/.../analysis/*`.
**Note:** `ffmpeg` is required for generating videos:
```bash
conda install -c conda-forge ffmpeg
# or with pixi (already included)
pixi install
```
---
## 📦 Checkpoints and Data
Download pre-trained checkpoints and datasets:
- **Checkpoints**: [Google Drive](https://drive.google.com/drive/folders/1vSg8T7FqP-guMDk1LU7_jZaQtXFP9sZg)
- **Maze Data**: [Google Drive](https://drive.google.com/file/d/1cBgqhaUUtsrll8-o2VY42hPpyBcfFv86/view?usp=drivesdk)
Place checkpoints in the `checkpoints/` folder following the structure `checkpoints/{task}/...`.
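For example, the layout might look like the tree below (the task folder and checkpoint filenames here are hypothetical; match them to the archives you download):

```
checkpoints/
├── imagenet/
│   └── checkpoint.pt
└── mazes/
    └── checkpoint.pt
```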
---
## 🤗 Hugging Face Integration
This repository includes full support for training on Hugging Face infrastructure:
- **Accelerate**: Multi-GPU and mixed precision training
- **Hub Integration**: Automatic checkpoint uploading
- **Spaces Deployment**: Run training jobs on GPU Spaces
See [GUIDE_HF.md](GUIDE_HF.md) for detailed instructions.
---
## 🌐 Interactive Resources
- **[Interactive Website](https://pub.sakana.ai/ctm)**: Maze-solving demo, videos, and visualizations
- **[Paper](https://arxiv.org/abs/2505.05522)**: Technical details and experiments
- **[Blog](https://sakana.ai/ctm/)**: High-level overview and insights
- **[Tutorial Notebook](examples/01_mnist.ipynb)**: Hands-on introduction
---
## 📝 Citation
### The Continuous Thought Machine : Energy-Based Halting Extension
This repository contains experimental extensions for Energy-Based Halting developed by **Uday Phalak**.
```bibtex
@misc{ctmenergy2025,
title={Energy-Based Halting for Continuous Thought Machines},
author={Phalak, Uday},
year={2025},
note={Experimental Extension of Continuous Thought Machines}
}
```
Based on the original work, *The Continuous Thought Machine*:
```bibtex
@article{ctm2025,
title={The Continuous Thought Machine},
author={Darlow, Luke and Regan, Ciaran and Risi, Sebastian and Seely, Jeffrey and Jones, Llion},
journal={arXiv preprint arXiv:2505.05522},
year={2025}
}
```
---
## 📜 License
This project is released under the MIT License. See LICENSE file for details.