---
title: Continuous Thought Machine - Energy-Based Halting
emoji: πŸ•°οΈ
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: "20.10.21"
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# πŸ•°οΈ The Continuous Thought Machine

πŸ“š [PAPER: Technical Report](https://arxiv.org/abs/2505.05522) | πŸ“ [Blog](https://sakana.ai/ctm/) | πŸ•ΉοΈ [Interactive Website](https://pub.sakana.ai/ctm) | ✏️ [Tutorial](examples/01_mnist.ipynb)

## Overview

The **Continuous Thought Machine (CTM)** is a novel neural architecture designed to unfold and leverage neural activity as the underlying mechanism for observation and action. By introducing an internal temporal axis decoupled from the input data, the CTM enables neurons to process information over time with fine-grained temporal dynamics.

### Key Contributions

1. **Internal Temporal Axis**: Decoupled from input data, allowing neuron activity to unfold independently
2. **Neuron-Level Temporal Processing**: Each neuron uses unique weight parameters to process a history of incoming signals
3. **Neural Synchronisation**: Direct latent representation for modulating data and producing outputs, encoding information in the timing of neural activity

The CTM demonstrates strong performance across diverse tasks including ImageNet classification, 2D maze solving, sorting, parity computation, question-answering, and reinforcement learning.
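
The actual neuron-level models and synchronization code live in `models/ctm.py` and `models/modules.py`; the snippet below is only a conceptual sketch of contributions 2 and 3, with illustrative shapes and an assumed update rule, not the repository's implementation.

```python
import torch

# Conceptual sketch only: per-neuron weights over an activation history,
# plus a synchronization matrix built from post-activation traces.
D, M, T = 4, 3, 10                    # neurons, history length, internal ticks

per_neuron_w = torch.randn(D, M)      # each neuron has its own weights
pre_history = torch.zeros(D, M)       # rolling history of incoming signals
post_history = []

for t in range(T):
    pre = torch.randn(D)              # stand-in for the synapse output at tick t
    pre_history = torch.cat([pre_history[:, 1:], pre.unsqueeze(1)], dim=1)
    post = torch.tanh((per_neuron_w * pre_history).sum(dim=1))  # neuron-level temporal processing
    post_history.append(post)

# Neural synchronisation: pairwise inner products of the post-activation traces
# form a (D, D) latent representation used to modulate data and produce outputs.
Z = torch.stack(post_history, dim=1)  # (D, T)
sync = Z @ Z.T / T
```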

---

## πŸ”¬ Energy-Based Halting Experiment

This repository includes an implementation of **Energy-Based Halting**, a mechanism that frames "thinking" as an optimization process in which the model dynamically adjusts the duration of its internal thought process based on sample difficulty.

### Concept

Instead of relying on heuristic certainty thresholds, we train a learned energy head whose scalar output is:

- **Minimized** for correct predictions (pushing the system toward a low-energy equilibrium)
- **Maximized** for incorrect predictions (pushing it away from stable states)
- **Used for adaptive halting**, triggered by an energy threshold or by convergence (a minimal sketch follows below)
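
To make the objective concrete, here is a minimal sketch of such a contrastive energy loss. The actual `EnergyContrastiveLoss` in `utils/losses.py` may differ in form and weighting; the margin value and the hinge shape below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def energy_contrastive_loss(energy, logits, targets, margin=1.0):
    """Sketch of a contrastive energy objective (illustrative, not the
    repository's exact EnergyContrastiveLoss).

    energy:  (B,) scalar energy per sample from the energy head
    logits:  (B, C) class predictions at the current internal tick
    targets: (B,) ground-truth labels
    """
    correct = (logits.argmax(dim=-1) == targets).float()
    # Pull energy toward zero when the prediction is correct ...
    loss_correct = correct * energy.pow(2)
    # ... and push it above the margin when it is incorrect.
    loss_incorrect = (1.0 - correct) * F.relu(margin - energy).pow(2)
    return (loss_correct + loss_incorrect).mean()
```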

### Implementation

**Modified Components:**

- `models/ctm.py`: Added energy projection head that maps synchronization states to scalar energy values
- `utils/losses.py`: Implemented `EnergyContrastiveLoss` for training the energy function
- `tasks/image_classification/train_energy.py`: Training script with energy halting
- `inference_energy.py`: Adaptive inference that halts when energy drops below threshold or stabilizes
- `configs/energy_experiment.yaml`: Configuration for energy experiments
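
For intuition, the following sketch shows the kind of adaptive loop that `inference_energy.py` implements: run internal ticks, read the energy head, and stop early once the energy drops below a threshold or stops changing. The `model.step` interface and the default values are illustrative assumptions, not the script's actual API.

```python
import torch

@torch.no_grad()
def infer_with_energy_halting(model, x, max_ticks=50, energy_threshold=0.1,
                              patience=3, stability_eps=1e-3):
    """Adaptive inference sketch: halt on low or stabilized energy."""
    prev_energy, stable_ticks, logits = None, 0, None
    for tick in range(max_ticks):
        logits, energy = model.step(x, tick)   # assumed per-tick interface
        e = energy.mean().item()
        if e < energy_threshold:               # low-energy equilibrium reached
            break
        if prev_energy is not None and abs(e - prev_energy) < stability_eps:
            stable_ticks += 1
            if stable_ticks >= patience:       # energy has stabilized
                break
        else:
            stable_ticks = 0
        prev_energy = e
    return logits, tick + 1
```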

**Training:**

```bash
# Local training
pixi run accelerate launch tasks/image_classification/train_energy.py \
    --energy_head_enabled \
    --loss_type energy_contrastive \
    --dataset cifar10

# Or single-process, without Accelerate
pixi run python tasks/image_classification/train_energy.py \
    --energy_head_enabled \
    --loss_type energy_contrastive
```

**Deployment to Hugging Face:**
See [GUIDE_HF.md](GUIDE_HF.md) for instructions on deploying the training job to Hugging Face Spaces with GPU support.

---

## πŸš€ Quick Start

### Setup with Pixi (Recommended)

We use [Pixi](https://pixi.sh) for dependency management, which handles both Python packages and system dependencies like `ffmpeg`.

```bash
# Install dependencies
pixi install

# Run training
pixi run python tasks/image_classification/train.py
```

### Alternative: Conda Setup

```bash
conda create --name=ctm python=3.12
conda activate ctm
pip install -r requirements.txt
conda install -c conda-forge ffmpeg
```

If there are PyTorch version issues:

```bash
pip uninstall torch
pip install torch --index-url https://download.pytorch.org/whl/cu121
```

---

## πŸ“ Repository Structure

```
β”œβ”€β”€ tasks/
β”‚   β”œβ”€β”€ image_classification/
β”‚   β”‚   β”œβ”€β”€ train.py                    # Standard training
β”‚   β”‚   β”œβ”€β”€ train_energy.py             # Energy halting training
β”‚   β”‚   β”œβ”€β”€ analysis/run_imagenet_analysis.py
β”‚   β”‚   └── plotting.py
β”‚   β”œβ”€β”€ mazes/
β”‚   β”‚   β”œβ”€β”€ train.py
β”‚   β”‚   └── analysis/
β”‚   β”œβ”€β”€ sort/
β”‚   β”œβ”€β”€ parity/
β”‚   β”œβ”€β”€ qamnist/
β”‚   └── rl/
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ ctm.py                          # Main CTM model (with energy head support)
β”‚   β”œβ”€β”€ modules.py                      # Neuron-level models, Synapse UNET
β”‚   β”œβ”€β”€ ff.py                           # Feed-forward baseline
β”‚   └── lstm.py                         # LSTM baseline
β”œβ”€β”€ utils/
β”‚   β”œβ”€β”€ losses.py                       # Loss functions (includes EnergyContrastiveLoss)
β”‚   β”œβ”€β”€ schedulers.py
β”‚   └── housekeeping.py
β”œβ”€β”€ data/
β”‚   └── custom_datasets.py
β”œβ”€β”€ configs/
β”‚   └── energy_experiment.yaml          # Energy halting hyperparameters
β”œβ”€β”€ inference_energy.py                 # Adaptive energy-based inference
β”œβ”€β”€ Dockerfile                          # For HF Spaces deployment
β”œβ”€β”€ GUIDE_HF.md                         # Hugging Face deployment guide
└── checkpoints/                        # Model checkpoints
```

---

## 🎯 Model Training

Each task has dedicated training code designed for ease-of-use and collaboration. Training scripts include reasonable defaults, with paper-replicating configurations in accompanying script folders.

### Image Classification Example

```bash
# Standard CTM training
python -m tasks.image_classification.train

# Energy halting training
python -m tasks.image_classification.train_energy \
    --energy_head_enabled \
    --loss_type energy_contrastive
```

### VSCode Debug Configuration

```json
{
  "name": "Debug: train image classifier",
  "type": "debugpy",
  "request": "launch",
  "module": "tasks.image_classification.train",
  "console": "integratedTerminal",
  "justMyCode": false
}
```

---

## πŸ” Analysis & Visualization

Analysis and plotting code to replicate paper figures is provided in `tasks/.../analysis/*`.

**Note:** `ffmpeg` is required for generating videos:

```bash
conda install -c conda-forge ffmpeg
# or with pixi (already included)
pixi install
```

---

## πŸ“¦ Checkpoints and Data

Download pre-trained checkpoints and datasets:

- **Checkpoints**: [Google Drive](https://drive.google.com/drive/folders/1vSg8T7FqP-guMDk1LU7_jZaQtXFP9sZg)
- **Maze Data**: [Google Drive](https://drive.google.com/file/d/1cBgqhaUUtsrll8-o2VY42hPpyBcfFv86/view?usp=drivesdk)

Place checkpoints in the `checkpoints/` folder following the structure `checkpoints/{task}/...`

---

## πŸ€— Hugging Face Integration

This repository includes full support for training on Hugging Face infrastructure:

- **Accelerate**: Multi-GPU and mixed precision training
- **Hub Integration**: Automatic checkpoint uploading
- **Spaces Deployment**: Run training jobs on GPU Spaces

See [GUIDE_HF.md](GUIDE_HF.md) for detailed instructions.
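
As an example of the Hub integration, a checkpoint produced by a training run can be pushed with the `huggingface_hub` client; the repository id and file paths below are placeholders, and [GUIDE_HF.md](GUIDE_HF.md) describes the workflow actually used in this project.

```python
from huggingface_hub import HfApi

api = HfApi()  # uses the token from HF_TOKEN or `huggingface-cli login`
api.upload_file(
    path_or_fileobj="checkpoints/image_classification/checkpoint.pt",  # local file
    path_in_repo="checkpoint.pt",                                      # destination in the repo
    repo_id="your-username/ctm-energy-halting",                        # placeholder repo id
    repo_type="model",
)
```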

---

## πŸ“– Interactive Resources

- **[Interactive Website](https://pub.sakana.ai/ctm)**: Maze-solving demo, videos, and visualizations
- **[Paper](https://arxiv.org/abs/2505.05522)**: Technical details and experiments
- **[Blog](https://sakana.ai/ctm/)**: High-level overview and insights
- **[Tutorial Notebook](examples/01_mnist.ipynb)**: Hands-on introduction

---

## πŸ™ Citation
### The Continuous Thought Machine: Energy-Based Halting Extension

This repository contains experimental extensions for Energy-Based Halting developed by **Uday Phalak**.

```bibtex
@misc{ctmenergy2025,
  title={Energy-Based Halting for Continuous Thought Machines},
  author={Phalak, Uday},
  year={2025},
  note={Experimental Extension of Continuous Thought Machines}
}
```

Based on the original Continuous Thought Machine:
```bibtex
@article{ctm2025,
  title={The Continuous Thought Machine},
  author={Darlow, Luke and Regan, Ciaran and Risi, Sebastian and Seely, Jeffrey and Jones, Llion},
  journal={arXiv preprint arXiv:2505.05522},
  year={2025}
}
```




---

## πŸ“ License

This project is released under the MIT License. See the LICENSE file for details.