	---
	license: apache-2.0
	---
	# MobiusNet

	A vision architecture built on continuous topological principles, replacing traditional activations with wave-based interference gating.

	## Overview

	MobiusNet introduces a fundamentally different approach to neural network design:

	- MobiusLens: Wave superposition as a gating mechanism, replacing standard activations (ReLU, GELU)
	- Thirds Mask: Cantor-inspired fractal channel suppression for regularization
	- Continuous Topology: Layers sample a continuous manifold via the `t` parameter, not discrete units
	- Twist Rotations: Smooth rotation through representation space across network depth
	- Integrator: An optional Conv → BN → GELU block ahead of the classifier head (`use_integrator=True`), used experimentally to add a GELU nonlinearity on top of the wave gating

	## Performance

	| Model | Params | GFLOPs | Tiny ImageNet |
	|-------|--------|--------|---------------|
	| MobiusNet-Base | 33.7M | 2.69 | TBD |

	## Installation

	```bash
	pip install torch torchvision safetensors huggingface_hub tensorboard tqdm
	```

	## Quick Start

	### Training

	```python
	from mobius_trainer_full import train_tiny_imagenet

	model, best_acc = train_tiny_imagenet(
	    preset='mobius_base',
	    epochs=200,
	    lr=3e-4,
	    batch_size=128,
	    use_integrator=True,
	    data_dir='./data/tiny-imagenet-200',
	    output_dir='./outputs',
	    hf_repo='AbstractPhil/mobiusnet',
	    save_every_n_epochs=10,
	    upload_every_n_epochs=10,
	)
	```

	### Continue from Checkpoint

	```python
	from mobius_trainer_full import train_tiny_imagenet

	# From local directory
	model, best_acc = train_tiny_imagenet(
	    preset='mobius_base',
	    epochs=200,
	    continue_from="./outputs/checkpoints/mobius_base_tiny_imagenet/20240101_120000",
	)

	# From HuggingFace (auto-downloads)
	model, best_acc = train_tiny_imagenet(
	    preset='mobius_base',
	    epochs=200,
	    continue_from="checkpoints/mobius_base_tiny_imagenet/20240101_120000",
	)
	```

	### Inference

	```python
	import torch
	from safetensors.torch import load_file
	from mobius_trainer_full import MobiusNet, PRESETS

	# Load model (200 classes for Tiny ImageNet)
	config = PRESETS['mobius_base']
	model = MobiusNet(num_classes=200, use_integrator=True, **config)
	state_dict = load_file("best_model.safetensors")
	model.load_state_dict(state_dict)
	model.eval()

	# Inference; image_tensor is assumed to be a preprocessed float batch of shape (N, 3, H, W)
	with torch.no_grad():
	    logits = model(image_tensor)
	    pred = logits.argmax(dim=1)  # predicted class index per image
	```

	## Model Presets

	| Preset | Channels | Depths | ~Params |
	|--------|----------|--------|---------|
	| `mobius_tiny_s` | (64, 128, 256) | (2, 2, 2) | 500K |
	| `mobius_tiny_m` | (64, 128, 256, 512, 768) | (2, 2, 4, 2, 2) | 11M |
	| `mobius_tiny_l` | (96, 192, 384, 768) | (3, 3, 3, 3) | 8M |
	| `mobius_base` | (128, 256, 512, 768, 1024) | (2, 2, 2, 2, 2) | 33.7M |

	## Architecture

	```
	Input
	                 │
	                 ▼
	┌─────────────────────────────────┐
	│  Stem (Conv → BN)               │
	└─────────────────────────────────┘
	                 │
	                 ▼
	┌─────────────────────────────────┐
	│  Stage 1-N                      │
	│ ┌─────────────────────────────┐ │
	│ │ MobiusConvBlock (×depth)    │ │
	│ │  ├─ Depthwise-Sep Conv      │ │
	│ │  ├─ BatchNorm               │ │
	│ │  ├─ MobiusLens (wave gate)  │ │
	│ │  ├─ Thirds Mask             │ │
	│ │  └─ Learned Residual        │ │
	│ └─────────────────────────────┘ │
	│  Downsample (stride-2 conv)     │
	└─────────────────────────────────┘
	                 │
	                 ▼
	┌─────────────────────────────────┐
	│  Integrator (Conv → BN → GELU)  │ ← Task collapse
	└─────────────────────────────────┘
	                 │
	                 ▼
	┌─────────────────────────────────┐
	│  Pool → Linear → Classes        │
	└─────────────────────────────────┘
	```

	## Core Components

	### MobiusLens

	Wave-based gating mechanism with three interference paths:

	```python
	# Pseudocode: three wave paths with different drifts
	L = wave(phase_l, drift_l)  # Left path (+1 drift)
	M = wave(phase_m, drift_m)  # Middle path (0 drift, ghost)
	R = wave(phase_r, drift_r)  # Right path (-1 drift)

	# Interference
	xor_comp = abs(L + R - 2 * L * R)  # Differentiable XOR
	and_comp = L * R                   # Differentiable AND

	# Gating
	gate = weighted_sum(L, M, R) * interference_blend
	output = input * sigmoid(layernorm(gate))
	```

	The middle path (M) acts as a "ghost": it stays present but is diminished, which maintains gradient continuity while biasing information flow toward the L/R edges, loosely analogous to the way the Cantor set construction removes middle thirds.
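
The interference terms can be checked on plain scalars. The sketch below is a minimal, hypothetical illustration rather than the repository's `MobiusLens`: it assumes path values in [0, 1], uses arbitrary illustrative values for `weights` and `blend`, and omits the LayerNorm step because it is undefined for a single scalar.

```python
import math


def soft_xor(l, r):
    """|l + r - 2lr|: matches boolean XOR when l and r are exactly 0 or 1."""
    return abs(l + r - 2.0 * l * r)


def soft_and(l, r):
    """l * r: matches boolean AND when l and r are exactly 0 or 1."""
    return l * r


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lens_gate(x, l, m, r, weights=(1.0, 0.5, 1.0), blend=0.5):
    """Scalar gate: x * sigmoid(weighted_sum * interference_blend).

    `weights` and `blend` are hypothetical values chosen for illustration;
    the middle weight is smaller to mimic the diminished "ghost" path.
    """
    w_l, w_m, w_r = weights
    weighted = w_l * l + w_m * m + w_r * r
    interference = blend * soft_xor(l, r) + (1.0 - blend) * soft_and(l, r)
    return x * sigmoid(weighted * interference)


if __name__ == "__main__":
    # Truth-table corners, then one intermediate point.
    for l, r in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(l, r, soft_xor(l, r), soft_and(l, r))
    print(lens_gate(1.0, l=0.3, m=0.5, r=0.8))
```

At the corners of the unit square the two terms reproduce the boolean truth tables, and between the corners they vary smoothly, which is what lets them sit inside a gradient-trained gate.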

	### Thirds Mask

	Rotating channel suppression inspired by Cantor set construction:

	```
	Layer 0: suppress channels [0:C/3]
	Layer 1: suppress channels [C/3:2C/3]
	Layer 2: suppress channels [2C/3:C]
	Layer 3: back to [0:C/3]
	```

	The rotation is intended to force redundancy and discourage co-adaptation across channel groups, since no group can rely on being active at every layer.
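
The rotation schedule above can be written as a small helper. This is a sketch under assumptions: it uses a hard 0/1 mask (the README does not say whether suppression is hard or soft), lets the last third absorb the remainder when the channel count is not divisible by three, and the name `thirds_mask` is hypothetical.

```python
def thirds_mask(num_channels, layer_idx):
    """Return a per-channel 0/1 mask with one third of the channels suppressed.

    Assumptions: suppression is a hard zero, and when num_channels is not
    divisible by 3 the last third absorbs the remainder.
    """
    third = num_channels // 3
    slot = layer_idx % 3  # which third is suppressed at this depth
    start = slot * third
    end = num_channels if slot == 2 else start + third
    return [0.0 if start <= c < end else 1.0 for c in range(num_channels)]


if __name__ == "__main__":
    for layer_idx in range(4):
        print(layer_idx, thirds_mask(6, layer_idx))
```

Multiplying a layer's activations channel-wise by this mask reproduces the listed schedule, with layer 3 reusing the layer 0 mask.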

	### Continuous Topology

	Each layer samples a continuous manifold:

	```python
	import math

	t = layer_idx / (total_layers - 1)  # 0 → 1 across depth (requires total_layers > 1)

	twist_in_angle = t * math.pi
	twist_out_angle = -t * math.pi
	scales = scale_range[0] + t * scale_span

	Adding layers therefore samples the same underlying structure more finely instead of stacking new, unrelated units.
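
To make the sampling claim concrete, the following sketch computes `t` and the twist angles for every layer. The function name `layer_schedule`, the default `scale_range`, and the definition of `scale_span` as the width of that range are assumptions for illustration, not values taken from the repository.

```python
import math


def layer_schedule(total_layers, scale_range=(0.5, 2.0)):
    """Per-layer t, twist angles, and scale for a network of the given depth.

    scale_range is a hypothetical default; scale_span is assumed to be its width.
    """
    scale_span = scale_range[1] - scale_range[0]
    schedule = []
    for layer_idx in range(total_layers):
        # Guard the single-layer case, where the formula would divide by zero.
        t = layer_idx / (total_layers - 1) if total_layers > 1 else 0.0
        schedule.append(
            {
                "t": t,
                "twist_in_angle": t * math.pi,
                "twist_out_angle": -t * math.pi,
                "scale": scale_range[0] + t * scale_span,
            }
        )
    return schedule


if __name__ == "__main__":
    coarse = [layer["t"] for layer in layer_schedule(3)]
    fine = [layer["t"] for layer in layer_schedule(5)]
    print(coarse)  # [0.0, 0.5, 1.0]
    print(fine)    # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In this example every `t` of the 3-layer schedule also appears in the 5-layer one, so the deeper network visits the same points on the manifold plus the ones in between.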

	## Checkpoints

	Each run is saved to `{output_dir}/checkpoints/{variant}_{dataset}/{timestamp}/`:

	```
	├── config.json
	├── best_accuracy.json
	├── final_accuracy.json
	├── checkpoints/
	│   ├── checkpoint_epoch_0010.pt
	│   ├── checkpoint_epoch_0010.safetensors
	│   ├── best_model.pt
	│   ├── best_model.safetensors
	│   ├── final_model.pt
	│   └── final_model.safetensors
	└── tensorboard/
	```
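
For scripts that need to locate a run, the layout above can be reproduced with `pathlib`. This sketch assumes the timestamp format `%Y%m%d_%H%M%S` (inferred from the example path `20240101_120000`) and four-digit zero-padded epoch numbers (inferred from `checkpoint_epoch_0010`). `run_dir` and `checkpoint_stem` are hypothetical helper names, not part of the trainer.

```python
from datetime import datetime
from pathlib import PurePosixPath


def run_dir(output_dir, variant, dataset, when):
    """Build {output_dir}/checkpoints/{variant}_{dataset}/{timestamp}."""
    timestamp = when.strftime("%Y%m%d_%H%M%S")  # assumed timestamp format
    return PurePosixPath(output_dir) / "checkpoints" / f"{variant}_{dataset}" / timestamp


def checkpoint_stem(epoch):
    """File stem for a periodic checkpoint, e.g. checkpoint_epoch_0010."""
    return f"checkpoint_epoch_{epoch:04d}"


if __name__ == "__main__":
    root = run_dir("./outputs", "mobius_base", "tiny_imagenet", datetime(2024, 1, 1, 12, 0, 0))
    # Path of the epoch-10 safetensors file inside the run directory.
    print(root / "checkpoints" / (checkpoint_stem(10) + ".safetensors"))
```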

	## TensorBoard

	Monitor training:

	```bash
	tensorboard --logdir ./outputs/checkpoints
	```

	Tracks:
	- Loss, train/val accuracy
	- Per-layer lens parameters (omega, alpha, twist angles, L/M/R weights)
	- Residual weights
	- Weight histograms

	## Data Setup

	### Tiny ImageNet

	```bash
	wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
	unzip tiny-imagenet-200.zip -d ./data/
	```
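
After extraction it can help to confirm the layout before starting a long run. The check below is a sketch that assumes the standard archive structure, with one subdirectory per class under `train/`. `count_train_classes` is a hypothetical helper, not part of the trainer.

```python
from pathlib import Path


def count_train_classes(data_dir):
    """Count class subdirectories under <data_dir>/train (0 if it is missing)."""
    train_dir = Path(data_dir) / "train"
    if not train_dir.is_dir():
        return 0
    return sum(1 for entry in train_dir.iterdir() if entry.is_dir())


if __name__ == "__main__":
    n = count_train_classes("./data/tiny-imagenet-200")
    print(f"found {n} training classes (Tiny ImageNet should have 200)")
```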

	## License

	Apache 2.0

	## Citation

	```bibtex
	@misc{mobiusnet2026,
	  title={MobiusNet: Wave-Based Topological Vision Architecture},
	  author={AbstractPhil},
	  year={2026},
	  url={https://huggingface.co/AbstractPhil/mobiusnet}
	}
	```