| --- |
| license: apache-2.0 |
| --- |
| # MobiusNet |
|
|
| A vision architecture built on continuous topological principles, replacing traditional activations with wave-based interference gating. |
|
|
| ## Overview |
|
|
| MobiusNet combines the following components: |
|
|
| - **MobiusLens**: Wave superposition as a gating mechanism, replacing standard activations (ReLU, GELU) |
| - **Thirds Mask**: Cantor-inspired fractal channel suppression for regularization |
| - **Continuous Topology**: Layers sample a continuous manifold via the `t` parameter, not discrete units |
| - **Twist Rotations**: Smooth rotation through representation space across network depth |
| - **Integrator**: An optional, experimental final block (Conv → BN → GELU, enabled with `use_integrator`) that adds a conventional GELU nonlinearity before pooling |
|
|
| ## Performance |
|
|
| | Model | Params | GFLOPs | Tiny ImageNet accuracy | |
| |-------|--------|--------|------------------------| |
| | MobiusNet-Base | 33.7M | 2.69 | TBD | |
|
|
| ## Installation |
|
|
| ```bash |
| pip install torch torchvision safetensors huggingface_hub tensorboard tqdm |
| ``` |
|
|
| ## Quick Start |
|
|
| ### Training |
|
|
| ```python |
| from mobius_trainer_full import train_tiny_imagenet |
| |
| model, best_acc = train_tiny_imagenet( |
|     preset='mobius_base', |
|     epochs=200, |
|     lr=3e-4, |
|     batch_size=128, |
|     use_integrator=True, |
|     data_dir='./data/tiny-imagenet-200', |
|     output_dir='./outputs', |
|     hf_repo='AbstractPhil/mobiusnet', |
|     save_every_n_epochs=10, |
|     upload_every_n_epochs=10, |
| ) |
| ``` |
|
|
| ### Continue from Checkpoint |
|
|
| ```python |
| # From a local run directory |
| model, best_acc = train_tiny_imagenet( |
|     preset='mobius_base', |
|     epochs=200, |
|     continue_from="./outputs/checkpoints/mobius_base_tiny_imagenet/20240101_120000", |
| ) |
| |
| # From Hugging Face (auto-downloads) |
| model, best_acc = train_tiny_imagenet( |
|     preset='mobius_base', |
|     epochs=200, |
|     continue_from="checkpoints/mobius_base_tiny_imagenet/20240101_120000", |
| ) |
| ``` |
|
|
| ### Inference |
|
|
| ```python |
| import torch |
| from safetensors.torch import load_file |
| from mobius_trainer_full import MobiusNet, PRESETS |
| |
| # Build the model from its preset and load the trained weights |
| config = PRESETS['mobius_base'] |
| model = MobiusNet(num_classes=200, use_integrator=True, **config) |
| state_dict = load_file("best_model.safetensors") |
| model.load_state_dict(state_dict) |
| model.eval() |
| |
| # Inference: image_tensor is a preprocessed float batch of shape (N, 3, H, W) |
| with torch.no_grad(): |
|     logits = model(image_tensor) |
|     pred = logits.argmax(dim=1)  # predicted class index per image |
| ``` |
|
|
| ## Model Presets |
|
|
| | Preset | Channels | Depths | ~Params | |
| |--------|----------|--------|---------| |
| | `mobius_tiny_s` | (64, 128, 256) | (2, 2, 2) | 500K | |
| | `mobius_tiny_m` | (64, 128, 256, 512, 768) | (2, 2, 4, 2, 2) | 11M | |
| | `mobius_tiny_l` | (96, 192, 384, 768) | (3, 3, 3, 3) | 8M | |
| | `mobius_base` | (128, 256, 512, 768, 1024) | (2, 2, 2, 2, 2) | 33.7M | |
|
|
| ## Architecture |
|
|
| ``` |
| Input |
|   │ |
|   ▼ |
| ┌─────────────────────────────────┐ |
| │  Stem (Conv → BN)               │ |
| └─────────────────────────────────┘ |
|   │ |
|   ▼ |
| ┌─────────────────────────────────┐ |
| │  Stage 1-N                      │ |
| │  ┌───────────────────────────┐  │ |
| │  │ MobiusConvBlock (×depth)  │  │ |
| │  │  ├─ Depthwise-Sep Conv    │  │ |
| │  │  ├─ BatchNorm             │  │ |
| │  │  ├─ MobiusLens (wave gate)│  │ |
| │  │  ├─ Thirds Mask           │  │ |
| │  │  └─ Learned Residual      │  │ |
| │  └───────────────────────────┘  │ |
| │  Downsample (stride-2 conv)     │ |
| └─────────────────────────────────┘ |
|   │ |
|   ▼ |
| ┌─────────────────────────────────┐ |
| │  Integrator (Conv → BN → GELU)  │ ← Task collapse |
| └─────────────────────────────────┘ |
|   │ |
|   ▼ |
| ┌─────────────────────────────────┐ |
| │  Pool → Linear → Classes        │ |
| └─────────────────────────────────┘ |
| ``` |
|
|
| ## Core Components |
|
|
| ### MobiusLens |
|
|
| Wave-based gating mechanism with three interference paths: |
|
|
| ```python |
| L = wave(phase_l, drift_l) # Left path (+1 drift) |
| M = wave(phase_m, drift_m) # Middle path (0 drift, ghost) |
| R = wave(phase_r, drift_r) # Right path (-1 drift) |
| |
| # Interference |
| xor_comp = abs(L + R - 2*L*R) # Differentiable XOR |
| and_comp = L * R # Differentiable AND |
| |
| # Gating |
| gate = weighted_sum(L, M, R) * interference_blend |
| output = input * sigmoid(layernorm(gate)) |
| ``` |
|
|
| The middle path (M) acts as a "ghost": present but diminished, it maintains gradient continuity while biasing information flow toward the L/R edges (a Cantor-like structure). |
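The interference terms can be checked in isolation. The following is a minimal sketch in plain Python, not the MobiusLens implementation: the function names `soft_xor` and `soft_and` are hypothetical, and scalar inputs stand in for the wave tensors `L` and `R`. It shows that the two expressions reduce to Boolean XOR and AND when the inputs are exactly 0 or 1.

```python
def soft_xor(left: float, right: float) -> float:
    """Soft XOR, abs(L + R - 2*L*R): exact XOR on {0, 1}, smooth in between."""
    return abs(left + right - 2.0 * left * right)


def soft_and(left: float, right: float) -> float:
    """Soft AND, L * R: exact AND on {0, 1}."""
    return left * right


# Truth table at the binary corners, as (L, R, xor, and).
corners = [
    (l, r, soft_xor(l, r), soft_and(l, r))
    for l in (0.0, 1.0)
    for r in (0.0, 1.0)
]
for row in corners:
    print(row)

# Between the corners both terms vary smoothly.
print(soft_xor(0.5, 0.5), soft_and(0.5, 0.5))  # 0.5 0.25
```

For inputs inside [0, 1] the expression `L + R - 2*L*R` is already non-negative, so the absolute value only matters if the wave outputs can leave that range.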
|
|
| ### Thirds Mask |
|
|
| Rotating channel suppression inspired by Cantor set construction: |
|
|
| ``` |
| Layer 0: suppress channels [0:C/3] |
| Layer 1: suppress channels [C/3:2C/3] |
| Layer 2: suppress channels [2C/3:C] |
| Layer 3: back to [0:C/3] |
| ``` |
|
|
| This forces redundancy across channel groups and discourages co-adaptation between them. |
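The rotation schedule above can be sketched as a small helper. This is an illustration under stated assumptions, not the repository's code: `suppressed_range` and `thirds_mask` are hypothetical names, integer division is assumed for channel counts not divisible by 3, and the suppressed channels are scaled by a `suppress` factor (0.0 here) because the source does not say whether suppression is hard or soft.

```python
from typing import List, Tuple


def suppressed_range(layer_idx: int, num_channels: int) -> Tuple[int, int]:
    """Return the [start, stop) channel range suppressed at this layer."""
    third = layer_idx % 3  # rotate through the three thirds
    start = third * num_channels // 3
    stop = (third + 1) * num_channels // 3
    return start, stop


def thirds_mask(layer_idx: int, num_channels: int, suppress: float = 0.0) -> List[float]:
    """Per-channel multiplier: `suppress` inside the active third, 1.0 elsewhere."""
    start, stop = suppressed_range(layer_idx, num_channels)
    return [suppress if start <= c < stop else 1.0 for c in range(num_channels)]


for layer in range(4):
    print(layer, suppressed_range(layer, 96))
```

With 96 channels the suppressed range moves through `(0, 32)`, `(32, 64)`, `(64, 96)` and then returns to `(0, 32)` at layer 3.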
|
|
| ### Continuous Topology |
|
|
| Each layer samples a continuous manifold: |
|
|
| ```python |
| t = layer_idx / (total_layers - 1) # 0 → 1 |
| |
| twist_in_angle = t * π |
| twist_out_angle = -t * π |
| scales = scale_range[0] + t * scale_span |
| ``` |
|
|
| Adding layers therefore samples the same underlying structure more finely instead of adding new discrete units. |
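A short sketch makes the "finer sampling" point concrete. It assumes that `scale_span` is `scale_range[1] - scale_range[0]`, uses a hypothetical default `scale_range` of `(0.5, 2.0)`, and treats the scale as a single number. The helper name `layer_params` and the single-layer guard are also illustrative and not taken from the source.

```python
import math
from typing import Dict, Tuple


def layer_params(
    layer_idx: int,
    total_layers: int,
    scale_range: Tuple[float, float] = (0.5, 2.0),  # hypothetical default
) -> Dict[str, float]:
    """Sample the per-layer parameters at position t in [0, 1]."""
    # Guard the single-layer case, where (total_layers - 1) would be zero.
    t = layer_idx / (total_layers - 1) if total_layers > 1 else 0.0
    scale_span = scale_range[1] - scale_range[0]  # assumed definition
    return {
        "t": t,
        "twist_in_angle": t * math.pi,
        "twist_out_angle": -t * math.pi,
        "scale": scale_range[0] + t * scale_span,
    }


coarse = [layer_params(i, 3)["t"] for i in range(3)]
fine = [layer_params(i, 5)["t"] for i in range(5)]
print(coarse)  # [0.0, 0.5, 1.0]
print(fine)    # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Going from 3 to 5 layers keeps the endpoints and midpoint of `t` and inserts new sample points between them, which is the sense in which depth refines the sampling instead of changing the structure.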
|
|
| ## Checkpoints |
|
|
| Saved to: `checkpoints/{variant}_{dataset}/{timestamp}/` |
|
|
| ``` |
| ├── config.json |
| ├── best_accuracy.json |
| ├── final_accuracy.json |
| ├── checkpoints/ |
| │   ├── checkpoint_epoch_0010.pt |
| │   ├── checkpoint_epoch_0010.safetensors |
| │   ├── best_model.pt |
| │   ├── best_model.safetensors |
| │   ├── final_model.pt |
| │   └── final_model.safetensors |
| └── tensorboard/ |
| ``` |
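When a run directory holds several periodic checkpoints, picking the most recent one by file name can be handy. The helper below is a hypothetical convenience, not part of `mobius_trainer_full`. It assumes only the `checkpoint_epoch_NNNN.safetensors` naming shown in the layout above.

```python
import re
from typing import Iterable, Optional

# Matches names such as "checkpoint_epoch_0010.safetensors".
EPOCH_PATTERN = re.compile(r"^checkpoint_epoch_(\d+)\.safetensors$")


def latest_epoch_checkpoint(filenames: Iterable[str]) -> Optional[str]:
    """Return the epoch checkpoint with the highest epoch number, or None."""
    best_epoch = -1
    best_name = None
    for name in filenames:
        match = EPOCH_PATTERN.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best_name = name
    return best_name


example_files = [
    "checkpoint_epoch_0010.pt",
    "checkpoint_epoch_0010.safetensors",
    "checkpoint_epoch_0020.safetensors",
    "best_model.safetensors",
]
print(latest_epoch_checkpoint(example_files))  # checkpoint_epoch_0020.safetensors
```

In practice the file names would come from listing the run's `checkpoints/` directory, for example with `os.listdir`.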
|
|
| ## TensorBoard |
|
|
| Monitor training: |
|
|
| ```bash |
| tensorboard --logdir ./outputs/checkpoints |
| ``` |
|
|
| Tracks: |
| - Loss, train/val accuracy |
| - Per-layer lens parameters (omega, alpha, twist angles, L/M/R weights) |
| - Residual weights |
| - Weight histograms |
|
|
| ## Data Setup |
|
|
| ### Tiny ImageNet |
|
|
| ```bash |
| wget http://cs231n.stanford.edu/tiny-imagenet-200.zip |
| unzip tiny-imagenet-200.zip -d ./data/ |
| ``` |
|
|
| ## License |
|
|
| Apache 2.0 |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{mobiusnet2026, |
| title={MobiusNet: Wave-Based Topological Vision Architecture}, |
| author={AbstractPhil}, |
| year={2026}, |
| url={https://huggingface.co/AbstractPhil/mobiusnet} |
| } |
| ``` |