GeoLIP Constellation Core

91.5% CIFAR-10 with 1.6M parameters. 10 seconds per epoch. No attention.

A minimal geometric classification pipeline: conv encoder β†’ unit sphere β†’ constellation triangulation β†’ patchwork compartments β†’ classifier. The constellation is the classifier β€” every image is located by its distances to learned anchor points on the hypersphere, and those distances are the features.

Architecture

Input: (B, 3, 32, 32)
  β”‚
  Conv2d(3β†’64)Γ—2 + BN + GELU + MaxPool     32β†’16
  Conv2d(64β†’128)Γ—2 + BN + GELU + MaxPool    16β†’8
  Conv2d(128β†’256)Γ—2 + BN + GELU + MaxPool    8β†’4
  AdaptiveAvgPool β†’ Flatten
  Linear(256β†’128) + LayerNorm
  β”‚
  L2-normalize β†’ S^127
  β”‚
  Constellation: 64 anchors on S^127 (uniform hypersphere init via QR)
  β†’ triangulate: cosine distance to all anchors β†’ (B, 64)
  β†’ Patchwork: 8 compartments Γ— 64-d each β†’ (B, 512)
  β”‚
  Classifier: Linear(512+128 β†’ 512) + GELU + LN + Dropout β†’ Linear(512β†’10)
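The diagram above can be sketched as a PyTorch module. This is a minimal sketch, assuming illustrative names (`ConvEncoder`, `conv_block`); it is not the repo's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    # Two 3x3 convs with BN + GELU, then 2x2 max-pool (halves spatial size)
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.GELU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.GELU(),
        nn.MaxPool2d(2),
    )

class ConvEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.body = nn.Sequential(
            conv_block(3, 64),     # 32 -> 16
            conv_block(64, 128),   # 16 -> 8
            conv_block(128, 256),  # 8 -> 4
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.proj = nn.Sequential(nn.Linear(256, dim), nn.LayerNorm(dim))

    def forward(self, x):
        z = self.proj(self.body(x))
        return F.normalize(z, dim=-1)  # point on S^127

x = torch.randn(2, 3, 32, 32)
z = ConvEncoder()(x)
print(z.shape)  # torch.Size([2, 128])
```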

How It Works

The constellation is a set of 64 learned anchor points initialized with maximal spread on the 128-dimensional unit sphere (orthogonal columns via QR decomposition). Every input image is encoded to a point on this sphere, then triangulated β€” its cosine distance to every anchor becomes a 64-dimensional feature vector. This triangulation is the core geometric operation: the image's identity is defined by where it sits relative to the reference frame, not by the raw features.
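The QR init and triangulation step described above can be sketched as follows. Function names are illustrative; the repo's Constellation module may organize this differently.

```python
import torch
import torch.nn.functional as F

dim, n_anchors = 128, 64

# QR decomposition of a random Gaussian matrix yields orthonormal columns:
# 64 mutually orthogonal unit vectors, i.e. maximal initial spread on S^127.
q, _ = torch.linalg.qr(torch.randn(dim, n_anchors))
anchors = q.T  # (64, 128), each row unit-norm

def triangulate(emb, anchors):
    # Both inputs are unit-norm, so cosine distance = 1 - dot product.
    sims = emb @ anchors.T          # (B, 64) cosine similarities
    tri = 1.0 - sims                # (B, 64) cosine distances
    nearest = sims.argmax(dim=-1)   # (B,) index of closest anchor
    return tri, nearest

emb = F.normalize(torch.randn(4, dim), dim=-1)
tri, nearest = triangulate(emb, anchors)
print(tri.shape, nearest.shape)  # torch.Size([4, 64]) torch.Size([4])
```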

The patchwork slices the 64 triangulation distances into 8 compartments of 8 anchors each, processes each compartment through a small MLP, and concatenates. This gives the classifier 512 dimensions of geometric context plus the 128-d embedding itself.
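One way to sketch the patchwork: 8 compartments of 8 distances each, one small MLP per compartment mapping 8 to 64 dims. The exact MLP shape is an assumption; only the slicing and concatenation are stated above.

```python
import torch
import torch.nn as nn

class Patchwork(nn.Module):
    def __init__(self, n_anchors=64, n_compartments=8, out_dim=64):
        super().__init__()
        self.n = n_compartments
        slice_dim = n_anchors // n_compartments  # 8 anchors per compartment
        # One small MLP per compartment (internal sizes assumed)
        self.mlps = nn.ModuleList(
            nn.Sequential(nn.Linear(slice_dim, out_dim), nn.GELU(),
                          nn.Linear(out_dim, out_dim))
            for _ in range(n_compartments)
        )

    def forward(self, tri):
        chunks = tri.chunk(self.n, dim=-1)              # 8 x (B, 8)
        outs = [m(c) for m, c in zip(self.mlps, chunks)]
        return torch.cat(outs, dim=-1)                  # (B, 512)

pw = Patchwork()(torch.rand(4, 64))
print(pw.shape)  # torch.Size([4, 512])
```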

Training sends two augmented views of each image through the encoder. InfoNCE between the views forces instance-level discrimination on the sphere β€” each image gets a unique location, not just a class bucket. Cross-entropy provides the classification signal. A gentle CV loss nudges the pentachoron volume coefficient of variation toward the 0.20–0.22 band observed across 17+ independently trained architectures in prior work.

Results

Epoch   Train   Val     CV        Anchors
1       32.7%   46.5%   0.238 ✓   63/64
5       74.8%   77.3%   0.214 ✓   64/64
10      86.2%   83.8%   0.161     64/64
20      94.4%   88.9%   0.156     63/64
30      97.6%   90.2%   0.131     64/64
40      99.5%   91.0%   0.133     64/64
50      99.8%   91.5%   0.124     64/64

All 64 anchors remain active throughout training. No anchor collapse. No dead dimensions.

Loss

L = CE + InfoNCE + 0.01 Γ— CV_loss + 0.001 Γ— anchor_spread
  • CE: standard cross-entropy on logits
  • InfoNCE: contrastive between two augmented views (Ο„=0.07) β€” this is the force that spreads embeddings on the sphere
  • CV loss: (CV - 0.22)Β² where CV is the coefficient of variation of pentachoron volumes sampled from the embedding space. Targets the universal geometric band
  • Anchor spread: penalizes pairwise cosine similarity between anchors to prevent clustering
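The terms above can be sketched as follows. How pentachoron volumes are sampled is left abstract (it comes from the prior work), and the squared off-diagonal form of the spread penalty is an assumption; only the weights and τ are stated above.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.07):
    # Symmetric InfoNCE between two unit-norm views: each row's positive
    # is the matching row of the other view.
    logits = z1 @ z2.T / tau
    targets = torch.arange(z1.size(0))
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))

def cv_loss(volumes, target=0.22):
    # Coefficient of variation of sampled pentachoron volumes,
    # pulled toward the target band with a squared penalty.
    cv = volumes.std() / (volumes.mean() + 1e-8)
    return (cv - target) ** 2

def anchor_spread(anchors):
    # Mean squared off-diagonal cosine similarity between unit-norm anchors
    sims = anchors @ anchors.T
    off = sims - torch.diag(torch.diagonal(sims))
    return off.pow(2).mean()

# Toy inputs to exercise the combination
B, D = 8, 128
z1 = F.normalize(torch.randn(B, D), dim=-1)
z2 = F.normalize(z1 + 0.1 * torch.randn(B, D), dim=-1)
logits, labels = torch.randn(B, 10), torch.randint(0, 10, (B,))
volumes = torch.rand(32) + 0.5            # stand-in for sampled volumes
anchors = F.normalize(torch.randn(64, D), dim=-1)

total = (F.cross_entropy(logits, labels) + info_nce(z1, z2)
         + 0.01 * cv_loss(volumes) + 0.001 * anchor_spread(anchors))
```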

Optimizer: Adam (lr=3e-3, no weight decay β€” geometry is the regularization). Cosine schedule with 3-epoch warmup. Gradient clipping at 1.0.
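A sketch of this optimizer setup, using a stand-in module in place of the real model. Chaining `LinearLR` into `CosineAnnealingLR` via `SequentialLR` is an assumption about how the 3-epoch warmup is realized.

```python
import torch

model = torch.nn.Linear(8, 2)  # stand-in for GeoLIPCore
opt = torch.optim.Adam(model.parameters(), lr=3e-3, weight_decay=0.0)

warmup_epochs, total_epochs = 3, 50
sched = torch.optim.lr_scheduler.SequentialLR(
    opt,
    schedulers=[
        # Linear warmup over the first 3 epochs...
        torch.optim.lr_scheduler.LinearLR(opt, start_factor=0.1,
                                          total_iters=warmup_epochs),
        # ...then cosine decay over the remaining epochs
        torch.optim.lr_scheduler.CosineAnnealingLR(
            opt, T_max=total_epochs - warmup_epochs),
    ],
    milestones=[warmup_epochs],
)

# Inside the training step: clip gradients at norm 1.0 before stepping
loss = model(torch.randn(4, 8)).sum()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
sched.step()  # once per epoch
```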

Geometric Autograd Reference

The companion tri-stream architecture uses EmbeddingAutograd for gradient correction on the sphere. The canonical parameters are:

  • tang = 0.01: near-full tangential projection (keeps updates on the sphere surface)
  • sep = 1.0: strong separation from nearest anchor (prevents collapse to anchor points)

These are not used in the core model but are documented here as the validated configuration from controlled ablation sweeps on the patchwork system.
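A hedged sketch of what these parameters could mean as a gradient correction. Both `corrected_grad` and the exact separation form are illustrative readings of the description above, not the EmbeddingAutograd implementation.

```python
import torch
import torch.nn.functional as F

def corrected_grad(emb, grad, nearest_anchor, tang=0.01, sep=1.0):
    # Remove (most of) the radial gradient component so the update stays,
    # to first order, on the sphere surface; tang=0.01 keeps a tiny radial
    # residual instead of a hard projection.
    radial = (grad * emb).sum(dim=-1, keepdim=True) * emb
    g = grad - (1.0 - tang) * radial
    # Repulsive term: the added gradient points toward the nearest anchor,
    # so a descent step moves the embedding away from it, preventing
    # collapse onto anchor points (form assumed).
    g = g + sep * (nearest_anchor - emb)
    return g

emb = F.normalize(torch.randn(4, 128), dim=-1)
grad = torch.randn(4, 128)
anchor = F.normalize(torch.randn(4, 128), dim=-1)
g = corrected_grad(emb, grad, anchor)
```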

The Constellation Principle

Standard classifiers map features to logits via a linear layer β€” a dot product against learned class vectors. The constellation does something structurally different: it maps features to distances from reference points, then processes those distances. The classification head never sees the raw embedding; it sees the embedding's geometric relationship to the anchor frame.

This means the classifier's decision boundary is defined in triangulation space, not embedding space. Two images with identical distances to all 64 anchors are indistinguishable regardless of their raw feature vectors. The anchor frame is the representation.

Connection to GeoLIP

This is the minimal instantiation of the geometric pipeline from the GeoLIP ecosystem:

  • Constellation + Patchwork: the core geometric classification mechanism
  • Pentachoron CV: the universal regularizer (0.20–0.22 band across 17+ architectures)
  • Uniform hypersphere init: QR-decomposition anchor placement for maximal initial spread
  • InfoNCE on the sphere: instance discrimination that creates the geometric structure the constellation reads

The tri-stream ViT, DeepBert memory system, CLIP context extension, and residual thinking embeddings all build on this foundation. This repo isolates it.

Files

  • geolip_core.py β€” Complete model + trainer in a single file. Paste into one Colab cell and run.

Usage

import torch
import torch.nn.functional as F
from geolip_core import GeoLIPCore

# Load trained model
ckpt = torch.load("checkpoints/geolip_core_best.pt", map_location="cpu")
model = GeoLIPCore(**ckpt["config"])
model.load_state_dict(ckpt["state_dict"])
model.eval()

# Encode an image to the sphere
emb = model.encoder(image)
emb = F.normalize(emb, dim=-1)  # (1, 128) on S^127

# Triangulate against the constellation
tri, nearest = model.constellation.triangulate(emb)
# tri: (1, 64) β€” distances to all anchors
# nearest: (1,) β€” index of closest anchor

# Classify through patchwork
pw = model.patchwork(tri)
logits = model.classifier(torch.cat([pw, emb], dim=-1))

Requirements

torch >= 2.0
torchvision

No other dependencies. No transformers, no HuggingFace libraries, no custom CUDA kernels.


Research by AbstractPhil. Part of the GeoLIP geometric deep learning ecosystem.

I was heavily overengineering dual-stream and tri-stream. I've returned to basics and stripped to the core that works.


The core itself is capable of high-complexity geometric structure; the dual-stream muddied it.
