arxiv:2506.14951

Flat Channels to Infinity in Neural Loss Landscapes

Published on May 8

Abstract

Neural network loss landscapes contain special channels where loss decreases slowly while neuron weights diverge, forming gated linear units at convergence and appearing as quasi-flat regions during optimization.

The loss landscapes of neural networks contain minima and saddle points that may be connected in flat regions or appear in isolation. We identify and characterize a special structure in the loss landscape: channels along which the loss decreases extremely slowly, while the output weights of at least two neurons, a_i and a_j, diverge to ±∞, and their input weight vectors, w_i and w_j, become equal to each other. At convergence, the two neurons implement a gated linear unit: a_i σ(w_i · x) + a_j σ(w_j · x) → σ(w · x) + (v · x) σ'(w · x). Geometrically, these channels to infinity are asymptotically parallel to symmetry-induced lines of critical points. Gradient flow solvers, and related optimization methods like SGD or ADAM, reach the channels with high probability in diverse regression settings, but without careful inspection they look like flat local minima with finite parameter values. Our characterization provides a comprehensive picture of these quasi-flat regions in terms of gradient dynamics, geometry, and functional interpretation. The emergence of gated linear units at the end of the channels highlights a surprising aspect of the computational capabilities of fully connected layers.
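
The limit stated in the abstract can be checked numerically. The sketch below is an illustration only, not the authors' code: it assumes σ = tanh and one hypothetical path into the limit, a_i = 1/ε, a_j = 1 − 1/ε, w_i = w + εv, w_j = w, chosen so that the output weights diverge with opposite signs while the input weights merge as ε → 0. The function names, the grid of inputs, and the values of w and v are all assumptions made for the example.

```python
import numpy as np

def sigma(z):
    # Assumed activation for this illustration: tanh.
    return np.tanh(z)

def sigma_prime(z):
    # Derivative of tanh.
    return 1.0 - np.tanh(z) ** 2

def two_neuron_output(x, w, v, eps):
    # Hypothetical parametrization of the approach to the limit:
    # output weights diverge as eps -> 0, input weights merge.
    a_i = 1.0 / eps
    a_j = 1.0 - 1.0 / eps
    w_i = w + eps * v
    w_j = w
    return a_i * sigma(x @ w_i) + a_j * sigma(x @ w_j)

def gated_linear_unit(x, w, v):
    # Limiting function from the abstract: sigma(w.x) + (v.x) sigma'(w.x).
    z = x @ w
    return sigma(z) + (x @ v) * sigma_prime(z)

def max_gap(eps):
    # Largest absolute difference between the two-neuron sum and the
    # gated linear unit over a fixed grid of 2-D inputs in [-2, 2]^2.
    grid = np.linspace(-2.0, 2.0, 21)
    xx, yy = np.meshgrid(grid, grid)
    x = np.stack([xx.ravel(), yy.ravel()], axis=1)
    w = np.array([1.0, -0.5])
    v = np.array([0.3, 0.8])
    diff = two_neuron_output(x, w, v, eps) - gated_linear_unit(x, w, v)
    return float(np.max(np.abs(diff)))

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(f"eps={eps:g}  a_i={1.0 / eps:g}  max gap={max_gap(eps):.3e}")
```

Under this parametrization the sum equals σ(w · x) plus a finite-difference quotient of σ along v, so by Taylor's theorem the gap to the gated linear unit shrinks in proportion to ε while a_i and a_j grow like 1/ε.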


