Monday July 7, 2025 16:20 - 18:20 CEST
Curl Descent

Hugo Ninou*1, Jonathan Kadmon2, N. Alex Cayco Gajic1

1Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Département d'Etudes Cognitives, Ecole normale supérieure, PSL University, Paris, France
2Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel

*Email: hugo.ninou@ens.fr


Introduction
Gradient-based algorithms are fundamental to artificial neural network training, yet it remains unclear whether the diverse synaptic plasticity rules observed in neural circuits approximate gradient descent. We investigate whether learning dynamics can include non-gradient "curl" components yet still optimize a loss function. Such curl terms can emerge naturally from sign-diverse mechanisms such as inhibitory-excitatory connectivity or mixed Hebbian/anti-Hebbian plasticity, and they yield dynamics that cannot be described as gradient descent on any single objective function [1]. Understanding these non-gradient aspects is key to developing more biologically plausible theories of learning.
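The idea of sign-diverse updates can be formalized in a simple way (a sketch using generic notation, not taken from the abstract):

```latex
% Per-synapse updates that follow the loss gradient up to a fixed sign:
\dot{W}_{ij} = -\, s_{ij}\, \frac{\partial \mathcal{L}}{\partial W_{ij}},
\qquad s_{ij} \in \{+1, -1\}.
% When the signs s_{ij} are not all equal, the vector field
% -S \odot \nabla_W \mathcal{L} generally has nonzero curl,
% so it is not the gradient of any single scalar potential.
```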


Methods
We analyzed two-layer linear feedforward neural networks with scalar output within a tractable student-teacher framework [2]. The networks' weights were initialized as i.i.d. random variables with variance scaling inversely with the number of presynaptic neurons. Non-gradient dynamics were introduced by incorporating anti-Hebbian-like plasticity, modeled as sign flips in the gradient descent learning update (Fig. 1). Using tools from random matrix theory, we examined the stability of the solution manifold as a function of the fraction of these neurons and the network's compression ratio. Simulations tested our theory in linear networks and were extended to non-linear tanh networks to investigate convergence speed.
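A minimal sketch of the training setup described above; the dimensions, learning rate, and anti-Hebbian fraction are illustrative assumptions, and the sign flips are applied here to hidden-layer updates as one instance of the rule:

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 10, 40              # input dim, hidden dim (expansive: N > D); illustrative sizes
lr, steps = 0.02, 3000     # illustrative learning rate and step count

w_teacher = rng.normal(size=D) / np.sqrt(D)   # fixed teacher with scalar output

# Student weights: i.i.d., variance scaling as 1 / (number of presynaptic neurons)
W1 = rng.normal(size=(N, D)) / np.sqrt(D)
w2 = rng.normal(size=N) / np.sqrt(N)

# Sign vector: -1 marks an anti-Hebbian-like hidden unit (10% here, illustrative)
frac_anti = 0.1
s = np.where(rng.random(N) < frac_anti, -1.0, 1.0)

X_test = rng.normal(size=(1000, D))
mse0 = np.mean((X_test @ W1.T @ w2 - X_test @ w_teacher) ** 2)  # initial loss

for _ in range(steps):
    x = rng.normal(size=D)
    err = w2 @ (W1 @ x) - w_teacher @ x       # scalar prediction error
    # Squared-loss gradient step, with per-unit sign flips on the hidden layer
    W1 -= lr * s[:, None] * np.outer(err * w2, x)
    w2 -= lr * err * (W1 @ x)

mse = np.mean((X_test @ W1.T @ w2 - X_test @ w_teacher) ** 2)   # final loss
```

With a small anti-Hebbian fraction and an expansive hidden layer (N > D), the loss still decreases, consistent with the stability regime described in the Results.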


Results
Our analysis reveals that expansive networks help preserve the stability of the original solution manifold, yielding learning dynamics qualitatively akin to gradient descent. Beyond a critical threshold, however, strong curl terms destabilize this manifold. When anti-Hebbian-like neurons in the hidden layer destabilize the solution manifold, the resulting chaotic learning dynamics impair performance. Conversely, destabilizing the solution manifold with anti-Hebbian-like neurons in the readout layer can paradoxically accelerate learning relative to gradient descent without harming performance. This speed-up was also observed in non-linear tanh networks.


Discussion
The findings demonstrate that sign diversity in biologically plausible plasticity rules naturally gives rise to non-gradient curl terms. These terms can significantly alter learning dynamics, leading to either instability or, counterintuitively, faster convergence by navigating the loss landscape differently than pure gradient descent. Our results identify specific network architectures that can support robust learning via such diverse rules. This provides an important counterpoint to purely gradient-based normative theories of learning in neural networks and suggests that biological irregularity in plasticity might be a feature, not a bug, for efficient learning [3].





This project was supported by ENS, INSERM, and the Agence Nationale de la Recherche.

  1. Lillicrap, T. P., Santoro, A., Marris, L., Akerman, C. J., & Hinton, G. (2020). Backpropagation and the brain. Nature Reviews Neuroscience, 21(6), 335-346. https://doi.org/10.1038/s41583-020-0277-3
  2. Saxe, A. M., McClelland, J. L., & Ganguli, S. (2014). Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv:1312.6120. https://doi.org/10.48550/arXiv.1312.6120
  3. Richards, B. A., & Kording, K. P. (2023). The study of plasticity has always been about gradients. The Journal of Physiology, JP282747. https://doi.org/10.1113/JP282747

Passi Perduti
