July 14, 2025
What a folding ruler can tell us about neural networks

Deep neural networks are at the heart of artificial intelligence, powering applications that range from pattern recognition to large language and reasoning models like ChatGPT. The principle: during a training phase, the parameters of the network's artificial neurons are optimized so that the network can carry out specific tasks, such as autonomously discovering objects or characteristic features in images.
How exactly this works, and why some neural networks are more powerful than others, isn't easy to understand. A rigorous mathematical description seems out of reach of current techniques. However, such an understanding is important if one wants to build artificial intelligence while minimizing resources.
A team of researchers led by Prof. Dr. Ivan Dokmanić at the Department of Mathematics and Computer Science of the University of Basel has now developed a surprisingly simple model that reproduces the main features of deep neural networks and allows their parameters to be optimized. The results are published in Physical Review Letters.
Division of labor in a neural network
Deep neural networks consist of several layers of neurons. When learning to classify objects in images, the network approaches the answer layer by layer. This gradual approach, during which two classes—for instance, "cat" and "dog"—are more and more clearly distinguished, is called data separation.
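To make the idea concrete, the toy sketch below (a hypothetical illustration, not the researchers' code) pushes two Gaussian "classes" through a small, randomly initialized network with a tanh nonlinearity and prints a simple separation score for each layer: the distance between the class means divided by the average spread within each class. With untrained random weights the score will not necessarily grow; in a well-trained network it increases from layer to layer, and how evenly that increase is spread across the layers is what the Basel model describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy classes ("cat" vs. "dog") as Gaussian point clouds in 20 dimensions.
dim, n = 20, 500
cats = rng.normal(loc=+0.5, size=(n, dim))
dogs = rng.normal(loc=-0.5, size=(n, dim))

def separation(a, b):
    """Distance between the class means, relative to the within-class spread."""
    between = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    within = 0.5 * (a.std(axis=0).mean() + b.std(axis=0).mean())
    return between / within

# A small deep network: 6 layers with random weights and a tanh nonlinearity.
n_layers = 6
weights = [rng.normal(scale=1.0 / np.sqrt(dim), size=(dim, dim))
           for _ in range(n_layers)]

print(f"input    separation = {separation(cats, dogs):.2f}")
for i, W in enumerate(weights, start=1):
    cats, dogs = np.tanh(cats @ W), np.tanh(dogs @ W)
    print(f"layer {i}  separation = {separation(cats, dogs):.2f}")
```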
"Usually each layer in a well-performing network contributes equally to the data separation, but sometimes most of the work is done by deeper or shallower layers," says Dokmanić.
This depends, among other things, on how the network is constructed: do the neurons simply multiply incoming data by a particular factor, which experts call "linear"? Or do they carry out more complex calculations, making the network "nonlinear"?
A further consideration: in most cases, the training phase of neural networks also contains an element of randomness or noise. For instance, in each training round a random subset of neurons can simply be ignored regardless of their input. Strangely, this noise can improve the performance of the network.
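In practice, this kind of noise is often injected through a technique called dropout, in which each training pass silences a randomly chosen fraction of a layer's neurons. The snippet below is a minimal, framework-free sketch of the idea (using the common "inverted dropout" convention; the exact noise model studied by the Basel group may differ).

```python
import numpy as np

rng = np.random.default_rng(1)

def dropout(activations, drop_prob=0.2, training=True):
    """Randomly silence a fraction of neurons during a training pass.

    Surviving activations are rescaled by 1 / (1 - drop_prob) so their
    expected value stays the same; at test time the layer is untouched.
    """
    if not training or drop_prob == 0.0:
        return activations
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

# Activations of one hidden layer for a mini-batch of four examples.
h = rng.normal(size=(4, 8))
print(dropout(h, drop_prob=0.5))                    # noisy training pass
print(dropout(h, drop_prob=0.5, training=False))    # deterministic test pass
```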
"The interplay between nonlinearity and noise results in very complex behavior which is challenging to understand and predict," says Dokmanić.
"Then again, we know that an equalized distribution of data separation between the layers increases the performance of networks."
To make progress, Dokmanić and his collaborators took inspiration from physical theories and developed macroscopic mechanical models of the learning process that can be understood intuitively.
Pulling and shaking the folding ruler
One such model is a folding ruler whose individual sections correspond to the layers of the neural network and that is pulled open at one end. In this case, the nonlinearity comes from the mechanical friction between the sections. Noise can be added by erratically shaking the end of the folding ruler while pulling.
The result of this simple experiment: if one pulls the ruler slowly and steadily, the first sections unfold while the rest remains largely closed.
"This corresponds to a neural network in which the data separation happens mainly in the shallow layers," explains Cheng Shi, a Ph.D. student in Dokmanić's group and first author of the study. Conversely, if one pulls fast while shaking it a little bit, the folding ruler ends up nicely and evenly unfolded. In a network, this would be a uniform data separation.
"We have simulated and mathematically analyzed similar models with blocks connected by springs, and the agreement between the results and those of 'real' networks is almost uncanny," says Shi.
The Basel researchers plan to apply their method to large language models soon. More generally, such mechanical models could be used in the future to improve the training of high-performance deep neural networks without the trial-and-error approach traditionally used to find good settings for quantities such as the amount of noise and the degree of nonlinearity.
More information: Cheng Shi et al, Spring-Block Theory of Feature Learning in Deep Neural Networks, Physical Review Letters (2025). DOI: 10.1103/ys4n-2tj3
Journal information: Physical Review Letters
Provided by University of Basel