Academic article

Error-based or target-based? A unified framework for learning in recurrent spiking networks.
Document Type
Article
Source
PLoS Computational Biology. 6/21/2022, Vol. 18 Issue 6, p1-18. 18p. 2 Diagrams, 3 Graphs.
Subject
*SUPERVISED learning
*BIOLOGICAL networks
*RECURRENT neural networks
*BIOLOGICAL systems
*LONG-term memory
Language
ISSN
1553-734X
Abstract
The field of recurrent neural networks is crowded with proposed learning rules and protocols. This work defines a generalized framework as a step towards unifying this fragmented landscape. In supervised learning, two opposite approaches stand out: error-based and target-based. This duality has given rise to a scientific debate on which learning framework is more likely to be implemented in biological networks of neurons. Moreover, the existence of spikes raises the question of whether information is coded by rates or by spike timing. To address these questions, we propose a learning model with two main parameters, the rank of the feedback learning matrix R and the tolerance to spike timing τ⋆. We demonstrate that a low (high) rank R accounts for an error-based (target-based) learning rule, while a high (low) tolerance to spike timing promotes rate-based (spike-based) coding. We show that in a store-and-recall task, high ranks allow for lower MSE values, while low ranks enable faster convergence. Our framework naturally lends itself to Behavioral Cloning and allows for efficiently solving relevant closed-loop tasks and for investigating which parameters (R, τ⋆) are optimal for a specific task. We found that a high R is essential for tasks that require retaining memory for a long time (Button and Food). On the other hand, a high R is not relevant for a motor task (the 2D Bipedal Walker); in this case, we find that precise spike-based coding enables optimal performance. Finally, we show that our theoretical formulation allows for defining protocols to estimate the rank of the feedback error in biological networks. We release a PyTorch implementation of our model supporting GPU parallelization.
Author summary: Learning in biological or artificial networks means changing the laws governing the network dynamics in order to behave better in a specific situation. However, there is no consensus on what rules regulate learning in biological systems. To address this question, we propose a novel theoretical formulation for learning with two main parameters, the number of learning constraints (R) and the tolerance to spike timing (τ⋆). We demonstrate that a low (high) rank R accounts for an error-based (target-based) learning rule, while a high (low) tolerance to spike timing τ⋆ promotes rate-based (spike-based) coding. Our approach naturally lends itself to Imitation Learning (and Behavioral Cloning in particular), and we apply it to solve relevant closed-loop tasks such as the button-and-food task and the 2D Bipedal Walker. The button-and-food task is a navigation task that requires retaining a long-term memory and benefits from a high R. On the other hand, the 2D Bipedal Walker is a motor task and benefits from a low τ⋆. Finally, we show that our theoretical formulation suggests protocols to deduce the structure of learning feedback in biological networks. [ABSTRACT FROM AUTHOR]
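
The two parameters named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' PyTorch implementation, and nothing in it is taken from the paper's equations: the helper names (`low_rank_feedback`, `exp_filter`, `relative_mismatch`), the random-factor construction of the feedback matrix, and the exponential filter standing in for the tolerance τ⋆ are all assumptions made for illustration. It shows only two generic facts: a matrix built from R-dimensional factors has rank R, and a longer filter time constant makes a slightly jittered spike train look closer to its target.

```python
import numpy as np


def low_rank_feedback(n_neurons, rank, rng):
    """Hypothetical feedback matrix of a chosen rank.

    Built as the product of two Gaussian factors (an assumption for
    illustration, not the construction used in the paper).
    """
    left = rng.standard_normal((n_neurons, rank))
    right = rng.standard_normal((rank, n_neurons))
    return left @ right


def exp_filter(spikes, tau):
    """Causal exponential low-pass filter of a binary spike train.

    The time step is 1 and the filter is normalized, so the output is
    a running estimate of the firing rate. Here tau plays the role of
    a tolerance to spike timing (assumed stand-in for tau-star).
    """
    decay = np.exp(-1.0 / tau)
    out = np.zeros(len(spikes))
    acc = 0.0
    for t, s in enumerate(spikes):
        acc = decay * acc + (1.0 - decay) * s
        out[t] = acc
    return out


def relative_mismatch(target, actual, tau):
    """Squared distance between filtered trains, relative to target energy."""
    ft = exp_filter(target, tau)
    fa = exp_filter(actual, tau)
    return float(np.sum((ft - fa) ** 2) / np.sum(ft ** 2))


def spike_train(length, times):
    """Binary spike train with ones at the given time indices."""
    train = np.zeros(length)
    train[list(times)] = 1.0
    return train


rng = np.random.default_rng(0)
feedback = low_rank_feedback(n_neurons=50, rank=3, rng=rng)

target = spike_train(100, [10, 30, 50, 70])
jittered = spike_train(100, [12, 32, 52, 72])  # same rate, shifted by 2 steps

strict = relative_mismatch(target, jittered, tau=1.0)     # low tolerance
tolerant = relative_mismatch(target, jittered, tau=20.0)  # high tolerance
```

In this toy setting the jittered train is penalized heavily when the time constant is short and only lightly when it is long, which is the intuition behind a low tolerance favoring spike-based coding and a high tolerance favoring rate-based coding.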