Monday July 7, 2025 16:20 - 18:20 CEST
P202 Implementation of an SNN-based LLM

Tomohiro Mitsuhashi*1, Rin Kuriyama1, Tadashi Yamazaki1

1Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo, Japan

*Email: m2431154@gl.cc.uec.ac.jp

Introduction

Large language models (LLMs) are indispensable in everyday life and business, yet their training and inference demand an enormous amount of electricity. A major contributor to this consumption is the extensive memory access in artificial neural network (ANN) models. A potential solution is to use neuromorphic hardware, which emulates the dynamics of spiking neural networks (SNNs) [1]. SpikeGPT has been proposed as an SNN-based LLM [2]. However, not all components of SpikeGPT are implemented with spiking neurons. In this study, we aimed to implement a fully spike-based LLM built on the existing SpikeGPT.

Methods
SpikeGPT consists of two blocks: Spiking RWKV and Spiking RFFN (Fig. 1A). Each block comprises a component that performs analog computation and another that converts the results into spike sequences via an SNN. We replaced the former component with an SNN using the method proposed by Stanojevic et al. [3], which relies on spike timing information (Time-to-First Spike, TFS) (Fig. 1B): an analog value is represented by the time at which a neuron emits its first spike. This yielded the SNN-based RWKV and SNN-based RFFN blocks (Fig. 1A). Moreover, nonlinear operations, including the exponential function, were approximated with multi-layer SNNs, enabling the entire processing to be implemented solely with SNNs.
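As a minimal illustration of the TFS idea, an analog value can be mapped to a latency and recovered from it. The linear mapping and function names below are illustrative assumptions for exposition, not the actual coding used in SpikeGPT or [3]:

```python
import numpy as np

def tfs_encode(x, t_max=1.0):
    """Hypothetical linear TFS coding: an analog value x in [0, 1]
    becomes a first-spike time; larger values spike earlier."""
    x = np.clip(x, 0.0, 1.0)
    return t_max * (1.0 - x)

def tfs_decode(t, t_max=1.0):
    """Recover the analog value from the first-spike time."""
    return 1.0 - t / t_max

# Round trip: each value is carried by exactly one spike time.
x = np.array([0.0, 0.25, 0.9])
t = tfs_encode(x)
x_hat = tfs_decode(t)
assert np.allclose(x, x_hat)
```

With continuous time the round trip is exact; in a simulated SNN the spike time is constrained to the simulator's time grid, which is where approximation error enters.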
Results
We completed the implementation of an SNN-based LLM in which both the RWKV and RFFN blocks are SNNs. Our SNN-based LLM should have generated the same sentences as the original SpikeGPT, but it produced completely broken sentences. A quantitative comparison between the analog computation values and their spike-timing approximations revealed discrepancies between them; that is, the SNN-based nonlinear processes did not work well. We then replaced the SNN-based nonlinear processes with the original analog versions and obtained readable sentences, although they still differed from the original output (Fig. 1C). Notably, we confirmed that each neuron emitted at most one spike during text generation (Fig. 1D).


Discussion
We implemented an SNN-based LLM that generates sentences. Nonetheless, our SNN-based nonlinear processes need to be improved for better approximation. One possible improvement is to use a much finer temporal resolution in the SNNs, yielding higher precision for the analog values represented by TFS. Meanwhile, since each neuron emits at most one spike per forward pass, combining our model with neuromorphic hardware could lead to significant energy savings. These advances are expected to address the challenges of energy-efficient LLMs.
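The effect of temporal resolution can be sketched with the same linear TFS coding as above (again an illustrative assumption, not the actual SpikeGPT implementation): snapping the spike time to a grid of step dt bounds the decoding error by dt/2, so a finer grid gives a better approximation.

```python
import numpy as np

def tfs_round_trip_error(x, dt, t_max=1.0):
    """Encode x as a spike time, snap it to a time grid of step dt
    (the simulator's temporal resolution), and decode it back."""
    t = t_max * (1.0 - x)            # hypothetical latency coding
    t_q = np.round(t / dt) * dt      # quantize to the time grid
    x_hat = 1.0 - t_q / t_max        # decode
    return np.abs(x - x_hat)

x = np.linspace(0.0, 1.0, 1001)
coarse = tfs_round_trip_error(x, dt=0.1).max()    # error up to dt/2 = 0.05
fine = tfs_round_trip_error(x, dt=0.001).max()
assert fine < coarse   # finer resolution -> smaller approximation error
```

This is only the coding error of a single value; errors introduced by the multi-layer SNN approximations of nonlinear functions would compound on top of it.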



Figure 1. Overview of our SNN-based LLM and sample results. (A) The architecture of SpikeGPT (left) and our model (right). (B) Schematic of the TFS approach, where the temporal difference between the time parameter and the spike time encodes an analog value. (C) A sample sentence generated by our model. (D) Raster plots for the SNN-based RWKV and SNN-based RFFN during token generation.
Acknowledgements
This study was supported by MEXT/JSPS KAKENHI Grant Numbers JP22H05161, JP22H00460.
References
1. Davies, M., et al. (2021). Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE, 109(5), 911–934. https://doi.org/10.1109/JPROC.2021.3067593


2. Zhu, R.-J., et al. (2024). SpikeGPT: Generative pre-trained language model with spiking neural networks. arXiv preprint. https://arxiv.org/abs/2302.13939


3. Stanojevic, A., et al. (2023). An exact mapping from ReLU networks to spiking neural networks. Neural Networks, 168, 74–88. https://doi.org/10.1016/j.neunet.2023.09.011
Passi Perduti
