P271 Accelerated cortical microcircuit simulations on massively distributed memory
Catherine M. Schoefmann*1,2, Jan Finkbeiner1,2, Susanne Kunkel1
1Neuromorphic Software Ecosystems (PGI-15), Juelich Research Centre, Juelich, Germany 2RWTH Aachen University, Aachen, Germany
*Email: c.schoefmann@fz-juelich.de

Introduction Comprehensive simulation studies of the dynamical regimes of cortical networks with realistic synaptic densities depend on compute systems capable of running such models significantly faster than biological real time. Since CPUs remain the primary target for established simulators, frequent memory access with minimal compute is an inherent bottleneck of the von Neumann design. Distributed memory architectures, popularized by the demand for massively parallel and scalable processing in AI workloads, offer an alternative.
Methods We introduce extensible simulation technology for spiking networks on massively distributed memory using Graphcore's IPUs (https://www.graphcore.ai). We demonstrate the efficiency of the new technology with simulations of the microcircuit model of [1], which is commonly used as a reference benchmark. The model represents 1 mm² of cortical tissue, spans around 300 million synapses, and is considered a building block of cortical function. The spike dynamics are statistically verified by comparison with the same simulations run on CPU with NEST [2].
Results We present a custom semi-directed communication algorithm particularly suited to distributed and memory-constrained environments, which allows a controlled trade-off between performance and memory usage. Our simulation code achieves an acceleration factor of 15x relative to biological real time for the full-scale cortical microcircuit model on the smallest device configuration that fits the model in memory. This is competitive with the current record performance on a static FPGA cluster [3], and further speedup can be achieved at the cost of lower-precision weights.
Discussion With negligible compilation times, the simulation code can be extended seamlessly to a wide range of synapse and neuron models, as well as structural plasticity, unlocking a new class of models for extensive parameter-space explorations in computational neuroscience. Furthermore, we believe that our algorithm for scalable and parallelisable communication can be applied efficiently to other platforms.

Acknowledgements The presented conceptual and algorithmic work is part of our long-term collaborative project to provide the technology for neural systems simulations (https://www.nest-initiative.org). Compute time on a Graphcore Bow Pod64 was granted by the Argonne Leadership Computing Facility (ALCF). This work is partly funded by the Volkswagen Foundation.

References
[1] https://doi.org/10.1093/cercor/bhs358
[2] https://doi.org/10.5281/ZENODO.12624784
[3] https://doi.org/10.3389/fncom.2023.1144143