P305 Developing a Digital Twin of the Drosophila Optic Lobe: A Large-Scale Autoencoder Trained on Natural Visual Inputs Using Complete Connectome Data
The optic lobe is the main visual system of Drosophila, involved in functions such as motion detection [2]. Recent advances in connectome projects have provided near-complete synaptic maps [1,3,8], enabling detailed circuit analyses. A recent study trained a connectome-based neural network to reproduce the motion-detection properties of T4 and T5 neurons, but it assumed vector teaching signals such as optical flow, which are absent from biological circuitry. In this study, we use the right optic lobe's connectivity from FlyWire [5,8] to build a large-scale autoencoder in which the visual input itself serves as the teaching signal [6]. In doing so, we aim to develop a digital twin of the Drosophila optic lobe under biologically plausible training conditions.
Methods We derived a synaptic adjacency matrix for the entire right optic lobe, yielding about 45,000 nodes and over 4.5 million edges [5]. Photoreceptors (R1–R6) served as both input and output of an autoencoder that preserves the feedforward and feedback connections [6]. We trained the network on natural video stimuli, adjusting synaptic weights to minimize the reconstruction error between the input and reconstructed signals. Each iteration also incorporated slight temporal offsets to assess predictive capacity. Neuronal activity was then analyzed by topological distance from the photoreceptors, allowing us to track signal propagation through deeper layers [2].

Results After training, the autoencoder accurately reconstructed photoreceptor inputs, achieving low mean squared error across varied visual contexts. Neurons beyond the superficial lamina layers showed moderate activity, implying that deeper circuits were engaged, though not intensely. Under prolonged stimulation, activation patterns stabilized, suggesting that recurrent loops dampen fluctuations. These results align with reports that feedback modulates photoreceptors to maintain sensitivity [6]. Performance analyses indicated that minor temporal offsets improved predictive accuracy, hinting that the network captures short-term correlations in the visual input.

Discussion Our findings show that a connectome-based autoencoder spanning the entire right optic lobe can reconstruct visual inputs while incorporating known feedback loops. By preserving the anatomical wiring [5,8], the model reveals how structural constraints inform function. Compared with approaches that highlight local motion detection [4] or rely on supervised learning [3], our unsupervised method uncovers emergent coding without explicit tasks. Although deep-layer neurons were only moderately active, their engagement suggests that hierarchical processing aids reconstruction [2].
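The core idea of the Methods — an autoencoder whose weights exist only where the connectome has a synapse, with photoreceptors serving as both input and readout — can be sketched as follows. This is an illustrative toy, not the study's implementation: the dimensions, random connectivity, tanh dynamics, and finite-difference update are all assumptions (the actual model has ~45,000 neurons and would use an autodiff framework).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; the real network has ~45,000 neurons (values here are illustrative).
n_neurons = 50
photoreceptors = np.arange(6)  # indices standing in for R1-R6

# Binary connectome mask: weight (i, j) may be nonzero only if a synapse j -> i exists.
mask = (rng.random((n_neurons, n_neurons)) < 0.1).astype(float)
W = 0.1 * rng.standard_normal((n_neurons, n_neurons)) * mask

def forward(stimulus, steps=5):
    """Inject the stimulus at the photoreceptors, let activity propagate through
    the (recurrent) masked weights, and read out the same photoreceptor cells."""
    x = np.zeros(n_neurons)
    for _ in range(steps):
        x = np.tanh(W @ x)
        x[photoreceptors] += stimulus   # photoreceptors as input nodes
    return x[photoreceptors]            # photoreceptors as output nodes

def train_step(stimulus, lr=1e-2):
    """One gradient-descent step on the reconstruction MSE, computed here by
    finite differences purely for illustration. Updates stay on anatomical
    synapses because the gradient is re-masked before it is applied."""
    global W
    base = np.mean((forward(stimulus) - stimulus) ** 2)
    grad = np.zeros_like(W)
    eps = 1e-5
    for i, j in zip(*np.nonzero(mask)):
        W[i, j] += eps
        grad[i, j] = (np.mean((forward(stimulus) - stimulus) ** 2) - base) / eps
        W[i, j] -= eps
    W -= lr * grad * mask
    return base
```

Masking the weight update, rather than the weights alone, is what keeps the trained network consistent with the anatomical wiring: a synapse absent from the connectome can never acquire a nonzero weight.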
Future studies could dissect subnetworks for contrast gain or motion detection to clarify how feedback refines perception [1,6].
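The layer-wise activity analysis above groups neurons by their shortest synaptic path length from the photoreceptors; a minimal breadth-first-search sketch of that grouping is shown below. The dictionary-based graph and the cell names are toy assumptions, not the study's data format.

```python
from collections import deque

def topological_distances(adjacency, sources):
    """Return each neuron's shortest synaptic path length from the source set
    (e.g. photoreceptors); neurons unreachable from the sources get -1.
    `adjacency` maps a presynaptic neuron id to its postsynaptic targets."""
    dist = {node: -1 for node in adjacency}
    queue = deque()
    for s in sources:
        dist[s] = 0
        queue.append(s)
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if dist.get(v, -1) == -1:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```

Binning neurons by these distances gives the "topological distance from the photoreceptors" axis along which activity was tracked through deeper layers.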
Acknowledgements This work has been supported by the Mohammed bin Salman Center for Future Science and Technology for Saudi-Japan Vision 2030 at The University of Tokyo (MbSC2030) and JSPS KAKENHI Grant Number 23K25257.

References
[1] https://doi.org/10.7554/eLife.57443
[2] https://doi.org/10.1146/annurev-neuro-080422-111929
[3] https://doi.org/10.1038/s41586-024-07939-3
[4] https://doi.org/10.1016/j.cub.2015.07.014
[5] https://doi.org/10.1038/s41592-021-01330-0
[6] https://doi.org/10.1371/journal.pbio.1002115
[7] https://doi.org/10.1007/s00359-019-01375-9
[8] https://doi.org/10.1038/s41586-024-07558-y