Monday July 7, 2025 12:10 - 12:30 CEST, Auditorium - Plenary Room
Identifying Dynamic-based Closed-loop Targets for Speech Processing Cochlear Implants

Cynthia Steinhardt*1, Menoua Keshishian2, Kim Stachenfeld1,3, Larry Abbott1
1 Center for Theoretical Neuroscience, Zuckerman Brain Science Institute, Columbia University, New York, New York USA
2 Department of Electrical Engineering, Columbia University, New York, New York USA
3 Google DeepMind, London, United Kingdom


*Email: cs4248@columbia.edu



Introduction
Since the development of the first cochlear implant (CI) in 1957, over one million people have used these devices to regain hearing. However, CIs have a number of deficits, such as low efficacy in noise, and these deficits remain poorly understood [1]. CI algorithm research has focused on optimizing single-neuron, voltage-driven activation in the cochlea based on low-level auditory modeling, but little work has addressed known features of hierarchical speech processing across the brain [2]. We create a model system to investigate how CI-encoded speech affects phoneme and word comprehension, uncovering a dynamics-based signature for potential closed-loop CI applications.
Methods
We trained a DeepSpeech2 [3] model to convert spectrograms to phonemes using CTC loss, with speech inputs sourced from the LibriSpeech dataset. Speech was either processed via the AB Generic Toolbox [4] to generate electrodograms, creating CI-transformed inputs, or given directly to the model to simulate natural hearing. The model, trained on natural spectrograms, was then tested on CI-transformed inputs. Behavioral experiments on the model were compared to human results. We analyzed phoneme-processing dynamics, using a distance metric to characterize convergence patterns, and tested dynamic signatures for feedback control [5].
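
For concreteness, the sketch below shows one way such a training setup could look in PyTorch. It is a minimal illustration, not the authors' code; the architecture sizes, spectrogram parameters, and phoneme inventory size are assumptions.

```python
# Minimal sketch of the training setup described above (not the authors' code).
# Layer sizes, spectrogram parameters, and n_phonemes are illustrative only.
import torch
import torch.nn as nn
import torchaudio

# Log-mel front end; LibriSpeech audio is sampled at 16 kHz
to_spec = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=80)

class SmallDeepSpeech2(nn.Module):
    """DeepSpeech2-style stack: conv front end, recurrent core, linear readout."""
    def __init__(self, n_mels=80, n_phonemes=40):
        super().__init__()
        self.conv = nn.Conv2d(1, 32, kernel_size=(11, 11),
                              stride=(2, 1), padding=(5, 5))
        self.rnn = nn.GRU(32 * (n_mels // 2), 256, num_layers=3,
                          bidirectional=True, batch_first=True)
        self.readout = nn.Linear(512, n_phonemes + 1)  # +1 for the CTC blank

    def forward(self, spec):                    # spec: (batch, n_mels, time)
        x = self.conv(spec.unsqueeze(1))        # (batch, 32, n_mels//2, time)
        x = x.permute(0, 3, 1, 2).flatten(2)    # (batch, time, features)
        x, _ = self.rnn(x)                      # hidden dynamics analyzed below
        return self.readout(x).log_softmax(-1)  # CTC expects log-probabilities

model = SmallDeepSpeech2()
ctc_loss = nn.CTCLoss(blank=40)  # blank index matches the extra readout unit
# Training step: nn.CTCLoss takes log-probs shaped (time, batch, classes)
# loss = ctc_loss(model(spec).transpose(0, 1), targets,
#                 input_lengths, target_lengths)
```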
Results
Our model exhibited human-like increases in phoneme reaction time with CI-transformed inputs and with added noise; its phoneme confusions and word errors also mirrored human behavior [5]. Analysis revealed a specific time window in each layer during which correct phoneme comprehension dynamics converged for all phonemes, with increasing delays deeper in the network. We created a representation-distance metric, based on a Wasserstein distance between dynamics during comprehension, and found that it correlated (up to 0.78) with the model's behavioral confusion while it processed these phonemes in sentences. Using a linear closed-loop controller, we then successfully pushed dynamics toward correct phoneme perception, using this converged representation as the target.
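
As one concrete (assumed) reading of such a metric, the snippet below averages one-dimensional Wasserstein distances between per-unit hidden-state trajectories; the authors' exact construction may differ.

```python
# Hypothetical sketch of a representation-distance metric: per-unit 1-D
# Wasserstein distances between two hidden-state trajectories, averaged
# across units. The paper's exact definition may differ.
import numpy as np
from scipy.stats import wasserstein_distance

def representation_distance(traj_a, traj_b):
    """traj_a, traj_b: (time, units) hidden activations for two presentations."""
    n_units = traj_a.shape[1]
    dists = [wasserstein_distance(traj_a[:, u], traj_b[:, u])
             for u in range(n_units)]
    return float(np.mean(dists))

# Example: distance between dynamics for a natural vs. CI-transformed phoneme
# d = representation_distance(hidden_natural, hidden_ci)
```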
Discussion
This study presents a plausible model of speech perception with and without a CI, validated against human data. We identify a dynamic signature that predicts comprehension or confusion within 100 ms, a feasible window for intervention. We demonstrate its use for closed-loop feedback and find evidence of human EEG evoked responses with similar dynamics [6], suggesting a potential EEG-based method for CI parameter selection. These results support a new cochlear implant paradigm: instead of mimicking cochlear processing, we determine pulse parameters that drive desired population-level neural representations of speech. This approach may generalize to other neural implants as our understanding of those systems improves.
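
To illustrate the control idea, the sketch below implements a simple proportional (linear) feedback step toward a converged target representation; the gain and state interface are assumptions, not the paper's values.

```python
# Hypothetical linear closed-loop step: proportionally nudge a layer's hidden
# state toward the converged target representation for the intended phoneme.
import numpy as np

def closed_loop_input(h, h_target, gain=0.1):
    """h, h_target: (units,) current and target hidden states.

    Returns a corrective input added to the layer at the next time step;
    repeated application drives the dynamics toward the target."""
    error = h_target - h      # deviation from the comprehension target
    return gain * error       # proportional (P-type) feedback
```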



Acknowledgements
We thank the Simons Society of Fellows (965377), Gatsby Charitable Trust (GAT3708), Kavli Foundation, and NIH (R01NS110893) for support.
References
1. Boisvert, I., et al. (2020). CI outcomes in adults. PLoS One, 15(5), e0232421.
2. Rødvik, A. K., et al. (2018). CI vowel/consonant ID. J Speech Lang Hear Res, 61(4), 1023-1050.
3. Amodei, D., et al. (2015). Deep Speech 2. arXiv:1512.02595.
4. Jabeim, A. (2024). AB-Generic-Python-Toolbox. GitHub.
5. Steinhardt, C. R., et al. (2024). DeepSpeech CI performance. arXiv:2407.20535.
6. Finke, M., et al. (2017). Stimulus effects on CI users. Audiol Neurotol, 21(5), 305-315.

