Florian Rupprecht*¹, Jason Kai¹, Biraj Shrestha¹, Steven Giavasis¹, Tristan Glatard², Michael P. Milham¹, Gregory Kiar¹
¹Center for Data Analytics, Innovation, and Rigor, Child Mind Institute, New York, USA
²Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, Canada
*Email: florian.rupprecht@childmind.org
Introduction Python and R dominate scientific computing, yet neuroimaging relies heavily on command-line interfaces (CLIs). Manually writing wrappers for all tools across packages like FSL [1], AFNI [2], and FreeSurfer [3] is time-consuming and error-prone. Existing solutions (Nipype [4], Snakemake [5]) are language-specific and tightly coupled to workflow engines. We present Styx, a compiler generating type-safe, language-native wrappers from tool metadata, and NiWrap, a collection of 1900+ neuroimaging tool descriptions, enabling seamless CLI integration across programming languages.
Methods Styx uses a three-phase compiler: frontends parse Boutiques [6]/CWL [7] descriptors, an intermediate representation enables optimization, and backends generate Python/R/TypeScript code. We extended Boutiques with regular-grammar concepts (hierarchy, alternation, repetition) to model complex CLIs such as FSL's fslmaths. For NiWrap, we extracted metadata directly from MRTrix3 [8]/Workbench [9] source code (100% coverage) and used LLM-assisted generation for the other packages (60-95% coverage). The runtime supports local, Docker, and Singularity execution with sandboxed I/O.
Results Generated wrappers provide IDE autocompletion, static type checking, and consistent APIs across languages. Coverage (tools described/tools available): Workbench (202/202), MRTrix3 (115/121), AFNI (565/611), FreeSurfer (707/800), FSL (239/310), ANTs [10] (71/113). A case study comparing Bash, Nipype, Snakemake, and Styx pipelines showed that Styx requires only base-language knowledge while maintaining type safety. Because it is decoupled from workflow engines, Styx integrates with any parallel processing framework.
Discussion Styx bridges legacy neuroimaging tools with modern programming practices without imposing workflow patterns. The compiler architecture ensures maintainability: improvements propagate to all tools and languages automatically. Limitations include no built-in distributed computing and challenges in automated testing. Future work includes CWL support and content-aware interfaces. Styx provides a sustainable solution for modernizing scientific software ecosystems while preserving validated legacy implementations.
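As a rough illustration of the three-phase design described in Methods, the following Python sketch passes a toy descriptor through a frontend, an intermediate representation, and a Python backend. It is not Styx's implementation or API: the names Param, Tool, TYPE_MAP, parse_descriptor, and emit_python are hypothetical, the descriptor is a heavily simplified subset loosely modeled on Boutiques fields, and argument order is taken from the input list rather than from a command-line template.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Toy descriptor, loosely modeled on a small subset of Boutiques fields.
# Assumption: real descriptors carry far more information than this.
DESCRIPTOR = {
    "name": "bet",
    "inputs": [
        {"id": "infile", "type": "File"},
        {"id": "outfile", "type": "String"},
        {"id": "frac", "type": "Number",
         "command-line-flag": "-f", "optional": True},
    ],
}

# Hypothetical mapping from descriptor types to Python type names.
TYPE_MAP = {"File": "str", "String": "str", "Number": "float"}


@dataclass
class Param:
    """Intermediate representation of one CLI parameter."""
    name: str
    py_type: str
    flag: Optional[str]
    optional: bool


@dataclass
class Tool:
    """Intermediate representation of one CLI tool."""
    name: str
    params: list[Param]


def parse_descriptor(desc: dict) -> Tool:
    """Frontend: turn a descriptor dict into the IR."""
    params = [
        Param(
            name=inp["id"],
            py_type=TYPE_MAP[inp["type"]],
            flag=inp.get("command-line-flag"),
            optional=inp.get("optional", False),
        )
        for inp in desc["inputs"]
    ]
    return Tool(name=desc["name"], params=params)


def emit_python(tool: Tool) -> str:
    """Backend: emit source for a typed function that builds the argv list."""
    required = [p for p in tool.params if not p.optional]
    optional = [p for p in tool.params if p.optional]
    sig = [f"{p.name}: {p.py_type}" for p in required]
    sig += [f"{p.name}: Optional[{p.py_type}] = None" for p in optional]

    lines = [
        "from typing import Optional",
        "",
        f"def {tool.name}({', '.join(sig)}) -> list[str]:",
        f"    args = [{tool.name!r}]",
    ]
    for p in tool.params:
        indent = "    "
        if p.optional:
            # Optional parameters are only emitted when a value is given.
            lines.append(f"    if {p.name} is not None:")
            indent = "        "
        if p.flag:
            lines.append(f"{indent}args.append({p.flag!r})")
        lines.append(f"{indent}args.append(str({p.name}))")
    lines.append("    return args")
    return "\n".join(lines) + "\n"


# Compile the descriptor, load the generated wrapper, and build a command line.
source = emit_python(parse_descriptor(DESCRIPTOR))
namespace: dict = {}
exec(source, namespace)
argv = namespace["bet"]("T1.nii.gz", "T1_brain.nii.gz", frac=0.4)
print(source)
print(argv)
```

In this sketch the generated function only assembles an argument list; actually running the tool (locally or in a container) would be left to a separate runtime layer. Adding a backend for another language would mean writing another emitter over the same Tool structure, which is the property the compiler design relies on.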
Supported by NIMH Award RF1MH130859.
[1] Jenkinson et al. (2012). FSL. NeuroImage 62(2):782-790
[2] Cox (1996). AFNI. Comput Biomed Res 29(3):162-173
[3] Fischl (2012). FreeSurfer. NeuroImage 62(2):774-781
[4] Gorgolewski et al. (2011). Nipype. doi:10.3389/fninf.2011.00013
[5] Köster & Rahmann (2018). Snakemake. Bioinformatics 34(20):3600
[6] Glatard et al. (2018). Boutiques. GigaScience 7(5)
[7] Amstutz et al. (2016). CWL. doi:10.6084/m9.figshare.3115156.v2
[8] Tournier et al. (2019). MRTrix3. NeuroImage 202:116137
[9] Marcus et al. (2013). Workbench. NeuroImage 80:202-219
[10] Avants et al. (2009). ANTs. Insight J 2(365):1-35