6.5DCMay 15
Scalable Construction of Spiking Neural Networks using up to thousands of GPUsBruno Golosio, Gianmarco Tiddia, José Villamar et al.
Diverse scientific and engineering research areas deal with discrete, time-stamped changes in large systems of interacting delay differential equations. Simulating such complex systems at scale on high-performance computing clusters demands efficient management of communication and memory. Inspired by the human cerebral cortex -- a sparsely connected network of $\mathcal{O}(10^{10})$ neurons, each forming $\mathcal{O}(10^{3})$--$\mathcal{O}(10^{4})$ synapses and communicating via short electrical pulses called spikes -- we study the simulation of large-scale spiking neural networks for computational neuroscience research. This work presents a novel network construction method for multi-GPU clusters and upcoming exascale supercomputers using the Message Passing Interface (MPI), where each process builds its local connectivity and prepares the data structures for efficient spike exchange across the cluster during state propagation. We demonstrate scaling performance of two cortical models using point-to-point and collective communication, respectively.
2.0LGNov 23, 2023
Empirical Comparison between Cross-Validation and Mutation-Validation in Model SelectionJinyang Yu, Sami Hamdan, Leonard Sasse et al.
Mutation validation (MV) is a recently proposed approach for model selection, garnering significant interest due to its unique characteristics and potential benefits compared to the widely used cross-validation (CV) method. In this study, we empirically compared MV and $k$-fold CV using benchmark and real-world datasets. By employing Bayesian tests, we compared generalization estimates yielding three posterior probabilities: practical equivalence, CV superiority, and MV superiority. We also evaluated the differences in the capacity of the selected models and computational efficiency. We found that both MV and CV select models with practically equivalent generalization performance across various machine learning algorithms and the majority of benchmark datasets. MV exhibited advantages in terms of selecting simpler models and lower computational costs. However, in some cases MV selected overly simplistic models leading to underfitting and showed instability in hyperparameter selection. These limitations of MV became more evident in the evaluation of a real-world neuroscientific task of predicting sex at birth using brain functional connectivity.
Enhancing Monocular Depth Estimation with Multi-Source Auxiliary TasksAlessio Quercia, Erenus Yildiz, Zhuo Cao et al.
Monocular depth estimation (MDE) is a challenging task in computer vision, often hindered by the cost and scarcity of high-quality labeled datasets. We tackle this challenge using auxiliary datasets from related vision tasks for an alternating training scheme with a shared decoder built on top of a pre-trained vision foundation model, while giving a higher weight to MDE. Through extensive experiments we demonstrate the benefits of incorporating various in-domain auxiliary datasets and tasks to improve MDE quality on average by ~11%. Our experimental analysis shows that auxiliary tasks have different impacts, confirming the importance of task selection, highlighting that quality gains are not achieved by merely adding data. Remarkably, our study reveals that using semantic segmentation datasets as Multi-Label Dense Classification (MLDC) often results in additional quality gains. Lastly, our method significantly improves the data efficiency for the considered MDE datasets, enhancing their quality while reducing their size by at least 80%. This paves the way for using auxiliary data from related tasks to improve MDE quality despite limited availability of high-quality labeled data. Code is available at https://jugit.fz-juelich.de/ias-8/mdeaux.
10.2CVMar 11, 2025
1LoRA: Summation Compression for Very Low-Rank AdaptationAlessio Quercia, Zhuo Cao, Arya Bangun et al.
Parameter-Efficient Fine-Tuning (PEFT) methods have transformed the approach to fine-tuning large models for downstream tasks by enabling the adjustment of significantly fewer parameters than those in the original model matrices. In this work, we study the "very low rank regime", where we fine-tune the lowest amount of parameters per linear layer for each considered PEFT method. We propose 1LoRA (Summation Low-Rank Adaptation), a compute, parameter and memory efficient fine-tuning method which uses the feature sum as fixed compression and a single trainable vector as decompression. Differently from state-of-the-art PEFT methods like LoRA, VeRA, and the recent MoRA, 1LoRA uses fewer parameters per layer, reducing the memory footprint and the computational cost. We extensively evaluate our method against state-of-the-art PEFT methods on multiple fine-tuning tasks, and show that our method not only outperforms them, but is also more parameter, memory and computationally efficient. Moreover, thanks to its memory efficiency, 1LoRA allows to fine-tune more evenly across layers, instead of focusing on specific ones (e.g. attention layers), improving performance further.
4.8NEFeb 28, 2022
Exploring hyper-parameter spaces of neuroscience models on high performance computers with Learning to LearnAlper Yegenoglu, Anand Subramoney, Thorsten Hater et al.
Neuroscience models commonly have a high number of degrees of freedom and only specific regions within the parameter space are able to produce dynamics of interest. This makes the development of tools and strategies to efficiently find these regions of high importance to advance brain research. Exploring the high dimensional parameter space using numerical simulations has been a frequently used technique in the last years in many areas of computational neuroscience. High performance computing (HPC) can provide today a powerful infrastructure to speed up explorations and increase our general understanding of the model's behavior in reasonable times.
2.3DCJul 29, 2019
Staged deployment of interactive multi-application HPC workflowsWouter Klijn, Sandra Diaz-Pier, Abigail Morrison et al.
Running scientific workflows on a supercomputer can be a daunting task for a scientific domain specialist. Workflow management solutions (WMS) are a standard method for reducing the complexity of application deployment on high performance computing (HPC) infrastructure. We introduce the design for a middleware system that extends and combines the functionality from existing solutions in order to create a high-level, staged user-centric operation/deployment model. This design addresses the requirements of several use cases in the life sciences, with a focus on neuroscience. In this manuscript we focus on two use cases: 1) three coupled neuronal simulators (for three different space/time scales) with in-transit visualization and 2) a closed-loop workflow optimized by machine learning, coupling a robot with a neural network simulation. We provide a detailed overview of the application-integrated monitoring in relationship with the HPC job. We present here a novel usage model for large scale interactive multi-application workflows running on HPC systems which aims at reducing the complexity of deployment and execution, thus enabling new science.
Closed loop interactions between spiking neural network and robotic simulators based on MUSIC and ROSPhilipp Weidel, Mikael Djurfeldt, Renato Duarte et al.
In order to properly assess the function and computational properties of simulated neural systems, it is necessary to account for the nature of the stimuli that drive the system. However, providing stimuli that are rich and yet both reproducible and amenable to experimental manipulations is technically challenging, and even more so if a closed-loop scenario is required. In this work, we present a novel approach to solve this problem, connecting robotics and neural network simulators. We implement a middleware solution that bridges the Robotic Operating System (ROS) to the Multi-Simulator Coordinator (MUSIC). This enables any robotic and neural simulators that implement the corresponding interfaces to be efficiently coupled, allowing real-time performance for a wide range of configurations. This work extends the toolset available for researchers in both neurorobotics and computational neuroscience, and creates the opportunity to perform closed-loop experiments of arbitrary complexity to address questions in multiple areas, including embodiment, agency, and reinforcement learning.