Metasimulation and Workflow Automation with Application in Computational Neuroscience
Abstract
Increasing model complexity, higher numbers of simulations, and improved computational resources make manual methods a bottleneck in computational neuroscience research. We present four progressive essays that describe new mental constructs and software tools that automate daily simulation workflow; facilitate access to servers, clusters, and cloud; automate access to a public experimental database; and form an "investigation database" that links input parameter vectors, experimental results, and post-simulation analyses.
In Essay I, "NeuroManager: A workflow analysis based simulation management engine for computational neuroscience", we developed NeuroManager, an object-oriented simulation management software engine for computational neuroscience. NeuroManager automates the workflow of simulation job submissions when using heterogeneous computational resources, simulators, and simulation tasks. The object-oriented approach 1) provides flexibility to adapt to a variety of neuroscience simulators, 2) simplifies the use of heterogeneous computational resources, from desktops to supercomputer clusters, and 3) improves tracking of simulator/simulation evolution. We implemented NeuroManager in MATLAB, a widely used engineering and scientific language, for its signal and image processing tools, its prevalence in electrophysiology analysis, and its increasing use in college biology education. To design and develop NeuroManager we analyzed the workflow of simulation submission for a variety of simulators, operating systems, and computational resources, including the handling of input parameters, data, models, results, and analyses. This analysis resulted in twenty-two stages of simulation submission workflow. The software incorporates progress notification; automatic organization, labeling, and time-stamping of data and results; and integrated access to MATLAB's analysis and visualization tools. NeuroManager provides users with the tools to automate daily tasks, and assists principal investigators in tracking and recreating the evolution of research projects performed by multiple people. Overall, NeuroManager provides the infrastructure needed to improve workflow, manage multiple simultaneous simulations, and maintain provenance of the potentially large amounts of data produced during the course of a research project.
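NeuroManager itself is written in MATLAB, and its class interfaces are not reproduced here; the short Python sketch below is only an illustration of the staged-submission idea described above, with invented stage, resource, and job names.

```python
# Illustrative sketch only (not NeuroManager's API): a simulation-submission
# pipeline expressed as ordered workflow stages applied to each parameter set.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SimJob:
    """One simulation: an input parameter vector plus bookkeeping."""
    params: Dict[str, float]
    resource: str                        # e.g. "local", "cluster", "cloud"
    log: List[str] = field(default_factory=list)


def stage(name: str):
    """Wrap a stage function so each job records which stages it passed."""
    def deco(fn: Callable[[SimJob], None]) -> Callable[[SimJob], None]:
        def wrapped(job: SimJob) -> None:
            fn(job)
            job.log.append(name)         # time-stamping omitted for brevity
        return wrapped
    return deco


@stage("stage input files")
def stage_inputs(job: SimJob) -> None:
    pass                                 # write parameter files to the resource


@stage("submit simulation")
def submit(job: SimJob) -> None:
    pass                                 # e.g. sbatch/qsub/ssh per resource type


@stage("collect and label results")
def collect(job: SimJob) -> None:
    pass                                 # pull results back, tag with job metadata


WORKFLOW = [stage_inputs, submit, collect]   # the real engine defines ~22 stages


def run(jobs: List[SimJob]) -> None:
    """Drive every job through every stage in order."""
    for job in jobs:
        for step in WORKFLOW:
            step(job)
```

A usage example would be `run([SimJob(params={"gNa": 120.0}, resource="cluster")])`; in NeuroManager these concerns are expressed as MATLAB classes rather than functions, which is what gives the engine its flexibility across simulators and resources.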
Essay II, "Power-Law Dynamics of Membrane Conductances Increase Spiking Diversity in a Hodgkin--Huxley Model", describes a collaborative effort in which Dr. Wondimu Teka used the NeuroManager software we had developed to perform many complex, multi--day computer simulations on remote clusters for studying fractional conductances in a neuron model. We collaborated daily with Dr. Teka to improve NeuroManager's performance, usability, clarity, transparency, and efficiency. As a result, NeuroManager was advanced and tested in an intense computational neuroscience environment.
Essay II Abstract: We studied the effects of non-Markovian power-law voltage-dependent conductances on the generation of action potentials and spiking patterns in a Hodgkin-Huxley model. To implement slow-adapting power-law dynamics of the gating variables of the potassium, n, and sodium, m and h, conductances we used fractional derivatives of order
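For context, power-law gating dynamics of this kind are commonly written with a fractional time derivative; the sketch below assumes the Caputo definition and uses η as a placeholder symbol for the fractional order.

```latex
% Sketch of fractional-order gating kinetics (symbols are illustrative).
% For \eta = 1 the classical first-order Hodgkin-Huxley kinetics are recovered.
\begin{align}
  \frac{d^{\eta} x}{dt^{\eta}} &= \frac{x_{\infty}(V) - x}{\tau_{x}(V)},
      \qquad x \in \{n,\, m,\, h\}, \\[4pt]
  \frac{d^{\eta} x(t)}{dt^{\eta}} &=
      \frac{1}{\Gamma(1-\eta)} \int_{0}^{t} \frac{x'(s)}{(t-s)^{\eta}}\, ds,
      \qquad 0 < \eta < 1 .
\end{align}
```

Orders below one introduce power-law memory into the gating variables, which is the source of the slow adaptation described above.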
Essay III is entitled "Automating NEURON simulation deployment in cloud resources". Simulations in neuroscience are generally performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the OpenStack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Compute Cloud (EC2), based on Amazon's proprietary software. We describe the manual procedures and how to automate cloud operations. We then describe how we extended our simulation automation software, NeuroManager, so that the user can recruit private cloud, public cloud, HPC, and local servers simultaneously through a simple common interface. We conclude with several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model.
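As an illustration of the kind of cloud operation being automated (this is not NeuroManager code), a minimal Python sketch using the openstacksdk cloud layer might look like the following; the cloud profile, image, flavor, key pair, user name, and file names are all placeholders, and field names can vary across SDK versions.

```python
# Minimal sketch (placeholders throughout): boot an OpenStack instance,
# run a NEURON simulation on it over SSH, fetch results, and tear it down.
import subprocess

import openstack                        # pip install openstacksdk

conn = openstack.connect(cloud="chameleon")        # credentials from clouds.yaml

server = conn.create_server(
    name="neuron-worker-01",
    image="neuron-ubuntu-image",        # image pre-built with NEURON installed
    flavor="m1.medium",
    key_name="my-keypair",
    wait=True,                          # block until the instance is ACTIVE
    auto_ip=True,                       # attach a floating IP we can ssh to
)
ip = server["public_v4"]                # field name may differ by SDK version
host = f"ubuntu@{ip}"

# Copy the model up, run it headless with nrniv, and pull the output back.
subprocess.run(["scp", "model.hoc", f"{host}:~/"], check=True)
subprocess.run(["ssh", host, "nrniv -nogui model.hoc"], check=True)
subprocess.run(["scp", f"{host}:~/results.dat", "."], check=True)

conn.delete_server(server.id, wait=True)           # release the instance
```

NeuroManager's extension hides this kind of sequence behind its common interface, so cloud, HPC, and local targets look alike to the user.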
Automating interactions with the increasing number and size of neuroscience databases is crucial to making their data useful for research and discovery. In Essay IV, "Integrating the Allen Brain Institute Cell Types Database into automated neuroscience workflow", we developed a suite of tools to download data from, extract features from, and organize the Allen Brain Institute (ABI) Cell Types Database in order to integrate its whole-cell patch clamp characterization data into the automated modeling/data analysis cycle. To expand the potential user base we employed both Python and MATLAB. The basic set of tools downloads selected raw data and extracts cell, sweep, and spike features using ABI's feature extraction code. To focus the experimental data and to minimize impact on the ABI site, we added a tool that builds a local, specialized database of raw data plus extracted features. Finally, to maximize automation, we extended our MATLAB-based NeuroManager workflow automation suite to include these tools plus a separate investigation database. The extended suite allows the user to integrate experimental data from the ABI into an automated workflow deployed on heterogeneous computing infrastructure, from local servers to high performance computing environments to the cloud. The previous version of NeuroManager focused on the management and deployment of simulations in computational neuroscience; the automated workflow now allows the combined use of external experimental data, managed simulations, and data analysis, and places workflow data such as input parameters, results, extracted features, analyses, and comparisons into a searchable, exploitable investigation database.
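A minimal Python sketch of the download-and-catalog step described above might use the AllenSDK's CellTypesCache; the feature keys pulled out and the single-table SQLite layout are illustrative assumptions, not the essay's actual tool layout or investigation-database schema.

```python
# Sketch: pull one cell's electrophysiology data and precomputed features from
# the Allen Cell Types Database, then record them in a small local database.
import sqlite3

from allensdk.core.cell_types_cache import CellTypesCache

ctc = CellTypesCache(manifest_file="cell_types/manifest.json")

cells = ctc.get_cells()                       # metadata for all available cells
specimen_id = cells[0]["id"]                  # pick one cell for the example

data_set = ctc.get_ephys_data(specimen_id)    # downloads the NWB file locally
sweep_numbers = data_set.get_sweep_numbers()

features = ctc.get_ephys_features()           # ABI's precomputed per-cell features
cell_feats = next(f for f in features if f["specimen_id"] == specimen_id)

with sqlite3.connect("investigation.db") as db:
    # Illustrative single-table layout; feature keys assumed from the ABI set.
    db.execute(
        """CREATE TABLE IF NOT EXISTS cell_features (
               specimen_id INTEGER PRIMARY KEY,
               n_sweeps    INTEGER,
               vrest_mv    REAL,
               tau_ms      REAL)"""
    )
    db.execute(
        "INSERT OR REPLACE INTO cell_features VALUES (?, ?, ?, ?)",
        (specimen_id, len(sweep_numbers), cell_feats["vrest"], cell_feats["tau"]),
    )
```

The local table stands in for the investigation database into which the extended suite places input parameters, results, extracted features, analyses, and comparisons.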
In the final essay, "Discussion," we put the entire research chain in perspective and discuss future possibilities.