# Graph Neural Networks and Generalizable Models in Neuroscience

Constructing generalizable models in neuroscience is challenging because neural systems are complex: they are dynamic (constantly changing) and composed of numerous interacting components that collectively produce emergent behaviors, that is, system properties not shared by the constituent elements of the system.

As a result, indistinguishable macroscopic states can arise from many distinct combinations of microscopic parameters, i.e. parameters relevant to lower scales of organization. Bottom-up approaches to modeling neural systems therefore often fail, since a large number of microscopic configurations can lead to the same observable functions, behaviors, and outcomes.

Because neural systems are highly degenerate and complex, their analysis is not amenable to many conventional algorithms. For example, observed correlations between individual neurons and behavioral states of an organism may not generalize to other organisms or even to repeated trials in the same individual. Hence, individual variability of neural dynamics remains poorly understood and is a fundamental obstacle to model development. Nevertheless, neural systems exhibit universal behaviors: individuals behave similarly despite being distinct. Motivated by the need for robust and generalizable analytical techniques, a number of research groups have recently applied tools from dynamical systems analysis to simple organisms in the hope of discovering a universal organizational principle underlying behavior. These studies, made possible by advances in whole-brain imaging, reveal that neural dynamics live on low-dimensional manifolds that map to behavioral states. This discovery implies that although microscopic neural dynamics differ between organisms, a macroscopic, global universal framework may enable generalizable algorithms in neuroscience. Yet the need for significant hand-engineered feature extraction in these studies underscores the potential of deep neural network models for scalable analyses of neural dynamics.

In a recent paper by our lab, we examined the performance and generalizability of deep learning models applied to the neural activity of the worm C. elegans. [This article is an adaptation of parts of our paper, and we refer the reader to it for a complete list of citations to the details and studies introduced below.] C. elegans is a canonical model organism in neuroscience for studying microscopic neural dynamics in an intact organism, because it remains the only organism whose connectome (the connectivity map of all 302 of its neurons and their synaptic connections) is completely known and well studied. In addition, the transparent body of these worms allows for calcium imaging of whole-brain neural activity. Building upon previous studies, we developed deep learning models that bridge recent advances in neuroscience and deep neural networks. We first demonstrated state-of-the-art performance for classifying motor action states: forward and reverse crawling of C. elegans, predicted from its neural activity. Next, we examined the generalization performance of our deep learning models on unseen worms, both within the same study and in worms from a separate study published years later. We were able to predict the behavior of individual worms from their recorded neural data. What’s more, we showed that graph neural networks, which can accommodate relational data structured as graphs and networks, exhibited a favorable inductive bias and outperformed other machine learning models on these tasks. The rest of this short article introduces and elaborates on each of these concepts.

To understand the brain as an engineered system, and to understand how the details that make up the brain’s ‘wetware’ come together to produce the amazing things the brain does, methods like graph neural networks are going to be increasingly necessary. We simply won’t be able to construct, or discover, the algorithms of the brain without them.

# Universality and Generalizability in Neuroscience

The motor action sequence of C. elegans is one of the only systems for which experiments on whole-brain microscopic neural activity can be performed and readily analyzed. Understandably, numerous efforts have focused on building models that accurately capture the hierarchical nature of neural dynamics and the resulting locomotive behaviors. One of these studies investigated the neural activity corresponding to a pirouette, a motor action sequence in which worms switch from forward to backward crawling, turn, and then continue forward crawling. The analysis showed that most of the variance (∼65%) in neural dynamics can be expressed by three components found through principal component analysis (PCA), and that the neural dynamics in the resulting latent space trace cyclical trajectories on well-defined low-dimensional manifolds corresponding to the motor action sequence. By identifying individual neurons, an experimental feat, the authors further determined that the topological structures in latent space were found universally among all five worms imaged in their study.
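To make the PCA step concrete, here is a minimal sketch of projecting a (time × neurons) activity matrix onto its top three principal components and reporting how much variance they capture. It is not the analysis code from the study: the function name `pca_explained_variance` is hypothetical, and the "recording" is synthetic data generated from a two-dimensional cyclic signal plus noise, chosen only so that the low-dimensional structure is visible.

```python
import numpy as np

def pca_explained_variance(activity, n_components=3):
    """Project (time x neurons) activity onto its top principal components.

    Returns the low-dimensional trajectory and the fraction of the total
    variance captured by each retained component.
    """
    centered = activity - activity.mean(axis=0)   # remove each neuron's mean
    # SVD of the centered data: the rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s**2 / (activity.shape[0] - 1)     # variance along each axis
    ratio = variance / variance.sum()
    trajectory = centered @ vt[:n_components].T   # (time x n_components)
    return trajectory, ratio[:n_components]

# Synthetic stand-in for calcium traces (NOT real data): 40 hypothetical
# neurons driven by a shared 2-D cyclic latent signal plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 20 * np.pi, 2000)
latent = np.column_stack([np.sin(t), np.cos(t)])
mixing = rng.normal(size=(2, 40))
activity = latent @ mixing + 0.3 * rng.normal(size=(2000, 40))

trajectory, ratio = pca_explained_variance(activity)
print("trajectory shape:", trajectory.shape)
print("variance captured by 3 components:", ratio.sum())
```

Plotting the first two columns of `trajectory` against each other would show the kind of closed loop that the study describes as a cyclical trajectory in latent space.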

Building on this and subsequent work, another study found consistent differences between individuals’ neural dynamics, precluding the use of established dimensionality reduction techniques. For example, among 15 neurons uniquely identified across all 5 worms, only 3 displayed statistically consistent activity. As a result of these discrepancies, topological structures identified by performing PCA on each worm’s neural activity were no longer observed when data from all worms were pooled. To address this issue, the authors introduced a new algorithm, Asymmetric Diffusion Map Modeling (ADMM), which maps the neural activity of any worm to a universal manifold. ADMM first performs time-delay embedding of neural activity into phase space. Next, a transition probability matrix is constructed by calculating distances between points in phase space using a Gaussian kernel centered on the subsequent time step. Finally, this asymmetric diffusion map is used to construct a manifold representative of neural activity. In contrast to conventional dimensionality reduction techniques, ADMM allows quantitative modeling by mapping neural activity from the manifold, and enables the prediction of motor action states up to 30 seconds ahead. Despite its success, however, the algorithm relies heavily on hyperparameters, such as embedding parameters, which are difficult to justify and tune.
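The first two ADMM steps described above can be sketched as follows. This is one plausible reading of the prose, not the authors' implementation: the function names, the embedding parameters `dim` and `lag`, the kernel width `epsilon`, and the toy sinusoidal signal are all assumptions made for illustration, and the final manifold-construction step is omitted.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Stack lagged copies of a (time x channels) series into phase-space points."""
    n = series.shape[0] - (dim - 1) * lag
    return np.hstack([series[k * lag : k * lag + n] for k in range(dim)])

def asymmetric_transition_matrix(points, epsilon):
    """Row-stochastic matrix whose Gaussian kernel for row i is centered on
    the point that follows state i in time, rather than on state i itself."""
    successors = points[1:]     # x_{i+1}: where each state goes next
    candidates = points[:-1]    # x_j: states that can be transitioned to
    diff = successors[:, None, :] - candidates[None, :, :]
    sq_dist = (diff**2).sum(axis=-1)
    kernel = np.exp(-sq_dist / epsilon)
    return kernel / kernel.sum(axis=1, keepdims=True)  # normalize each row

# Toy three-channel signal standing in for neural activity (NOT real data).
t = 0.1 * np.arange(300)
series = np.column_stack([np.sin(t), np.sin(2 * t), np.cos(0.5 * t)])

points = delay_embed(series, dim=5, lag=2)   # dim and lag are assumed values
P = asymmetric_transition_matrix(points, epsilon=0.5)
print("embedded points:", points.shape, "transition matrix:", P.shape)
```

Centering the kernel on the successor is what makes the matrix asymmetric: row `i` puts most of its probability on states near where the system actually went next, so the matrix encodes the direction of the dynamics. The sketch also makes the hyperparameter problem visible, since `dim`, `lag`, and `epsilon` all have to be chosen by hand.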

# Graph Neural Networks

Graph neural networks (GNNs) are a class of neural networks that explicitly use graph structure during computation through message passing algorithms, in which features are passed along edges between nodes and then aggregated for each node. These networks were inspired by the success of convolutional neural networks in two-dimensional image processing, and by the failures encountered when extending conventional convolutional networks to non-Euclidean domains. In essence, because graphs can have arbitrary structure, i.e. are not restricted to, say, the regular lattice-like structure of rows and columns of pixels in an image, the inductive bias of convolutional neural networks (equivariance to translations) often breaks down when applied to graphs. Addressing this issue, an early work on GNNs showed that one-hop message passing approximates spectral convolutions on graphs. Subsequent technical works have examined the representational power of GNNs in relation to the Weisfeiler-Lehman isomorphism test, and the limitations of GNNs when learning graph moments. From an applied perspective, GNNs have been successful in a wide variety of domains, including relational inference, point cloud segmentation, and traffic forecasting.
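As an illustration of one-hop message passing, the sketch below implements a single layer using the symmetric normalization D^{-1/2}(A + I)D^{-1/2} popularized by graph convolutional networks. It is a bare NumPy toy with random, untrained weights, not the architecture from our paper; the function name `message_passing_layer` and the four-node path graph are made up for the example.

```python
import numpy as np

def message_passing_layer(adjacency, features, weight):
    """One round of one-hop message passing on an undirected graph.

    Each node receives a degree-normalized sum of its neighbors' features
    (and its own, via a self-loop), followed by a linear map and a ReLU.
    """
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)                          # node degrees (>= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    messages = norm_adj @ features                   # aggregate along edges
    return np.maximum(messages @ weight, 0.0)        # update, then ReLU

# Hypothetical 4-node path graph 0-1-2-3 with 3 features per node.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))
weight = rng.normal(size=(3, 2))   # untrained weights, for illustration only

hidden = message_passing_layer(adjacency, features, weight)
print("node embeddings:", hidden.shape)
```

Relabeling the nodes simply reorders the rows of the output. This permutation equivariance plays the role for graphs that translation equivariance plays for convolutions on images.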

# Relational Inference

Relational inference remains a longstanding challenge, with early works in neuroscience attempting to quantify correlations between neurons and neuronal activity. Modern approaches to relational inference employ graph neural networks, since their explicit reliance on graph structure forms a relational inductive bias. In particular, the work from our own paper summarized above was inspired by the Neural Relational Inference (NRI) model, a variational autoencoder whose encoder infers edges and whose decoder predicts the trajectory of each object in a system. By inferring edges, the NRI model explicitly captures interactions between objects and leverages the resulting graph as an inductive bias for various machine learning tasks. This model was successfully used to predict the trajectories of coupled Kuramoto oscillators, particles connected by springs, the pick-and-roll play in basketball, and motion capture visualizations. Subsequently, these authors developed Amortized Causal Discovery, a framework based on the NRI model that infers causal relations from time-dependent data.

# Deep Learning in Neuroscience

With the success of convolutional neural networks, researchers have applied deep learning to numerous areas in neuroscience, including MRI and connectome mapping, where algorithms seem to be able to predict disorders such as autism. Further leveraging the explicit graph structure of neural systems, several studies have successfully applied GNNs to tasks such as annotating cognitive states, and several frameworks based on graph neural networks have been proposed for analyzing functional MRI (fMRI) data. Similarly, brain-computer interfaces (BCIs) often need to decode macroscopic variables from measurements of neural activity. These studies often rely on fMRI or EEG data, which characterize neural activity at the population level, and have met with varying degrees of success. Here again, a central challenge for the field is developing algorithms that generalize to new individuals unseen during training.

# A Necessary Future for Systems Neuroscience

GNNs, like any technology, will have their own limitations. But this new subset of machine learning is still very much in its infancy, despite its growing list of successful and enabling applications. So who knows how far it will go, and what impact it will have on studying the brain? It will likely be significant. What’s clear is that to understand the brain as an engineered system, and to understand how the details that make up the brain’s ‘wetware’ come together to produce the amazing things the brain does, methods like GNNs are going to be increasingly necessary. We simply won’t be able to construct, or discover, the algorithms of the brain without them.