Self-Supervised Evolution Operator Learning for High-Dimensional Dynamical Systems

Giacomo Turri 1, Luigi Bonati 1, Kai Zhu2, Massimiliano Pontil1,3, and Pietro Novelli1

1 Istituto Italiano di Tecnologia
2 Zhejiang University
3 AI Centre, University College London

We introduce an encoder-only approach to learn the evolution operators of large-scale non-linear dynamical systems, such as those describing complex natural phenomena. Evolution operators are particularly well-suited for analyzing systems that exhibit complex spatio-temporal patterns and have become a key analytical tool across various scientific communities. As terabyte-scale weather datasets and simulation tools capable of running millions of molecular dynamics steps per day become commodities, our approach provides an effective tool to make sense of them from a data-driven perspective. At its core lies a remarkable connection between self-supervised representation learning methods and the recently established learning theory of evolution operators.

Introduction

Dynamical systems are crucial for understanding phenomena across various scientific disciplines, from physics and biology to climate science. Traditionally, these systems are modeled using differential equations derived from first principles. However, as systems grow in scale and complexity, this approach quickly becomes computationally intractable and difficult to interpret, hindering the study of large-scale phenomena. Simultaneously, advancements in data collection techniques and computational power have led to an explosion of available data from experiments and high-fidelity simulations. This abundance of data makes data-driven approaches increasingly appealing for studying complex dynamics, with machine learning becoming a dominant paradigm. While many data-driven methods excel at prediction, there remains a significant gap in approaches that offer interpretability, which is paramount for understanding why a system evolves in a certain way. This work introduces an encoder-only approach to learn evolution operators of large-scale non-linear dynamical systems, bridging self-supervised representation learning with the theory of evolution operators to provide interpretable insights into complex natural phenomena.

Results

We propose an encoder-only approach for learning evolution operators that is based on self-supervised contrastive learning and scales effectively to large dynamical systems. This approach reveals a deep connection between evolution operator learning and contrastive self-supervised representation learning schemes. The core idea is to model the density ratio between the joint and product distributions of time-lagged states with a bilinear form $\langle\varphi(x_{t}),P\varphi(x_{t+1})\rangle$, where $\varphi$ is a d-dimensional encoder and P is a linear predictor layer that approximates the action of the evolution operator E. Unlike traditional encoder-decoder schemes, which minimize reconstruction errors and require roughly twice the parameters, our encoder-only method prioritizes approximating the spectral decomposition of E over raw forecasting performance, as the main advantages of evolution operators stem from their spectral decomposition. The loss function directly optimizes the $L^{2}$ error between the density ratio and our bilinear model, and matches the negative VAMP-2 score when the predictor is optimal. Crucially, our approach avoids the computationally unwieldy and unstable matrix inversions in the loss computation that are common in other methods, relying instead on simple matrix multiplications that are efficient for GPU-based training. This makes the method broadly applicable, especially for interpretability and model reduction in scientific dynamical systems.
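The loss described above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the single-layer `encoder`, the random toy data, and the identity initialization of `P` are all assumptions standing in for the actual architecture and trajectory data. The key point is that the empirical $L^{2}$ objective decomposes into a term over time-consecutive (joint) pairs and a term over shuffled (product-measure) pairs, computed with plain matrix products and no matrix inversions:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, W):
    # Hypothetical stand-in for the learned encoder phi (e.g. a SchNet GNN
    # in the protein experiments): a single tanh layer for illustration.
    return np.tanh(x @ W)

def evolution_loss(phi_t, phi_lag, P):
    """Empirical L2 loss (up to an additive constant) between the density
    ratio and the bilinear model <phi(x_t), P phi(x_{t+1})>.

    phi_t, phi_lag: (N, d) arrays of encoded time-lagged pairs.
    """
    # Scores for all pairs: S[i, j] = <phi_t[i], P phi_lag[j]>
    S = phi_t @ P @ phi_lag.T
    joint = np.mean(np.diag(S))    # time-consecutive ("positive") pairs
    product = np.mean(S ** 2)      # all pairs, approximating the product measure
    return -2.0 * joint + product  # only matrix products, no inversions

# Toy usage on random data (shapes only; real inputs are trajectory snapshots).
N, D, d = 64, 10, 4
x_t = rng.normal(size=(N, D))
x_lag = rng.normal(size=(N, D))
W = rng.normal(size=(D, d)) / np.sqrt(D)
P = np.eye(d)
loss = evolution_loss(encoder(x_t, W), encoder(x_lag, W), P)
```

In a real training loop, `W` (the encoder weights) and `P` would be optimized jointly by gradient descent; the contrastive structure is visible in the diagonal (joint) versus full-matrix (product) terms of `S`.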

Applications

Protein folding dynamics

The Trp-Cage miniprotein, a widely studied benchmark for protein folding, is analyzed using our approach with a high-resolution molecular representation of all 144 heavy atoms, employing a SchNet graph neural network as the encoder. The leading eigenfunction $\Psi_{1}(x)=\langle q_{1},\varphi(x)\rangle$ strongly correlates with the system's root-mean-square deviation (RMSD) from the folded structure, confirming that $\Psi_{1}$ encodes the folding-to-unfolding transition. Clustering the molecular configurations according to the values of $\Psi_{1}$ clearly separates folded and unfolded ensembles. Analysis through a sparse LASSO model reveals a network of hydrogen bonds stabilizing the folded state, including contributions from side-chain interactions that previous coarse-grained models miss. The implied timescale $\tau_{1}$ derived from the leading eigenvalue of the learned operator is approximately $2.5~\mu s$, which is higher than the $2~\mu s$ timescale obtained by other methods, suggesting a better approximation of the true slow dynamics.
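The implied timescale $\tau_{1}$ quoted above comes from the standard relation $\tau_{k} = -\Delta t / \ln|\lambda_{k}|$ between the eigenvalues of a transfer-operator model and the system's relaxation timescales. The sketch below computes it from a toy diagonal predictor; the matrix and lag time are illustrative assumptions, not the fitted operator from the paper:

```python
import numpy as np

def implied_timescales(P, lag_time):
    """Implied timescales tau_k = -lag / ln|lambda_k| from the eigenvalues
    of a learned predictor P. The stationary eigenvalue (modulus ~1) is
    discarded, since it corresponds to an infinite timescale."""
    mods = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    mods = mods[(mods < 1.0 - 1e-9) & (mods > 0.0)]
    return -lag_time / np.log(mods)

# Toy predictor with eigenvalues 1.0 (stationary process), 0.8, and 0.5.
P = np.diag([1.0, 0.8, 0.5])
taus = implied_timescales(P, lag_time=10.0)  # same time units as the lag
```

The slowest non-stationary eigenvalue dominates: here the leading timescale is $-10/\ln 0.8 \approx 44.8$ time units. In the Trp-Cage analysis, the analogous leading timescale corresponds to the folding-unfolding transition.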

Ligand binding dynamics

We also apply our method to ligand binding dynamics, specifically to Calixarene host-guest systems. The eigenfunctions $\Psi_{1}$ and $\Psi_{2}$ effectively capture ligand transitions between unbound, semi-bound, and bound states. The model learns these dynamics both from scratch and by transferring representations from other ligands, with the transferred representations closely matching those learned from scratch. This demonstrates the method's capability to provide insights into complex molecular interaction processes.

Climate Patterns

In our experiment on climate patterns, we apply our method to the El NiƱo-Southern Oscillation (ENSO), a major driver of global climate variability. Our approach aims to identify and characterize the spatio-temporal patterns associated with ENSO events. The leading eigenfunctions of the learned evolution operator are highly correlated with the ENSO phenomenon. This application demonstrates the potential of our framework for extracting meaningful and interpretable patterns from high-dimensional climate data, offering a data-driven perspective on complex Earth system dynamics.

Citation

@article{turri2025self,
    title={Self-Supervised Evolution Operator Learning for High-Dimensional Dynamical Systems},
    author={Turri, Giacomo and Bonati, Luigi and Zhu, Kai and Pontil, Massimiliano and Novelli, Pietro},
    year={2025}
}