Distributional Autoencoders Know the Score

The encoding for the Müller-Brown potential

Abstract

This work presents novel and desirable properties of a recently introduced class of autoencoders - the Distributional Principal Autoencoder (DPA) - which combines distributionally correct reconstruction with principal-components-like interpretability of the encodings. First, we show formally that the level sets of the encoder orient themselves exactly with respect to the score of the data distribution. This both explains the method's often remarkable performance in disentangling the factors of variation of the data and opens up the possibility of recovering the data distribution from samples alone. In settings where the score itself has physical meaning - such as when the data obeys the Boltzmann distribution - we demonstrate that the method can recover scientifically important quantities such as the minimum free energy path. Second, we prove that if the data lies on a manifold that can be approximated by the encoder, the components of the optimal encoder beyond the dimension of the manifold carry no additional information about the data distribution. This suggests potentially new ways of determining the number of relevant dimensions of the data. The results thus demonstrate that the DPA elegantly combines two often disparate goals of unsupervised learning: learning the data distribution and learning the intrinsic dimensionality of the data.
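For context, and not as a result specific to the paper: the score referred to here is the gradient of the log-density, and under a Boltzmann distribution with potential energy $U$ it reduces to the scaled negative gradient of the potential, i.e. the physical force. This is why aligning the encoder's level sets with the score acquires physical meaning in such settings.

```latex
% Standard definitions (not taken from the paper):
% the score of a density p, and its form under a Boltzmann distribution.
\[
  s(x) \;=\; \nabla_x \log p(x),
  \qquad
  p(x) \;\propto\; e^{-U(x)/k_B T}
  \;\;\Longrightarrow\;\;
  s(x) \;=\; -\frac{\nabla_x U(x)}{k_B T}.
\]
```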

Type: Publication
On arXiv

For a recently introduced class of autoencoders, we prove that the encoder's level sets orient themselves exactly with respect to the score of the data distribution, and that if the data lies on a lower-dimensional manifold, the extra encoding dimensions carry no additional information. The former explains the method's strong performance on data from physical simulations, where it can recover the minimum free energy path; the latter opens up possibilities such as testing for the manifold dimensionality of the data.
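To make the setting of the figure above concrete, the following is a minimal sketch of the Müller-Brown potential and the score of its Boltzmann density. The parameter values are the ones commonly quoted in the literature, and the temperature kT is an arbitrary illustrative choice; none of this code is taken from the paper.

```python
import numpy as np

# Commonly quoted Mueller-Brown parameters (not taken from the paper);
# the potential is a sum of four anisotropic Gaussian wells/bumps.
A  = np.array([-200.0, -100.0, -170.0, 15.0])
a  = np.array([-1.0, -1.0, -6.5, 0.7])
b  = np.array([0.0, 0.0, 11.0, 0.6])
c  = np.array([-10.0, -10.0, -6.5, 0.7])
x0 = np.array([1.0, 0.0, -0.5, -1.0])
y0 = np.array([0.0, 0.5, 1.5, 1.0])

def potential(x, y):
    """Mueller-Brown potential U(x, y)."""
    dx, dy = x - x0, y - y0
    return np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2))

def score(x, y, kT=15.0):
    """Score of the Boltzmann density p ~ exp(-U/kT): grad log p = -grad U / kT.
    The temperature kT is an illustrative choice, not a value from the paper."""
    dx, dy = x - x0, y - y0
    g = A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)
    dU_dx = np.sum(g * (2 * a * dx + b * dy))
    dU_dy = np.sum(g * (b * dx + 2 * c * dy))
    return np.array([-dU_dx, -dU_dy]) / kT

# The score points toward higher probability, i.e. lower potential energy.
print(potential(-0.55, 1.45))  # near one of the minima
print(score(0.0, 0.0))
```

The score field computed this way is exactly the quantity with which, by the first result above, the optimal encoder's level sets align.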

Andrej Leban
Ph.D. Student