Many methods for statistical inference and generative modeling rely on a probability divergence to effectively compare two probability distributions. In that context, the Wasserstein distance has been an appealing choice, but it suffers from important computational and statistical limitations in large-scale settings. Several alternatives have therefore been proposed, including the Sliced-Wasserstein distance (SW), a metric that has been increasingly used in practice due to its computational benefits. However, there is little work regarding its theoretical properties. In this talk, we will further explore the use of SW in modern statistical and machine learning problems, with a twofold objective: 1) provide new theoretical insights to understand in depth SW-based algorithms, and 2) design novel tools inspired by SW to improve its applicability and scalability. We first prove a set of asymptotic properties of the estimators obtained by minimizing SW, as well as a central limit theorem whose convergence rate is dimension-free. We also design a novel likelihood-free approximate inference method based on SW, which is theoretically grounded and scales well with the data size and dimension. Given that SW is commonly estimated with a simple Monte Carlo algorithm, we then propose two approaches to alleviate the inefficiencies caused by the induced approximation error. Finally, we define the general class of sliced probability divergences and investigate their topological and statistical properties to demonstrate the benefits of the slicing operator.
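To make the Monte Carlo estimation mentioned above concrete, below is a minimal sketch (not the speaker's implementation; function and parameter names are illustrative) of the standard estimator of SW between two empirical distributions: sample random directions on the unit sphere, project both samples, and average the one-dimensional Wasserstein distances of the projections.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, seed=None):
    """Monte Carlo estimate of the Sliced-Wasserstein distance of order p
    between two empirical distributions with samples X and Y of equal size
    (so the 1D Wasserstein distance reduces to comparing sorted projections)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Draw random directions uniformly on the unit sphere S^{d-1}.
    theta = rng.standard_normal((n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto each direction (one-dimensional pushforwards).
    X_proj = X @ theta.T   # shape (n, n_projections)
    Y_proj = Y @ theta.T
    # For equal-size samples, the 1D Wasserstein-p distance is the L^p distance
    # between sorted projections (i.e. between empirical quantile functions).
    X_sorted = np.sort(X_proj, axis=0)
    Y_sorted = np.sort(Y_proj, axis=0)
    sw_p = np.mean(np.abs(X_sorted - Y_sorted) ** p)  # average over samples and directions
    return sw_p ** (1.0 / p)

# Example: two Gaussian samples in dimension 10, shifted by 1 in every coordinate.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
Y = rng.standard_normal((500, 10)) + 1.0
print(sliced_wasserstein(X, Y, n_projections=200, seed=0))
```

The approximation error induced by using a finite number of random projections is precisely the inefficiency that the two approaches mentioned in the abstract aim to alleviate.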
Kimia Nadjahi
Date and time
-
Research school (part 2): Statistical and geometric divergences for machine learning