CTML OTMLE
Reading Group

Wednesdays 2-4PM PT, BWW TBD

At UC Berkeley’s Center for Targeted Machine Learning and Causal Inference (CTML), our OTMLE (Optimal Transport and Targeted Maximum Likelihood Estimation) reading group explores the intersection of optimal transport theory and TMLE, offering a fresh perspective on how TMLE fluctuations of probability measures can be understood. The group covers key topics, including the history of optimal transport, Wasserstein metrics, geodesics, gradient flows, statistical estimation, and information geometry. Each session focuses on one of these themes, giving participants a foundation for bridging optimal transport with the statistical estimation techniques used in TMLE.

We invite all enthusiasts, researchers, and practitioners—regardless of affiliation with the CTML—to join our reading group sessions. Your interest and contributions are highly valued, as we believe that a diverse community fosters richer discussions and deeper understanding. Whether you’re new to the field or have extensive experience, we welcome you to be part of our collaborative exploration of optimal transport and TMLE.

To stay informed about our reading group sessions and the latest developments at the CTML, we invite you to subscribe to both our reading group’s mailing list and the CTML newsletter. Joining these mailing lists ensures you receive timely updates on meeting schedules, discussion topics, and upcoming events.

Optimal Transport

References

Our weekly reading materials will be drawn from the following list, though it is not exhaustive. We have hand-picked these resources both to offer a comprehensive introduction to optimal transport theory and to emphasize the aspects that are potentially useful in relation to TMLE.

  • Agueh, M., & Carlier, G. (2011). Barycenters in the Wasserstein space. SIAM Journal on Mathematical Analysis, 43(2), 904-924.
  • Agueh, M., & Carlier, G. (2017). Vers un théorème de la limite centrale dans l’espace de Wasserstein? Comptes Rendus. Mathématique, 355(7), 812-818.
  • Ambrosio, L., Gigli, N., & Savaré, G. (2008). Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media.
  • Benamou, J. D., & Brenier, Y. (2000). A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numerische Mathematik, 84(3), 375-393.
  • Chernozhukov, V., Galichon, A., Hallin, M., & Henry, M. (2017). Monge–Kantorovich depth, quantiles, ranks and signs.
  • Figalli, A., & Glaudo, F. (2021). An invitation to optimal transport, Wasserstein distances, and gradient flows.
  • Gibbs, A. L., & Su, F. E. (2002). On choosing and bounding probability metrics. International Statistical Review, 70(3), 419-435.
  • Jordan, R., Kinderlehrer, D., & Otto, F. (1998). The variational formulation of the Fokker–Planck equation. SIAM Journal on Mathematical Analysis, 29(1), 1-17.
  • Panaretos, V. M., & Zemel, Y. (2019). Statistical aspects of Wasserstein distances. Annual Review of Statistics and Its Application, 6(1), 405-431.
  • Panaretos, V. M., & Zemel, Y. (2020). An invitation to statistics in Wasserstein space. Springer Nature.
  • Peyré, G., & Cuturi, M. (2019). Computational optimal transport: With applications to data science. Foundations and Trends® in Machine Learning, 11(5-6), 355-607.
  • Santambrogio, F. (2015). Optimal transport for applied mathematicians. Birkhäuser.
  • Tsybakov, A. B. (2009). Lower bounds on the minimax risk. Introduction to Nonparametric Estimation, 77-135.
  • Villani, C. (2009). Optimal transport: old and new (Vol. 338). Berlin: Springer.
  • Villani, C. (2021). Topics in optimal transportation (Vol. 58). American Mathematical Society.
  • Wainwright, M. J. (2019). High-dimensional statistics: A non-asymptotic viewpoint (Vol. 48). Cambridge University Press.

Fall 2024: Foundations of Optimal Transport

The Fall 2024 semester introduces the foundational concepts of optimal transport through its three primary formulations: Monge, Kantorovich, and Benamou-Brenier. Each is explored alongside its characterization of the Wasserstein distance and the optimal transport plans that emerge from it. Together they show how optimal transport establishes metrics over probability spaces and how these metrics relate to statistical estimation and hypothesis testing.
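
For orientation, the three formulations can be written side by side. This is only a sketch in standard notation, which this page does not itself fix: μ and ν are assumed to be the source and target measures on R^d, T_#μ the pushforward of μ by T, Π(μ, ν) the set of couplings with marginals μ and ν, and the cost is assumed to be quadratic.

```latex
% Monge: optimize over transport maps T that push \mu forward to \nu
\inf_{T:\, T_{\#}\mu = \nu} \int |x - T(x)|^2 \, d\mu(x)

% Kantorovich: relax maps to couplings \pi with marginals \mu and \nu
W_2^2(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int |x - y|^2 \, d\pi(x,y)

% Benamou-Brenier: minimize kinetic energy over paths (\rho_t, v_t)
W_2^2(\mu,\nu) = \inf_{(\rho_t, v_t)} \int_0^1 \!\! \int |v_t(x)|^2 \, d\rho_t(x) \, dt
\quad \text{s.t.} \quad
\partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0, \quad \rho_0 = \mu, \quad \rho_1 = \nu
```

When μ is absolutely continuous and both measures have finite second moments, the Monge infimum agrees with the Kantorovich value, so all three expressions describe the same squared distance.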

A key focus is understanding TMLE as a dynamic path in probability space, where optimal transport provides a spatial and geometric viewpoint. Participants examine how properties of optimal transport plans—such as monotonicity, duality, geodesics, and gradient flows—inform the theoretical underpinnings of TMLE. This semester emphasizes building a strong foundation and connecting the “moving mass” perspective of optimal transport to the iterative updates in TMLE.

[Introduction]

Date: September 25th, 2024

Presenter: Kaiwen Hou

Optional Reading: Villani (2021) Sections 0.1-0.3, 2.1-2.3.1

  • Purpose of the reading group and its role in advancing targeted learning
  • Logistics: meeting times, room assignments, and reading materials for the semester
  • Basic concepts: source measure, target measure, transport map, and pushforward
  • Monge’s formulation, existence, and uniqueness
  • Kantorovich’s relaxation and transport plan
  • Property of the optimal transport map: monotonicity
  • Optimal transport map implied by TMLE

Unresolved Questions:

  • Compactness of the coupling space
  • Weierstrass theorem: existence in Kantorovich’s formulation
  • Existence of suboptimal transport plan in proving monotonicity

[Geometry of Optimal Transport]

Date: October 2nd, 2024

Presenter: Kaiwen Hou

Reading: Villani (2021) Sections 2.2-2.3.2, 1.1.1-1.1.5; Santambrogio (2015) Box 1.1, Theorem 1.4

Optional Reading: Villani (2021) Sections 4.1, 1.1.6-1.2, 2.1.1-2.1.3; Santambrogio (2015) Section 1.2

  • Construction of optimal transport map
  • Cyclical monotonicity and Rockafellar’s theorem
  • Monge–Ampère equation
  • Existence of optimal transport plan in Kantorovich’s formulation

[Wasserstein Distances]

Date: October 9th, 2024

Presenter: Qiuran Lyu

Reading: Villani (2021) Sections 7.1, 7.4, Exercise 7.11

Optional Reading: Villani (2021) Sections 7.2-7.3; Engquist, Froese & Yang (2016) Theorem 5

  • Wasserstein metric: nonnegativity and symmetry
  • Gluing lemma to prove the triangle inequality
  • Proof of gluing lemma
  • Ordering and interpolation inequalities
  • Topological properties: robustness to oscillations
  • Convexity properties and behavior under rescaled convolution
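
Several of the properties above can be checked numerically in one dimension, where the sorted (monotone) coupling is optimal for p ≥ 1. The following is a minimal sketch, assuming two empirical measures with the same number of equally weighted atoms; the function name `wasserstein_1d` and the sample distributions are illustrative choices, not part of the readings.

```python
import numpy as np

def wasserstein_1d(x, y, p=2):
    """W_p between two empirical measures with equally many, equally weighted atoms.

    On the real line the monotone coupling is optimal for p >= 1,
    so the distance reduces to comparing order statistics.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))

rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(loc=1.0, scale=2.0, size=500)
c = rng.exponential(size=500)

shift = wasserstein_1d(a, a + 3.0)   # translating a sample by 3 moves it a distance of 3
d_ab = wasserstein_1d(a, b)
d_ba = wasserstein_1d(b, a)          # symmetry
d_bc = wasserstein_1d(b, c)
d_ac = wasserstein_1d(a, c)          # triangle inequality: d_ac <= d_ab + d_bc
w1_ab = wasserstein_1d(a, b, p=1)    # ordering: W_1 <= W_2
```

For unequal sample sizes or weighted atoms the same idea applies through quantile functions, but that case is left out here to keep the sketch short.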

[Statistical Inference Based on Wasserstein Distances]

Date: October 16th, 2024

Presenter: Wenxin Zhang

Reading: Villani (2021) Sections 2.1.5, 5.1.3; Agueh & Carlier (2011) Sections 1-3, 6; Panaretos & Zemel (2019) Sections 2.1, 3.1

Optional Reading: Villani (2021) Sections 5.2.1-5.2.2; Santambrogio (2015) Lemma 5.29, Proposition 5.32; Agueh & Carlier (2017); Panaretos & Zemel (2020); Wasserstein Barycenters

  • Properties of Wasserstein distances under shifts, scaling, and product measures
  • Subadditivity of Wasserstein distances w.r.t. convolutions
  • Wasserstein test statistics for empirical measures and/or two samples
  • Asymptotic distributions of Wasserstein test statistics under univariate measures
  • Wasserstein Fréchet mean of univariate location family: sufficient condition
  • Wasserstein Fréchet mean of two measures: displacement interpolation
  • Wasserstein Fréchet mean of Gaussian distributions is Gaussian
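
The last two topics have a particularly simple form for univariate Gaussians, where W_2 depends only on the means and standard deviations. The sketch below uses the closed form W_2^2 = (m1 - m2)^2 + (s1 - s2)^2 and restricts the search for the Fréchet mean to Gaussians, relying on the result listed above that the barycenter is itself Gaussian. All function names and the example parameters are hypothetical.

```python
import numpy as np

def w2_gaussian_1d(m1, s1, m2, s2):
    """W_2 between N(m1, s1^2) and N(m2, s2^2) on the real line."""
    return float(np.hypot(m1 - m2, s1 - s2))

def barycenter_gaussian_1d(means, sds, weights):
    """Weighted W_2 Frechet mean of univariate Gaussians.

    In this special case it is obtained by averaging the means
    and averaging the standard deviations.
    """
    w = np.asarray(weights, dtype=float)
    m = float(w @ np.asarray(means, dtype=float))
    s = float(w @ np.asarray(sds, dtype=float))
    return m, s

def frechet_functional(m, s, means, sds, weights):
    """Weighted sum of squared W_2 distances from N(m, s^2) to each input."""
    return float(sum(w * w2_gaussian_1d(m, s, mi, si) ** 2
                     for mi, si, w in zip(means, sds, weights)))

means, sds = [0.0, 2.0, 5.0], [1.0, 0.5, 3.0]
weights = [1 / 3, 1 / 3, 1 / 3]
m_bar, s_bar = barycenter_gaussian_1d(means, sds, weights)
best = frechet_functional(m_bar, s_bar, means, sds, weights)

# With two measures and equal weights, the barycenter is the midpoint
# of the displacement interpolation between them.
m_mid, s_mid = barycenter_gaussian_1d(means[:2], sds[:2], [0.5, 0.5])
half = w2_gaussian_1d(m_mid, s_mid, means[0], sds[0])
full = w2_gaussian_1d(means[0], sds[0], means[1], sds[1])
```

The midpoint sits at exactly half the distance between the two endpoints, which is the geodesic property of displacement interpolation in this one-dimensional Gaussian setting.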

[Ten Metrics on Probability Measures]

Date: October 23rd, 2024

Presenter: Qiuran Lyu

Reading: Gibbs & Su (2002) Figure 1, Sections 2-3

Optional Reading: Peyré & Cuturi (2019) Sections 8.1-8.4; Tsybakov (2009) Section 2.4; Wainwright (2019) Chapter 15; Ten Metrics

  • Definitions
  • f-divergence
  • Metric inequalities and proof
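
A few of the standard inequalities between these quantities can be checked numerically on random discrete distributions. The sketch below assumes strictly positive probability vectors and an unnormalized Hellinger distance (no factor of 1/2 inside the square root); under other conventions the constants change. The function names are illustrative.

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance: half the L1 distance between the vectors."""
    return 0.5 * float(np.abs(p - q).sum())

def hellinger(p, q):
    """Unnormalized Hellinger distance (assumed convention, takes values in [0, sqrt(2)])."""
    return float(np.sqrt(((np.sqrt(p) - np.sqrt(q)) ** 2).sum()))

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); assumes strictly positive entries."""
    return float((p * np.log(p / q)).sum())

rng = np.random.default_rng(0)
tol = 1e-12
checks = []
for _ in range(1000):
    p = rng.dirichlet(np.ones(6))
    q = rng.dirichlet(np.ones(6))
    tv, h, kl = total_variation(p, q), hellinger(p, q), kl_divergence(p, q)
    checks.append(
        h ** 2 / 2 <= tv + tol           # Hellinger lower bound on TV
        and tv <= h + tol                # Hellinger upper bound on TV
        and tv <= np.sqrt(kl / 2) + tol  # Pinsker's inequality
        and h ** 2 <= kl + tol           # squared Hellinger bounded by KL
    )
all_hold = all(checks)
```

A numerical check of this kind is no substitute for the proofs covered in the session, but it is a quick way to catch a mismatched normalization constant.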

[Monge–Kantorovich Depth]

Date: October 30th, 2024

Presenter: Yilong Hou

Reading: Villani (2021) Proposition 2.4, Theorem 2.9; Chernozhukov et al. (2017) Paragraphs “Notation, conventions and preliminaries”, “MK depth is halfspace depth in dimension 1”, Sections 2.3, 3.2-3.3, A, B3-4

Optional Reading: Duality and Double Convexification (scribed by Qiuran Lyu); Depth

[Euler Equation and Geodesics]

Date: November 6th, 2024

Presenter: Mingxun Wang

Reading: Figalli & Glaudo (2021) Sections 1.3, 2.5.4

Optional Reading: Villani (2021) Theorem 3.8, Sections 3.1-3.3; Notes

  • Basics of Riemannian geometry: tangent space, gradient, arc length parameterization, Riemannian distance, and geodesic
  • Incompressible Euler equation
  • Arnold’s geodesic interpretation: measure-preserving orientation-preserving diffeomorphism
  • Brenier’s approximate geodesics: midpoint projection onto closure
  • Polar factorization theorem
  • Helmholtz decomposition of differentiable vector fields into irrotational and solenoidal vector fields

[Benamou-Brenier Formulation (1)]

Date: November 13th, 2024

Presenter: Yi Li

Reading: Villani (2021) Sections 8.1-8.2

Optional Reading: Villani (2021) Sections 5.1, 8.3

  • Continuity equation: velocity field and Lagrangian specification of flow field
  • Benamou-Brenier formulation of Wasserstein distance: kinetic energy and action functional
  • Otto’s calculus and interpretation

[Variational Formulation of Fokker-Planck Equation]

Date: November 20th, 2024

Presenter: Yi Li

Reading: Jordan, Kinderlehrer & Otto (1998) Sections 1-2, 4-5

Optional Reading: Villani (2021) Sections 8.4-8.5; Ambrosio, Gigli & Savaré (2008) Definition 3.1.1

  • Fokker-Planck equation: unique stationary solution as the steepest descending direction
  • Gradient flows, JKO scheme, and minimizing movements
  • L1-weak convergence of interpolated JKO process to the solution of Fokker-Planck equation
  • Connections between Wasserstein gradient flows and Benamou-Brenier formulation
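
The objects in this session can be summarized in a few formulas. The notation below is an assumption of this sketch rather than something fixed on this page: Ψ is a potential, β > 0 an inverse temperature, τ > 0 the time step, and Z a normalizing constant assumed finite.

```latex
% Fokker-Planck equation with potential \Psi and inverse temperature \beta
\partial_t \rho = \nabla \cdot (\rho \nabla \Psi) + \beta^{-1} \Delta \rho

% Free energy: potential energy plus negative entropy
F(\rho) = \int \Psi \, \rho \, dx + \beta^{-1} \int \rho \log \rho \, dx

% One JKO (minimizing movement) step of size \tau
\rho^{k+1} \in \arg\min_{\rho} \left\{ \frac{1}{2\tau} W_2^2(\rho, \rho^{k}) + F(\rho) \right\}

% Stationary (Gibbs) solution, which minimizes F
\rho_\infty(x) = Z^{-1} e^{-\beta \Psi(x)}
```

Each step trades off staying close to the previous iterate in W_2 against decreasing the free energy, which is what makes the scheme a discrete-time steepest descent in Wasserstein space.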

[Continuity Equation in the Sense of Distributions]

Date: November 27th, 2024

Presenter: Mingxun Wang

Reading: Ambrosio, Gigli & Savaré (2008) Section 8.1

Optional Reading: Ambrosio, Gigli & Savaré (2008) Sections 1.1, 10.0-10.1; Continuity Equation and Benamou-Brenier Formulation; Divergence Theorem; Gradient Flows; Brenier ODE

  • Divergence theorem
  • Bounded variation, rectifiable curve, geodesic, metric derivative, and arc-length reparameterization
  • Distribution: integration by parts, test function, local integrability
  • Weak derivative and Sobolev space
  • Continuity equation and weak solution

[Gradient Flows]

Date: December 4th, 2024

Presenter: Kaiwen Hou

Reading: Ambrosio, Gigli & Savaré (2008) Sections 8.3-8.4, 11.1

Optional Reading: Ambrosio, Gigli & Savaré (2008) Example 11.1.10, Definitions 5.1.11, 10.1.1, Theorem 8.3.1, Lemma 10.4.1

  • Quantum drift-diffusion equation as gradient flow of the Fisher information
  • Four approaches to Wasserstein gradient flows: variational approximation scheme, curves of maximal slope, pointwise differential formulation, and systems of evolution variational inequalities
  • Duality map: Fréchet differential of Lp norm, compatibility with norm, and compatibility with inner product
  • Tangent bundle and smooth cylindrical test functions
  • Gradient flow equation: Fréchet subdifferential in Wasserstein space and differential inclusion
  • Variational integral lemma: strong subdifferential is the gradient of first variation
  • Gradient flow example: evolutionary parabolic PDEs of diffusion type

Spring 2025: Geometry of Probability Space Optimization

In Spring 2025, participants further investigate the geometry of probability spaces and the implications for TMLE’s structure and behavior. Topics include deeper explorations of how optimal transport’s spatial and dynamic properties provide insights into likelihood-based optimization and its role in semiparametric models. Rather than diving into specific optimization techniques like natural gradient descent or Newton’s method, this semester focuses on laying the theoretical groundwork for understanding such methods in probability spaces. Participants refine their understanding of how probability space-based optimization differs fundamentally from traditional parameter space approaches. This exploration highlights the theoretical richness of TMLE’s operations in probability space and prepares participants to extend these ideas to advanced methods and practical implementations in their future work.

[Benamou-Brenier Formulation (2)]

Date: January 22nd, 2025

Presenter: Qiuran Lyu

Computational Reading: Peyré & Cuturi (2019) Sections 7.1, 7.6, Remark 2.30

Optional Reading: Benamou & Brenier (2000)

  • Convex formulation using momentum
  • Connections with displacement interpolation
  • Dynamic formulation over the paths space: displacement interpolation and entropic interpolation
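
The convex reformulation in the first topic comes from a change of variables. As a sketch in assumed notation (ρ_t for the density path, v_t for the velocity field, m_t = ρ_t v_t for the momentum, T for the optimal map from μ to ν):

```latex
% Substituting m_t = \rho_t v_t turns the kinetic energy \rho |v|^2 into |m|^2 / \rho
\min_{(\rho, m)} \int_0^1 \!\! \int \frac{|m_t(x)|^2}{\rho_t(x)} \, dx \, dt
\quad \text{s.t.} \quad
\partial_t \rho_t + \nabla \cdot m_t = 0, \quad \rho_0 = \mu, \quad \rho_1 = \nu

% Displacement interpolation: the minimizing path moves mass along straight lines
\rho_t = \big( (1 - t)\,\mathrm{Id} + t\,T \big)_{\#} \mu, \qquad t \in [0, 1]
```

The integrand (ρ, m) ↦ |m|²/ρ is jointly convex and the continuity constraint is linear in (ρ, m), which is why the problem becomes convex in these variables even though it is not convex in (ρ, v).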

[Otto Calculus]

Date: January 29th, 2025

Presenter:

Theoretical Reading: Villani (2009) Formulas 15.2, 15.7

  • Gradient formula in Wasserstein space
  • Hessian formula in Wasserstein space
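
As a formal illustration of the gradient formula, consider the entropy functional. The notation grad_W and δF/δρ (first variation) is an assumption of this sketch and is not claimed to match the cited formulas verbatim; the computation ignores regularity and boundary terms.

```latex
% Formal Wasserstein gradient of a functional F at a density \rho
\operatorname{grad}_{W} F(\rho) = - \nabla \cdot \left( \rho \, \nabla \frac{\delta F}{\delta \rho} \right)

% Example: F(\rho) = \int \rho \log \rho \, dx, for which \delta F / \delta \rho = \log \rho + 1
\operatorname{grad}_{W} F(\rho) = - \nabla \cdot ( \rho \, \nabla \log \rho ) = - \Delta \rho

% The gradient flow \partial_t \rho = - \operatorname{grad}_{W} F(\rho) is then the heat equation
\partial_t \rho = \Delta \rho
```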

[Tangent Bundle]

Date: February 5th, 2025

Presenter:

Theoretical Reading: Ambrosio, Gigli & Savaré (2008) Section 8.0, Equations 0.20-0.26, Definition 8.4.1, Lemma 8.4.2, Propositions 8.4.3-8.4.5

[Tangent Space, Cotangent Space, and Optimal Maps]

Date: February 12th, 2025

Presenter:

Theoretical Reading: Ambrosio, Gigli & Savaré (2008) Section 8.5, Remark 8.4.4

[Displacement Convexity (1)]

Date: February 19th, 2025

Presenter:

Theoretical Reading: Ambrosio, Gigli & Savaré (2008) Sections 9.0-9.2

[Subdifferential Calculus (1)]

Date: February 26th, 2025

Presenter:

Theoretical Reading: Ambrosio, Gigli & Savaré (2008) Sections 10.1-10.3

[Subdifferential Calculus (2)]

Date: March 5th, 2025

Presenter:

Theoretical Reading: Ambrosio, Gigli & Savaré (2008) Section 10.4

[Monge–Ampère Equation]

Date: March 12th, 2025

Presenter: Kaiwen Hou

Reading:

[Linearization of the Optimal Transport Problem]

Date: March 19th, 2025

Presenter:

Reading:

[Second Variation]

Date: April 2nd, 2025

Presenter:

Reading:

[Hessians and Convexity]

Date: April 9th, 2025

Presenter:

Reading:

[Examples of Functionals with Known Hessians]

Date:

Presenter:

Reading:

[Regularity Theory from Hessians]

Date:

Presenter:

Reading:

[Displacement Convexity (2)]

Date:

Presenter:

Reading:

[Spectral Analysis of Hessian Operators]

Date:

Presenter:

Reading:

[First-Order and Second-Order Theories of TMLE]

Date:

Presenter:

Reading:

Join us on Zoom if you can’t attend in person, and don’t forget to subscribe to this channel for access to the recordings.