Most algorithms for convex optimization use the same basic idea: find a path to the global minimum using gradient information. The reliability of this path is tied to the inner product used to define the gradient. The popular primal-dual algorithm of Chambolle and Pock (PDHG) uses the L^2 inner product. For problems involving differential operators, the L^2 gradient is unstable, and PDHG consequently requires step sizes that vanish as the grid is refined.
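For concreteness, the standard PDHG iteration for a problem of the form min_x g(x) + f(Kx) (notation introduced here only for illustration) reads

    y^{k+1} = prox_{\sigma f^*}(y^k + \sigma K \bar{x}^k),
    x^{k+1} = prox_{\tau g}(x^k - \tau K^* y^{k+1}),
    \bar{x}^{k+1} = 2 x^{k+1} - x^k,

and converges when the step sizes satisfy \tau \sigma \|K\|^2 < 1. When K is a discretized differential operator, \|K\| grows like 1/h, so the admissible step sizes shrink to zero as the grid spacing h decreases.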
In this talk, we will show that by choosing an appropriate inner product, the convergence rate of PDHG can be considerably improved. For well-behaved problems this is well known. Our focus is on two unruly problems: total variation denoising and the Wasserstein distance. In these problems, the objective functional fails to be coercive in the norm induced by the appropriate inner product. As a result, the minimizers of the problem may be "at infinity". Despite this pathology, we show that the algorithm still converges to the global minimizer and that the convergence rate of the discrete problem is independent of the grid size, and we give a principled strategy for choosing optimal PDHG step size parameters.
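As a point of reference, the following is a minimal sketch (not the method of the talk) of the standard L^2-based PDHG iteration applied to total variation (ROF) denoising. The function names, the weight lam, and the step sizes are illustrative choices; the step sizes are constrained by the L^2 condition tau * sigma * ||grad||^2 < 1 discussed above.

```python
import numpy as np

def grad(u):
    """Forward-difference gradient with Neumann boundary (grid spacing h = 1)."""
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """Negative adjoint of grad, so that <grad u, p> = -<u, div p>."""
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def pdhg_tv_denoise(f, lam=0.1, n_iter=200):
    """Standard PDHG for min_u lam * TV(u) + 0.5 * ||u - f||^2."""
    # ||grad||^2 < 8 on a unit grid, so tau * sigma = 1/8 satisfies the step size condition.
    tau = sigma = 1.0 / np.sqrt(8.0)
    u = f.copy(); u_bar = f.copy()
    px = np.zeros_like(f); py = np.zeros_like(f)
    for _ in range(n_iter):
        # Dual ascent, then projection onto {|p| <= lam} (prox of the TV term's conjugate).
        gx, gy = grad(u_bar)
        px += sigma * gx; py += sigma * gy
        scale = np.maximum(1.0, np.sqrt(px**2 + py**2) / lam)
        px /= scale; py /= scale
        # Primal descent: closed-form prox of the quadratic fidelity term.
        u_old = u
        u = (u + tau * div(px, py) + tau * f) / (1.0 + tau)
        # Extrapolation step.
        u_bar = 2.0 * u - u_old
    return u

# Usage on a synthetic noisy image (hypothetical data, for illustration only).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.zeros((64, 64)); clean[16:48, 16:48] = 1.0
    noisy = clean + 0.1 * rng.standard_normal(clean.shape)
    denoised = pdhg_tv_denoise(noisy, lam=0.15)
```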
This talk is joint work with Flavien Leger, Wuchen Li and Stan Osher.