Stable Nonconvex-Nonconcave Training via Linear Interpolation: Conclusion & limitations

Written by interpolation | Published 2024/03/07
Tech Story Tags: linear-interpolation | nonexpansive-operators | rapp | cohypomonotone-problems | lookahead-algorithms | rapp-and-lookahead | training-gans | nonmonotone-class

TL;DR: This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Thomas Pethick, EPFL (LIONS) [email protected];

(2) Wanyun Xie, EPFL (LIONS) [email protected];

(3) Volkan Cevher, EPFL (LIONS) [email protected].

Table of Links

9 Conclusion & limitations

We have precisely characterized the stabilizing effect of linear interpolation by analyzing it under cohypomonotonicity. We proved last-iterate convergence rates for our proposed method RAPP. The algorithm is double-looped, which introduces a log factor in the rate as mentioned in Remark E.4. It thus remains open whether a last-iterate guarantee is possible using only τ = 2 inner iterations (for which RAPP reduces to EG+ in the unconstrained case). By replacing the inner solver, we subsequently rediscovered and analyzed Lookahead through the lens of nonexpansive operators. In that regard, we have only dealt with compositions of operators. It would be interesting to further extend the idea to understanding and developing both Federated Averaging and the meta-learning algorithm Reptile (of which Lookahead can be seen as a single-client and single-task instance, respectively), which we leave for future work.
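To make the Lookahead-as-linear-interpolation view concrete, here is a minimal sketch (not the paper's RAPP pseudocode): the function name `lookahead_interpolation`, the inner-step callable `inner_step`, and the parameters `tau` and `alpha` are illustrative assumptions. The outer loop runs τ inner iterations of a base update and then linearly interpolates the "slow" iterate toward the resulting "fast" iterate, which is the averaging step the stabilization argument rests on.

```python
import numpy as np

def lookahead_interpolation(x0, inner_step, tau=5, alpha=0.5, outer_iters=300):
    """Sketch of a Lookahead-style outer loop viewed as linear interpolation.

    x0         : initial iterate (array-like)
    inner_step : callable mapping an iterate to the next inner iterate,
                 e.g. one step of a base optimizer or operator application
    tau        : number of inner iterations per outer step
    alpha      : interpolation weight pulling the slow iterate toward the
                 fast iterate
    """
    x = np.asarray(x0, dtype=float)      # "slow" iterate
    for _ in range(outer_iters):
        y = x.copy()                     # "fast" iterate starts at the slow one
        for _ in range(tau):             # tau inner iterations of the base update
            y = inner_step(y)
        # Linear interpolation between slow and fast iterates:
        # x <- (1 - alpha) * x + alpha * y
        x = x + alpha * (y - x)
    return x

# Illustrative usage on a toy bilinear game min_u max_v u*v with x = (u, v).
# Plain simultaneous gradient descent-ascent spirals away from the
# equilibrium (0, 0) on this game; the interpolation step is what pulls the
# iterates back toward it.
if __name__ == "__main__":
    eta = 0.1
    def gda_step(x):
        u, v = x
        return np.array([u - eta * v, v + eta * u])
    print(lookahead_interpolation(np.array([1.0, 1.0]), gda_step, tau=5, alpha=0.5))
```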

