Seminar of Computer Learning - Peter Richtárik (22.10.2021)
Friday 22.10.2021 at 15:30, Lecture room C (online too)
By: Tomáš Vinař
Peter Richtárik (KAUST, Computer and Mathematical Science and Engineering Division):
EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback
Lecture room C, FMFI UK (mode - vaccinated ) or on-line MS Teams
Error feedback (EF), also known as error compensation, is an immensely popular convergence stabilization mechanism in the context of distributed training of supervised machine learning models enhanced by the use of contractive communication compression mechanisms, such as Top-k. First proposed by Seide et al (2014) as a heuristic, EF resisted any theoretical understanding until recently [Stich et al., 2018, Alistarh et al., 2018]. However, all existing analyses either i) apply to the single node setting only, ii) rely on very strong and often unreasonable assumptions, such global boundedness of the gradients, or iterate-dependent assumptions that cannot be checked a-priori and may not hold in practice, or iii) circumvent these issues via the introduction of additional unbiased compressors, which increase the communication cost. In this work we fix all these deficiencies by proposing and analyzing a new EF mechanism, which we call EF21, which consistently and substantially outperforms EF in practice. Our theoretical analysis relies on standard assumptions only, works in the distributed heterogeneous data setting, and leads to better and more meaningful rates. In particular, we prove that EF21 enjoys a fast O(1/T) convergence rate for smooth nonconvex problems, beating the previous bound of O(1/T2/3), which was shown using a (limiting) bounded gradients assumption. We further improve this to a fast linear rate for PL functions, which is the first linear convergence result for an EF-type method not relying on unbiased compressors. Since EF has a large number of applications where it reigns supreme, we believe that our 2021 variant, EF21, can have a large impact on the practice of communication efficient distributed learning.
Peter Richtarik is a professor of Computer Science at the King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia, where he leads the Optimization and Machine Learning Lab. Prior to joining KAUST, he was an Associate Professor of Mathematics at the University of Edinburgh, and held postdoctoral and visiting positions at Université Catholique de Louvain, Belgium, and University of California, Berkeley, USA, respectively. He received his PhD in 2007 from Cornell University, USA, and before this, an MS degree in Mathematics in 2001 from Comenius Unviersity, Slovakia. Prof Richtarik’s research interests lie at the intersection of mathematics, computer science, machine learning, optimization, numerical linear algebra, and high-performance computing. Prof Richtárik’s works attracted international awards, including a Best Paper Award at the NeurIPS 2020 Workshop on Scalability, Privacy, and Security in Federated Learning (joint with S. Horvath), Distinguished Speaker Award at the 2019 International Conference on Continuous Optimization, SIAM SIGEST Best Paper Award (joint with O. Fercoq), and the IMA Leslie Fox Prize (second prize, three times, awarded to two of his students and a postdoc).