Ultra-scale Machine Learning Algorithms with Theoretical Foundations

Talk
Heng Huang
Time: 05.03.2022 11:00 to 12:00

Machine learning and artificial intelligence are gaining fresh momentum and have enhanced not only many industrial and professional processes but also our everyday lives. The recent success of machine learning relies heavily on the surge of big data, big models, and big computing. However, inefficient algorithms often restrict the application of machine learning to very large-scale tasks. In terms of big data, serious concerns such as communication overhead and convergence speed must be rigorously addressed when we train learning models on large amounts of data located on multiple computers or devices. In terms of big models, training a model that is too big to fit on a single computer or device remains an underexplored research area. To address these challenging problems, we focused on designing new ultra-scale machine learning algorithms, efficiently optimizing and training models for big data problems, and pursuing new discoveries in both theory and applications. For the challenges raised by big data, we proposed multiple new asynchronous distributed stochastic gradient descent, coordinate descent, and zeroth-order methods with variance-reduction acceleration for efficiently solving convex and non-convex problems with faster convergence rates. We also designed a new momentum-fusion-based algorithm, with theoretical analysis, for communication-efficient federated learning. For the challenges raised by big models, we scaled up deep learning models by parallelizing the layer-wise computations with a theoretical guarantee; this is the first algorithm with a convergence guarantee to break the locking of the backpropagation mechanism, so that large-scale deep learning models can be dramatically accelerated.
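
To give a flavor of the variance-reduction ingredient mentioned above, the sketch below shows a plain single-machine, synchronous SVRG-style stochastic gradient method on a least-squares problem. It only illustrates the general variance-reduction idea, not the speaker's asynchronous distributed algorithms; the function name, step size, and synthetic problem are illustrative assumptions.

```python
import numpy as np

def svrg_least_squares(A, b, lr=0.01, epochs=30, seed=0):
    """Minimize (1/2n) * ||A w - b||^2 with SVRG-style variance reduction."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Snapshot the iterate and compute its full gradient once per epoch.
        w_snap = w.copy()
        full_grad = A.T @ (A @ w_snap - b) / n
        for _ in range(n):
            i = rng.integers(n)
            # Per-sample gradients at the current point and at the snapshot.
            g_cur = A[i] * (A[i] @ w - b[i])
            g_snap = A[i] * (A[i] @ w_snap - b[i])
            # Variance-reduced step: unbiased, and its variance shrinks as
            # both w and w_snap approach the solution.
            w -= lr * (g_cur - g_snap + full_grad)
    return w

# Tiny usage example on a synthetic, noiseless least-squares problem.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
w_hat = svrg_least_squares(A, A @ w_true)
print(np.linalg.norm(w_hat - w_true))  # should be close to zero
```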
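In the same spirit, the next sketch shows how momentum can be combined with infrequent communication in federated training: each client runs a few local SGD steps, and the server applies momentum to the averaged client update. This is a generic FedAvg-with-server-momentum illustration under assumed client counts and step sizes, not the momentum-fusion algorithm described in the talk.

```python
import numpy as np

def client_update(w, A, b, lr=0.05, local_steps=10, rng=None):
    """One client runs a few local SGD steps on its own least-squares shard."""
    rng = rng if rng is not None else np.random.default_rng(0)
    w = w.copy()
    for _ in range(local_steps):
        i = rng.integers(A.shape[0])
        w -= lr * A[i] * (A[i] @ w - b[i])
    return w

def federated_train(clients, d, rounds=100, server_lr=1.0, beta=0.9):
    """FedAvg-style rounds with server-side momentum on the averaged update."""
    w = np.zeros(d)
    momentum = np.zeros(d)
    rng = np.random.default_rng(1)
    for _ in range(rounds):
        # One round costs a single broadcast of w and one upload per client.
        updates = [client_update(w, A, b, rng=rng) - w for A, b in clients]
        momentum = beta * momentum + np.mean(updates, axis=0)
        w += server_lr * momentum
    return w

# Usage: eight clients hold disjoint shards of one noiseless regression task.
rng = np.random.default_rng(2)
w_true = rng.standard_normal(5)
clients = []
for _ in range(8):
    A = rng.standard_normal((50, 5))
    clients.append((A, A @ w_true))
print(np.linalg.norm(federated_train(clients, d=5) - w_true))  # small residual
```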
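Finally, the toy simulation below conveys, in spirit, what breaking the locking of backpropagation means: a two-module linear network in which the earlier module updates with a one-step-stale error signal instead of waiting for the later module's backward pass, so that in a real system the two modules could run concurrently. It is a serial simulation of the scheduling idea with made-up layer sizes and learning rate, not the speaker's algorithm or its convergence analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 8, 6
w_true = rng.standard_normal(d_in)          # target linear map, one output

# Two "modules": module 1 computes h = W1 x, module 2 computes y_hat = w2 . h.
W1 = 0.1 * rng.standard_normal((d_hid, d_in))
w2 = 0.1 * rng.standard_normal(d_hid)
lr = 0.02
stale = None                                # (grad wrt h, input x) from step t-1

for step in range(3000):
    x = rng.standard_normal(d_in)
    y = w_true @ x

    h = W1 @ x                              # module 1 forward for step t
    if stale is not None:
        # Module 1 updates with the error signal from step t-1 instead of
        # waiting for module 2's backward pass on the current sample.
        grad_h_prev, x_prev = stale
        W1 -= lr * np.outer(grad_h_prev, x_prev)

    y_hat = w2 @ h                          # module 2 forward
    err = y_hat - y                         # d/dy_hat of 0.5 * (y_hat - y)^2
    stale = (err * w2, x)                   # signal module 1 will use at t+1
    w2 -= lr * err * h                      # module 2 updates right away

print(abs(err))  # the per-sample error shrinks toward zero during training
```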