CAM Colloquium: Suriya Gunasekar (Toyota Technological Institute) - Implicit bias of optimization in learning

Location

Frank H. T. Rhodes Hall 655

Description

Abstract: Large-scale neural networks used in practice are highly overparameterized, with far more trainable parameters than training examples. Consequently, the optimization objectives for learning such high-capacity models have many global minima that fit the training data perfectly. In such problems, minimizing the training loss with a specific optimization algorithm returns a particular global minimum, and hence implicitly biases the properties of the learned model, including its generalization performance on test data. Understanding the implicit bias of different algorithms is therefore essential for understanding how and what overparameterized models learn. In this talk, we will survey recent results on the implicit bias of common optimization algorithms such as gradient descent, generalized mirror descent, and generic steepest descent. In many underdetermined learning problems, including linear regression, logistic regression, and deep linear networks, the specific global minimum reached by an optimization algorithm can be succinctly characterized by the geometry of the algorithm and the parameterization of the model, independently of hyperparameter choices such as step size and momentum.
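
A concrete instance of such a characterization (a standard illustrative result in this area, sketched here with assumed notation $X$, $y$, $w$, $\psi$; it is not quoted from the abstract): for underdetermined linear regression with data $X \in \mathbb{R}^{n \times d}$, $y \in \mathbb{R}^n$, and $n < d$, gradient descent on the squared loss $L(w) = \tfrac{1}{2}\|Xw - y\|_2^2$ initialized at $w_0 = 0$ keeps its iterates in the row space of $X$ and, whenever it converges to a global minimum, selects the minimum Euclidean-norm interpolating solution:

\[
  w_{\mathrm{GD}} \;=\; \arg\min_{w \in \mathbb{R}^d} \|w\|_2
  \quad \text{subject to} \quad Xw = y .
\]

More generally, mirror descent with a strictly convex potential $\psi$, initialized at the minimizer of $\psi$, converges to

\[
  w_{\mathrm{MD}} \;=\; \arg\min_{w \in \mathbb{R}^d} \psi(w)
  \quad \text{subject to} \quad Xw = y ,
\]

so the geometry of the algorithm, here encoded by $\psi$, determines which of the many global minima is learned, independently of the step size (provided the iterates converge).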

Bio: Suriya Gunasekar is a research assistant professor at the Toyota Technological Institute at Chicago. Her research focuses on optimization and machine learning. Prior to joining TTIC, she completed her Ph.D. at the University of Texas at Austin, advised by Prof. Joydeep Ghosh.