Reinforcement learning (RL) has been in the limelight with many recent breakthroughs in artificial intelligence, yet the theoretical understanding of many classical RL algorithms and their modern deep RL counterparts remains quite limited. Recently, fundamental concepts from optimization and dynamical systems theory have provided fresh perspectives on understanding their convergence behavior, as well as on developing provably efficient RL algorithms.
This short course will introduce some recent theoretical and algorithmic developments in RL from the optimization and dynamical systems perspectives. Tentative topics include:
- Overview of RL basics (dynamic programming, TD-learning, Q-learning, policy gradient, etc.); see the brief sketch after this list
- Unified ODE analysis of RL algorithms (TD-learning and Q-learning variants)
- Stochastic optimization frameworks for RL
- Primal-dual optimization frameworks for RL
- Optimization for RL from a single agent to cooperative agents
- Extension to the theory of deep RL
- Open problems
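To make the basic updates in the first topic concrete, here is a minimal, illustrative sketch (not part of the course materials) of tabular TD(0) policy evaluation and Q-learning. The toy chain MDP, step size, discount factor, and exploration rate below are arbitrary choices made purely for illustration.

```python
# Minimal sketch of tabular TD(0) and Q-learning on a toy chain MDP.
# All problem data and hyperparameters are illustrative assumptions.
import numpy as np

n_states, n_actions = 5, 2
gamma, alpha = 0.9, 0.1          # discount factor and step size (illustrative)
rng = np.random.default_rng(0)

def step(s, a):
    """Toy chain MDP: action 0 moves left, action 1 moves right; reward 1 at the right end."""
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

# TD(0): evaluate a fixed (uniformly random) policy
V = np.zeros(n_states)
s = 0
for _ in range(5000):
    a = rng.integers(n_actions)
    s_next, r = step(s, a)
    V[s] += alpha * (r + gamma * V[s_next] - V[s])          # TD(0) update
    s = s_next

# Q-learning: off-policy control with an epsilon-greedy behavior policy
Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(5000):
    a = rng.integers(n_actions) if rng.random() < 0.1 else int(Q[s].argmax())
    s_next, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])  # Q-learning update
    s = s_next

print(V.round(2))
print(Q.round(2))
```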
Papers for the workshops (click to access):
- Junchi Yang, Siqi Zhang, Negar Kiyavash, Niao He. "A catalyst framework for minimax optimization." Advances in Neural Information Processing Systems, 2020. https://proceedings.neurips.cc/paper/2020/file/3db54f5573cd617a0112d35dd1e6b1ef-Paper.pdf
- Donghwan Lee and Niao He. "Target-Based Temporal Difference Learning." International Conference on Machine Learning, PMLR 97:3713-3722, 2019. http://proceedings.mlr.press/v97/lee19a/lee19a.pdf
- Yifan Hu, Xin Chen, and Niao He. "Sample complexity of sample average approximation for conditional stochastic optimization." SIAM Journal on Optimization 30, no. 3: 2103-2133, 2020. https://epubs.siam.org/doi/pdf/10.1137/19M1284865
- Semih Cayci, Niao He, R. Srikant. "Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation." arXiv preprint arXiv:2106.04096, 2021. https://arxiv.org/pdf/2106.04096.pdf
In this course we present the latest achievements in the theory of high-order schemes. We start from the global complexity estimates for second-order methods based on the cubic regularization of the quadratic model of the objective function. We consider several complexity bounds for different classes of nonconvex functions. The next topic is accelerated second-order methods for convex functions and the corresponding lower complexity bounds. After that, we pass to universal second-order schemes. We finish the course with implementable tensor methods.
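As a rough illustration of the cubic regularization idea from the first lectures (and from the Nesterov-Polyak paper listed below), here is a minimal sketch of one cubic-regularized Newton step: the next iterate minimizes the quadratic model of the objective augmented with a cubic penalty. The test function, the regularization parameter M, and the use of a generic numerical solver for the subproblem are illustrative assumptions, not the methods covered in the course.

```python
# Minimal sketch of a cubic-regularized Newton step:
#   x_next = x + argmin_s  <g, s> + 0.5 s^T H s + (M/6) ||s||^3,
# where g and H are the gradient and Hessian at x.
# The test function and M are arbitrary illustrative choices.
import numpy as np
from scipy.optimize import minimize

def f(x):
    # simple smooth nonconvex test function (illustrative)
    return x[0] ** 4 + x[0] * x[1] + (1.0 + x[1]) ** 2

def grad(x):
    return np.array([4 * x[0] ** 3 + x[1], x[0] + 2 * (1.0 + x[1])])

def hess(x):
    return np.array([[12 * x[0] ** 2, 1.0], [1.0, 2.0]])

def cubic_newton_step(x, M):
    """Solve the cubic-regularized model numerically and return the next iterate."""
    g, H = grad(x), hess(x)
    model = lambda s: g @ s + 0.5 * s @ H @ s + (M / 6.0) * np.linalg.norm(s) ** 3
    s = minimize(model, np.zeros_like(x)).x   # generic solver for the subproblem
    return x + s

x = np.array([1.0, 1.0])
for _ in range(10):
    x = cubic_newton_step(x, M=10.0)
print(x, f(x))
```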
Papers for the workshops (click to access):
- Yu. Nesterov, B. Polyak. Cubic regularization of Newton's method and its global performance. Mathematical Programming, 108(1), 177-205 (2006). https://link.springer.com/article/10.1007/s10107-006-0706-8
- Yu. Nesterov. Accelerating the cubic regularization of Newton's method on convex problems. Mathematical Programming, 112(1), 159-181 (2008). https://link.springer.com/article/10.1007/s10107-006-0089-x
- H. Lu, R. Freund, and Yu. Nesterov. Relatively smooth convex optimization by first-order methods. SIAM Journal on Optimization, 28(1), 333-354 (2018). https://epubs.siam.org/doi/abs/10.1137/16M1099546
- Yu. Nesterov. Implementable tensor methods in unconstrained convex optimization. Mathematical Programming, 186, 157-183 (2021). DOI 10.1007/s10107-019-01449-1. https://link.springer.com/article/10.1007/s10107-019-01449-1
- On Sunday, a welcome cocktail is organized at 18:30 in the main restaurant of the Hotel Europe.
- Two 90-minute lectures are organized every morning.
- On Monday and Tuesday, workshops are organized to work on papers of the two lecturers. See details below.
- On Wednesday and Thursday, the groups present the papers prepared during the workshops.
- On Friday, we adjourn the winter school at 12:00.
- On Thursday evening, the workshop dinner is organized at 19:30.
Each presentation has a slot of 40 minutes, distributed as follows:
- 25 minutes of presentation,
- 15 minutes of questions and discussion.
- Wednesday, 17:00-17:40
- Group Y1:
- Jusup Matej (ETH Zurich),
- Qorbanian Roozbeh (University of Luxembourg),
- Li Mengmeng (EPFL, RAO),
- Kang Zhao (TU Eindhoven),
- Schnidrig Jonas (EPFL, IPESE).
- Yu. Nesterov, B. Polyak. Cubic regularization of Newton's method and its global performance. Mathematical Programming, 108(1), 177-205 (2006). https://link.springer.com/article/10.1007/s10107-006-0706-8
- Wednesday, 17:40-18:20
- Group N1:
- Kamoutsi Angeliki (ETH Zurich),
- Abdolhamidi Dorsa (EPFL, OES),
- Krawczuk Igor (EPFL, LIONS),
- Gajda Mikele (University of Lausanne),
- Behmandpoor Pourya (KU Leuven).
- Junchi Yang, Siqi Zhang, Negar Kiyavash, Niao He. "A catalyst framework for minimax optimization." Advances in Neural Information Processing Systems, 2020. https://proceedings.neurips.cc/paper/2020/file/3db54f5573cd617a0112d35dd1e6b1ef-Paper.pdf
- Thursday, 17:00-17:40
- Group Y2:
- Rychener Yves (EPFL, RAO),
- Tschernutter Daniel (ETH Zurich),
- Han Jun (EPFL, OES),
- Gender Alexander (ETH Zurich).
- Yu. Nesterov. Accelerating the cubic regularization of Newton's method on convex problems. Mathematical Programming, 112(1) 159-181 (2008) https://link.springer.com/article/10.1007/s10107-006-0089-x
- Thursday, 17:40-18:20
- Group N2:
- Taskesen Bahar (EPFL, RAO),
- Schneider Philipp (EPFL, OES),
- Demiray Onur (EPFL, TOM),
- Lepour Dorsan (EPFL, IPESE).
- Donghwan Lee and Niao He. "Target-Based Temporal Difference Learning." International Conference on Machine Learning, PMLR 97:3713-3722, 2019. http://proceedings.mlr.press/v97/lee19a/lee19a.pdf
Time | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
08:30-10:00 | Yurii | Niao | Yurii | Niao | Yurii |
10:00-10:30 | Coffee break | | | | |
10:30-12:00 | Niao | Yurii | Niao | Yurii | Niao |
12:00-17:00 | Sport and discussions | | | | |
17:00-19:00 | Workshops | Workshops | Student presentations | Student presentations | |
19:30 | Dinner | | | | |