This repository establishes the first comprehensive benchmark of existing learning-to-optimize (L2O) approaches across a range of problems and settings. We release our software implementation and data as the Open-L2O package, for reproducible research and fair benchmarking in the L2O field. [Paper]
L2O (learning to optimize) aims to replace manually designed analytic optimization algorithms (SGD, RMSProp, Adam, etc.) with learned update rules.
An L2O optimizer is a function fit from data: it gains experience from a set of training optimization tasks in a principled and automatic way.
L2O is particularly suitable for repeatedly solving a particular class of optimization problems over a specific distribution of data. Compared to classic methods, L2O has been shown to find higher-quality solutions and/or converge much faster on many problems.
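As a minimal sketch of the idea (not the Open-L2O API), a hand-designed rule such as SGD can be contrasted with a parameterized update whose coefficients are meta-trained from data; the names `l2o_step` and `phi` below are hypothetical:

```python
import numpy as np

# Hand-designed rule (SGD): every problem gets the same fixed update.
def sgd_step(x, grad, lr=0.1):
    return x - lr * grad

# L2O replaces the rule with a small parameterized function g(grad; phi)
# whose parameters phi are learned from a distribution of training
# problems. Here g is a toy linear model over (grad, sign(grad)).
def l2o_step(x, grad, phi):
    features = np.array([grad, np.sign(grad)])
    return x - phi @ features

# On the quadratic f(x) = 0.5 * x^2 (so grad = x), a learned phi can
# match or beat a hand-tuned step size.
x = 5.0
phi = np.array([0.5, 0.0])   # pretend these coefficients were meta-trained
for _ in range(10):
    x = l2o_step(x, x, phi)  # grad of 0.5*x^2 is x
# x = 5 * 0.5**10 ≈ 0.0049
```

A real L2O model replaces the linear map with an RNN or unrolled network, as in the methods listed below.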
- There are significant gaps in theory and practicality between manually designed optimizers and existing L2O models.
All code is available here.
- LISTA (feed-forward form) from Learning fast approximations of sparse coding [Paper]
- LISTA-CP from Theoretical Linear Convergence of Unfolded ISTA and its Practical Weights and Thresholds [Paper]
- LISTA-CPSS from Theoretical Linear Convergence of Unfolded ISTA and its Practical Weights and Thresholds [Paper]
- LFISTA from Understanding Trainable Sparse Coding via Matrix Factorization [Paper]
- LAMP from AMP-Inspired Deep Networks for Sparse Linear Inverse Problems [Paper]
- ALISTA from ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA [Paper]
- GLISTA from Sparse Coding with Gated Learned ISTA [Paper]
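The unrolled methods above all build on the LISTA layer, which keeps the form of an ISTA iteration but frees its matrices and thresholds to be trained end-to-end. A minimal numpy sketch (generic textbook form, not the Open-L2O implementation):

```python
import numpy as np

def soft(v, theta):
    """Soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ista_step(x, A, b, lam, L):
    """One classic ISTA iteration for min 0.5*||Ax - b||^2 + lam*||x||_1."""
    return soft(x - (1.0 / L) * A.T @ (A @ x - b), lam / L)

def lista_step(x, b, W1, W2, theta):
    """One unrolled LISTA layer: same form, but W1, W2, theta are free
    parameters trained end-to-end instead of being fixed by A and lam."""
    return soft(W1 @ b + W2 @ x, theta)

# Sanity check: with W1 = A^T / L, W2 = I - A^T A / L, theta = lam / L,
# a LISTA layer reproduces the ISTA step exactly.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 8))
b = rng.standard_normal(5)
x = np.zeros(8)
lam = 0.1
L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
W1 = A.T / L
W2 = np.eye(8) - A.T @ A / L
assert np.allclose(ista_step(x, A, b, lam, L), lista_step(x, b, W1, W2, lam / L))
```

LISTA-CP, ALISTA, and GLISTA then constrain or augment this parameterization (coupled weights, analytic weights, gating) to improve convergence guarantees.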
- L2O-DM from Learning to learn by gradient descent by gradient descent [Paper] [Code]
- L2O-RNNProp from Learning Gradient Descent: Better Generalization and Longer Horizons [Paper] [Code]
- L2O-Scale from Learned Optimizers that Scale and Generalize [Paper] [Code]
- L2O-enhanced from Training Stronger Baselines for Learning to Optimize [Paper] [Code]
- L2O-Swarm from Learning to Optimize in Swarms [Paper] [Code]
- L2O-Jacobian from HALO: Hardware-Aware Learning to Optimize [Paper] [Code]
- L2O-Minmax from Learning A Minimax Optimizer: A Pilot Study [Paper] [Code]
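These model-free L2O methods share one meta-training recipe: unroll the learned optimizer for T inner steps on sampled tasks, then update the optimizer's own parameters by the gradient of the final loss. A toy closed-form sketch in the spirit of L2O-DM, where the "learned optimizer" is just a scalar step size `lr` standing in for the RNN (all names below are illustrative, not the Open-L2O API):

```python
import numpy as np

T = 20          # unroll length
lr = 0.01       # the learnable optimizer parameter
meta_lr = 0.05  # step size for the meta-update
rng = np.random.default_rng(0)

for _ in range(100):
    # Sample a task: minimize f(x) = 0.5 * x^2 from a random start x0.
    x0 = rng.uniform(1.0, 2.0)
    # Inner updates x <- x - lr * x give x_T = x0 * (1 - lr)^T, so the
    # meta-loss 0.5 * x_T^2 and its gradient w.r.t. lr are closed-form:
    #   d/dlr [0.5 * x0^2 * r^(2T)] = -T * x0^2 * r^(2T - 1),  r = 1 - lr.
    r = 1.0 - lr
    meta_grad = -T * x0**2 * r ** (2 * T - 1)
    # Meta-update, clipped to keep the unroll stable.
    lr = float(np.clip(lr - meta_lr * meta_grad, 0.0, 1.0))
```

In the actual methods the meta-gradient is obtained by backpropagating through the unrolled computation graph rather than from a closed form, and `lr` is replaced by the weights of an RNN that maps gradients to updates.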
Convex Functions:
- Quadratic
- Lasso
Non-convex Functions:
- Rastrigin
Minmax Functions:
- Saddle
- Rotated Saddle
- Seesaw
- Matrix Game
Neural Networks:
- MLPs on MNIST
- ConvNets on MNIST and CIFAR-10
- LeNet
- NAS-searched architectures
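For concreteness, a few of the analytic optimizees above in their standard textbook forms (the exact parameterizations used in Open-L2O may differ):

```python
import numpy as np

def quadratic(x, A, b):
    """Convex: f(x) = 0.5 * ||Ax - b||^2."""
    return 0.5 * np.sum((A @ x - b) ** 2)

def lasso(x, A, b, lam=0.1):
    """Convex but non-smooth: least squares plus an L1 penalty."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def rastrigin(x, A=10.0):
    """Non-convex with many local minima; global minimum 0 at x = 0."""
    n = x.size
    return A * n + np.sum(x**2 - A * np.cos(2 * np.pi * x))

def saddle(x, y):
    """Minmax test objective: min_x max_y x*y, saddle point at the origin."""
    return x * y
```

The quadratic and Lasso problems admit known convergence rates for classic solvers, which makes them natural yardsticks; Rastrigin and the minmax objectives stress the learned optimizers outside the convex regime.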
- This is a PyTorch implementation of L2O-DM. [Code]
- This is the original L2O-Swarm repository. [Code]
- This is the original L2O-Jacobian repository. [Code]
Our experiments were conducted on a cluster with two GPUs (GeForce RTX 3080) and a 14-core CPU (Intel(R) Core(TM) i9-9940X).
- TensorFlow 2.0 implementation of toolbox v2, with a unified framework and library dependencies.
@misc{chen2021learning,
title={Learning to Optimize: A Primer and A Benchmark},
author={Tianlong Chen and Xiaohan Chen and Wuyang Chen and Howard Heaton and Jialin Liu and Zhangyang Wang and Wotao Yin},
year={2021},
eprint={2103.12828},
archivePrefix={arXiv},
primaryClass={math.OC}
}