Code for the ICML 2024 paper: "MADA: Meta-Adaptive Optimizers through hyper-gradient Descent"
machine-learning
deep-neural-networks
optimization
machine-learning-algorithms
optimization-algorithms
adam-optimizer
gpt-2
meta-optimizer
large-language-models
Updated Jul 3, 2024 - Python