This paper proposes Adaptively Perturbed Mirror Descent (APMD), a novel variant of Mirror Descent that achieves last-iterate convergence in learning in games.
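In the commands below, D_{psi} denotes the Bregman divergence of the mirror-descent regularizer psi, and G denotes the divergence that perturbs each player's payoff toward an anchor ("slingshot") strategy, which is re-anchored every algorithm.update_slingshot_freq iterations. As a rough sketch (with illustrative names, not this repository's actual API), one APMD-style update for a single player with an entropy regularizer and a KL perturbation looks like this, where lr corresponds to algorithm.learning_rate and mu to algorithm.perturbation_strength:

import numpy as np

def apmd_step(pi, slingshot, payoff_grad, lr=0.1, mu=0.1):
    """One entropic mirror step on the KL-perturbed payoff.

    pi           -- current mixed strategy (probability vector)
    slingshot    -- anchor strategy the perturbation pulls toward
    payoff_grad  -- gradient of the player's expected payoff w.r.t. pi
    """
    # Perturbed gradient: payoff gradient minus mu * d/dpi KL(pi, slingshot).
    perturbed = payoff_grad - mu * (np.log(pi / slingshot) + 1.0)
    # With an entropy regularizer, the mirror step reduces to a
    # multiplicative-weights update followed by renormalization.
    new_pi = pi * np.exp(lr * perturbed)
    return new_pi / new_pi.sum()

# Every update_slingshot_freq steps, the anchor is reset to the current
# strategy: slingshot = pi.copy()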
To install the required dependencies, run:
$ pip install -r requirements.txt
To investigate the performance of APMD in Three-Player Biased Rock-Paper-Scissors with full feedback, run one of the following commands:
# D_{psi}=KL, G=KL
$ python main.py feedback=full n_trials=10 T=100000 game=three_biased_rps algorithm=APMD algorithm.learning_rate=0.1 algorithm.perturbation_strength=0.1 algorithm.random_init=True algorithm.regularizer=entropy algorithm.perturbation_divergence=kl algorithm.update_slingshot_freq=100
# D_{psi}=KL, G=Reverse KL
$ python main.py feedback=full n_trials=10 T=100000 game=three_biased_rps algorithm=APMD algorithm.learning_rate=0.1 algorithm.perturbation_strength=0.1 algorithm.random_init=True algorithm.regularizer=entropy algorithm.perturbation_divergence=reverse_kl algorithm.update_slingshot_freq=100
# D_{psi}=Squared L2, G=Squared L2
$ python main.py feedback=full n_trials=10 T=100000 game=three_biased_rps algorithm=APMD algorithm.learning_rate=0.1 algorithm.perturbation_strength=1.0 algorithm.random_init=True algorithm.regularizer=l2 algorithm.perturbation_divergence=l2 algorithm.update_slingshot_freq=20
To evaluate APMD in Three-Player Biased Rock-Paper-Scissors with noisy feedback, run one of the following commands:
# D_{psi}=KL, G=KL
$ python main.py feedback=noisy n_trials=10 T=100000 game=three_biased_rps algorithm=APMD algorithm.learning_rate=0.01 algorithm.perturbation_strength=0.1 algorithm.random_init=True algorithm.regularizer=entropy algorithm.perturbation_divergence=kl algorithm.update_slingshot_freq=1000
# D_{psi}=KL, G=Reverse KL
$ python main.py feedback=noisy n_trials=10 T=100000 game=three_biased_rps algorithm=APMD algorithm.learning_rate=0.01 algorithm.perturbation_strength=0.1 algorithm.random_init=True algorithm.regularizer=entropy algorithm.perturbation_divergence=reverse_kl algorithm.update_slingshot_freq=1000
# D_{psi}=Squared L2, G=Squared L2
$ python main.py feedback=noisy n_trials=10 T=100000 game=three_biased_rps algorithm=APMD algorithm.learning_rate=0.01 algorithm.perturbation_strength=1.0 algorithm.random_init=True algorithm.regularizer=l2 algorithm.perturbation_divergence=l2 algorithm.update_slingshot_freq=200
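Both settings measure how quickly the last iterate approaches equilibrium. A standard metric for this is exploitability, the total payoff the players could gain by unilaterally best-responding; a minimal sketch (again with illustrative names, independent of the repository's actual evaluation code):

import numpy as np

def exploitability(strategies, action_values):
    """Sum over players of (best-response payoff - current expected payoff).

    strategies    -- list of mixed strategies, one per player
    action_values -- action_values(i) returns player i's expected payoff for
                     each pure action, holding the other players fixed
    """
    gap = 0.0
    for i, pi in enumerate(strategies):
        q = np.asarray(action_values(i))  # payoff of each pure action
        gap += q.max() - q @ pi           # player i's gain from deviating
    return gap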
Kenshi Abe, Kaito Ariu, Mitsuki Sakamoto, and Atsushi Iwasaki. Adaptively perturbed mirror descent for learning in games. In International Conference on Machine Learning (ICML), 2024.
BibTeX:
@inproceedings{abe2024adaptively,
  title={Adaptively Perturbed Mirror Descent for Learning in Games},
  author={Abe, Kenshi and Ariu, Kaito and Sakamoto, Mitsuki and Iwasaki, Atsushi},
  booktitle={International Conference on Machine Learning},
  year={2024}
}