Configurable mirror descent: towards a unification of decision making
Authors: Pengdeng Li, Shuxin Li, Chang Yang, Xinrun Wang, Shuyue Hu, Xiao Huang, Hau Chan, Bo An
ICML'24: Proceedings of the 41st International Conference on Machine Learning
Article No.: 1130, Pages 28146-28203
Published: 03 January 2025
Abstract
Decision-making problems, categorized as single-agent (e.g., Atari), cooperative multi-agent (e.g., Hanabi), competitive multi-agent (e.g., Hold'em poker), and mixed cooperative-competitive (e.g., football), are ubiquitous in the real world. Although various methods have been proposed for specific categories of decision making, these methods typically evolve independently and do not generalize to the other categories. Therefore, a fundamental question for decision making is: Can we develop a single algorithm that tackles ALL categories of decision-making problems? Answering this question raises several main challenges: i) different categories involve different numbers of agents and different relationships between agents, ii) different categories have different solution concepts and evaluation measures, and iii) there is no comprehensive benchmark covering all the categories. This work presents a preliminary attempt to address the question, with three main contributions. i) We propose generalized mirror descent (GMD), a generalization of MD variants that considers multiple historical policies and works with a broader class of Bregman divergences. ii) We propose configurable mirror descent (CMD), in which a meta-controller dynamically adjusts the hyperparameters of GMD conditioned on the evaluation measures. iii) We construct GAMEBENCH, a benchmark of 15 academic-friendly games spanning the different decision-making categories. Extensive experiments demonstrate that CMD achieves empirically competitive or better outcomes than the baselines while providing the capability to explore diverse dimensions of decision making.
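To make the building block concrete, below is a minimal sketch of the kind of update that GMD generalizes: an entropic mirror-descent (multiplicative-weights) policy update on the probability simplex, together with a hypothetical blending of several historical policies into a reference point. This is only an illustrative assumption of how "multiple historical policies" and a Bregman-divergence-based update could fit together; it is not the paper's GMD/CMD algorithm, and the rock-paper-scissors payoff matrix, learning rate, and function names are invented for the example.

import numpy as np

def mirror_descent_step(policy, payoff_grad, lr):
    # Entropic mirror descent: gradient step in log space, then softmax back
    # onto the simplex (equivalent to multiplicative weights).
    logits = np.log(policy) + lr * payoff_grad
    new_policy = np.exp(logits - logits.max())
    return new_policy / new_policy.sum()

def blend_history(history, weights):
    # Hypothetical illustration: combine several historical policies into one
    # reference policy (one way multiple historical policies could be used).
    ref = sum(w * p for w, p in zip(weights, history))
    return ref / ref.sum()

# Toy usage: the row player of rock-paper-scissors adapts to a fixed opponent.
payoff = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]])        # row player's payoff matrix
opponent = np.array([0.5, 0.3, 0.2])   # fixed opponent mixed strategy
policy = np.full(3, 1.0 / 3.0)
history = [policy.copy()]
for t in range(200):
    grad = payoff @ opponent           # expected payoff of each pure action
    policy = mirror_descent_step(policy, grad, lr=0.1)
    history.append(policy.copy())

reference = blend_history(history[-3:], weights=[0.2, 0.3, 0.5])
print("final policy:", np.round(policy, 3))
print("blended reference:", np.round(reference, 3))

In this toy run the policy concentrates on the best response to the fixed opponent; the paper's CMD would additionally adjust quantities such as the step size and the divergence configuration with a meta-controller, conditioned on the chosen evaluation measure.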
Index Terms
- Computing methodologies
  - Artificial intelligence
    - Distributed artificial intelligence
      - Multi-agent systems
  - Machine learning
    - Learning paradigms
      - Supervised learning
        - Supervised learning by regression
    - Learning settings
      - Online learning settings
    - Machine learning approaches
- Theory of computation
  - Design and analysis of algorithms
    - Mathematical optimization
      - Continuous optimization
        - Nonconvex optimization
  - Theory and algorithms for application domains
    - Algorithmic game theory and mechanism design
      - Algorithmic game theory
Index terms have been assigned to the content through auto-classification.
Published In
ICML'24: Proceedings of the 41st International Conference on Machine Learning
July 2024
63010 pages
Editors: Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, Felix Berkenkamp
Copyright © 2024.
Publisher
JMLR.org
Qualifiers
- Research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate 140 of 548 submissions, 26%