Distributed Learning Algorithm for Multi agent Welfare Optimization
Baras, John, S.
January 01, 2016
A multi-agent system comprising N non-strategic agents is considered. Each agent picks actions from a finite set and receives a payoff that depends on the actions of the whole. The setting is non-model based where the exact form of the utility (or payoff) functions are unknown, and only their values can be measured by the respective agents. A distributed algorithm which enables the agents to learn to play welfare (i.e. the sum of individual utilities) optimal actions is proposed. While the exact form of the utilities is unknown, a novel framework is developed to capture implicit communication—the measured change in utility by an agent due to actions by another—and convergence of the algorithm is proved under appropriate assumptions on implicit and explicit communications. The proposed algorithm has kinship with a body of work in learning in repeated games and the main analysis tool is perturbed Markov chains.