Risk-sensitive Reinforcement Learning and Robust Learning for Control
Authors:
Conference: 2021 60th IEEE Conference on Decision and Control (CDC 2021), pp. 2976-2981, Austin, TX
Date: December 13-15, 2021
We develop new foundations for robust Reinforcement Learning for control by analytically exploring the relation between KL-regularized Reinforcement Learning and the Risk-sensitive Control “exponential of integral” criterion. We establish that maximizing the risk-sensitive exponential criterion is equivalent to maximizing the KL-regularized objective jointly over the policy and the reference-policy parameters. We show that the iterative procedure for optimizing the KL-regularized objective, which at each iteration substitutes the reference policy with the optimal policy obtained from the previous iteration, starting from some initial value, and which lies at the core of a number of well-known Reinforcement Learning algorithms, is itself an iterative approach for optimizing the risk-sensitive exponential-of-integral criterion. We interpret this iterative procedure as an instance of Minorization-Maximization (MM) algorithms, and we offer a probabilistic interpretation based on Probabilistic Graphical Models to motivate the improved performance of such risk-sensitive objectives.
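The iterative scheme described above can be sketched in a toy setting. The sketch below is illustrative only, not the paper's algorithm: it uses a one-step bandit where the KL-regularized objective J(pi) = E_pi[r] - (1/eta) KL(pi || pi_ref) has the well-known closed-form maximizer pi*(a) ∝ pi_ref(a) exp(eta r(a)), and repeatedly substitutes the reference policy with the previous optimum. The reward values, eta, and function names are hypothetical choices for illustration.

```python
import numpy as np

def kl_regularized_step(pi_ref, rewards, eta):
    """Closed-form maximizer of E_pi[r] - (1/eta) * KL(pi || pi_ref)
    for a finite action set: a Gibbs reweighting of the reference policy."""
    unnorm = pi_ref * np.exp(eta * rewards)
    return unnorm / unnorm.sum()

rewards = np.array([0.1, 0.5, 0.9])        # toy per-action rewards
eta = 1.0                                  # temperature / risk parameter
pi = np.ones_like(rewards) / len(rewards)  # uniform initial reference policy

# Iteratively substitute the reference policy with the previous optimum.
for _ in range(50):
    pi = kl_regularized_step(pi, rewards, eta)

# Unrolling the recursion gives pi_k(a) ∝ pi_0(a) * exp(k * eta * r(a)),
# so the iterates concentrate on the highest-reward action.
print(pi.argmax())  # → 2
```

In this toy case each substitution multiplies in another factor of exp(eta r), so the procedure monotonically sharpens the policy toward the reward maximizer, consistent with the MM-style improvement interpretation described in the abstract.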