Risk-sensitive REINFORCE: A Monte Carlo Policy Gradient Algorithm for Exponential Performance Criteria

Title: Risk-sensitive REINFORCE: A Monte Carlo Policy Gradient Algorithm for Exponential Performance Criteria
Conference: 2021 60th IEEE Conference on Decision and Control (CDC 2021), pp. 1522-1527, Austin, TX
Date: December 13 - December 15, 2021

Risk is an inherent component of any decision-making process under uncertainty, and failure to account for it may lead to significant performance degradation. We present a policy gradient theorem for the risk-sensitive control “exponential of integral” criteria and propose a risk-sensitive Monte Carlo policy gradient algorithm. Our simulations, together with our theoretical analysis, show that using the exponential criteria with an appropriately chosen risk parameter not only yields a risk-sensitive policy but also reduces variance during the learning process and accelerates learning, which in turn results in a policy with higher expected return. In other words, risk sensitivity leads to sample efficiency and improved performance.
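
To make the idea of a Monte Carlo policy gradient for an exponential criterion concrete, the sketch below uses one common textbook form of the objective, (1/beta) * log E[exp(beta * R)], where R is the return of a trajectory and beta is the risk parameter. This form, the self-normalized estimator, and the names `entropic_objective`, `risk_sensitive_gradient`, and `score_sums` are assumptions made for illustration; the abstract does not spell out the exact objective or update rule used in the paper, so this should not be read as the authors' algorithm.

```python
import numpy as np


def entropic_objective(returns, beta):
    """Sample estimate of (1/beta) * log E[exp(beta * R)] (assumed objective)."""
    r = np.asarray(returns, dtype=float)
    z = beta * r
    m = z.max()
    # log-mean-exp, shifted by the maximum for numerical stability
    return (m + np.log(np.mean(np.exp(z - m)))) / beta


def risk_sensitive_gradient(returns, score_sums, beta):
    """Self-normalized Monte Carlo gradient estimate of the objective above.

    returns:    shape (N,), one sampled return per trajectory.
    score_sums: shape (N, D), row i is the sum over time of
                grad_theta log pi(a_t | s_t) along trajectory i.
    """
    r = np.asarray(returns, dtype=float)
    g = np.asarray(score_sums, dtype=float)
    z = beta * r
    # Exponential weights exp(beta * R_i), normalized to sum to one
    w = np.exp(z - z.max())
    w /= w.sum()
    # (1/beta) * sum_i w_i * score_i
    return (w[:, None] * g).sum(axis=0) / beta
```

Under this assumed objective, a negative `beta` gives trajectories with low returns larger weights (a risk-averse estimate), and the objective approaches the ordinary expected return as `beta` tends to zero.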
