HIGH-DIMENSIONAL CONTINUOUS CONTROL USING GENERALIZED ADVANTAGE ESTIMATION
Published as a conference paper at ICLR 2016
Here, the subscript of $\mathbb{E}$ enumerates the variables being integrated over, where states and actions are
sampled sequentially from th