Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning

Artificial Intelligence - Volume 112 - Pages 181-211 - 1999
Richard S. Sutton¹, Doina Precup², Satinder Singh¹
¹AT&T Labs.-Research, 180 Park Avenue, Florham Park, NJ 07932, USA
²Computer Science Department, University of Massachusetts, Amherst, MA 01003, USA