Multi-Level Discovery of Deep Options

Augmenting an agent’s control with useful higher-level behaviors called options can greatly reduce the sample complexity of reinforcement learning, but manually designing options is infeasible in high-dimensional and abstract state spaces. While recent work has proposed several techniques for automated option discovery, they do not scale to multi-level hierarchies and to expressive representations such as deep networks. We present Discovery of Deep Options (DDO), a policy-gradient algorithm that discovers parametrized options from a set of demonstration trajectories, and can be used recursively to discover additional levels of the hierarchy. The scalability of our approach to multi-level hierarchies stems from the decoupling of low-level option discovery from high-level meta-control policy learning, facilitated by under-parametrization of the high level. We demonstrate that using the discovered options to augment the action space of Deep Q-Network agents can accelerate learning by guiding exploration in tasks where random actions are unlikely to reach valuable states. We show that DDO is effective in adding options that accelerate learning in 4 out of 5 Atari RAM environments chosen in our experiments. We also show that DDO can discover structure in robot-assisted surgical videos and kinematics that match expert annotation with 72% accuracy.
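To make the idea of augmenting an action space with options more concrete, here is a minimal sketch in Python. It is not the DDO algorithm and does not come from the paper: the option is hand-written rather than discovered, and the chain environment, the `Option` class, and the `run_choice` helper are all hypothetical names chosen for illustration. It only shows how an agent's choice set can contain both primitive actions and temporally extended options, each option being a low-level policy plus a termination probability.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = int  # toy state: position on a chain (an assumption for this sketch)


@dataclass
class Option:
    """A temporally extended behavior: a low-level policy and a termination rule."""
    name: str
    policy: Callable[[State], int]         # state -> primitive action
    termination: Callable[[State], float]  # state -> probability of stopping


def step_chain(state: State, action: int, goal: int = 10) -> Tuple[State, float, bool]:
    """Hypothetical chain environment: action 0 moves left, action 1 moves right."""
    next_state = max(0, min(goal, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == goal else 0.0
    return next_state, reward, next_state == goal


def run_choice(state: State, choice: int, n_primitive: int, options: List[Option],
               rng: random.Random, max_option_steps: int = 50
               ) -> Tuple[State, float, bool, int]:
    """Execute one choice from the augmented action space.

    Choices below n_primitive are primitive actions; the rest index into options.
    Returns (next_state, accumulated_reward, done, primitive_steps_taken).
    """
    if choice < n_primitive:
        next_state, reward, done = step_chain(state, choice)
        return next_state, reward, done, 1

    option = options[choice - n_primitive]
    total, steps, done = 0.0, 0, False
    while not done and steps < max_option_steps:
        state, reward, done = step_chain(state, option.policy(state))
        total += reward
        steps += 1
        # Sample the option's termination condition in the new state.
        if rng.random() < option.termination(state):
            break
    return state, total, done, steps


# A hand-written option standing in for a discovered one: keep moving right.
go_right = Option(
    name="go_right",
    policy=lambda s: 1,
    termination=lambda s: 1.0 if s >= 10 else 0.0,
)
```

With two primitive actions and one option, the augmented space has three choices: `run_choice(0, 2, 2, [go_right], random.Random(0))` invokes the option and walks the whole chain in a single high-level decision, which is the kind of directed exploration that random primitive actions are unlikely to produce.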
by Roy Fox, Sanjay Krishnan, Ion Stoica, Ken Goldberg
https://arxiv.org/pdf/1703.08294v2.pdf