Update: 23/07/2018_15:12:16
- Meta-Reinforcement Learning of Structured Exploration Strategies
- Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
- Neural Combinatorial Optimization with Reinforcement Learning
- Improving Policy Gradient by Exploring Under-appreciated Rewards
- Deep Reinforcement Learning with a Natural Language Action Space