Update: 23/07/2018_15:12:16
  1. Meta-Reinforcement Learning of Structured Exploration Strategies
  2. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
  3. Neural Combinatorial Optimization with Reinforcement Learning
  4. Improving Policy Gradient by Exploring Under-appreciated Rewards
  5. Deep Reinforcement Learning with a Natural Language Action Space