Search results

About 477,675 results

  1. Reinforcement Learning: An Introduction

    IEEE Transactions on Neural Networks - 2005
    Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their d…
    Cited by: 25,696
  2. Human-level control through deep reinforcement learning

    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu - Nature - 2015
    No abstract is available for this record; see the source link for details.
    Cited by: 28,643
  3. Reinforcement Learning: An Introduction

    Richard S. Sutton, Andy Barto - IEEE Transactions on Neural Networks - 1998
    No abstract is available for this record; see the source link for details.
    Cited by: 26,704
  4. Reinforcement Learning: A Survey

    Leslie Pack Kaelbling, Michael L. Littman, Andrew Moore - Journal of Artificial Intelligence Research - 1996
    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work descri…
    Cited by: 8,640
  5. Introduction to Reinforcement Learning

    Richard S. Sutton, Andrew G. Barto - MIT Press eBooks - 1998
    From the Publisher: In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability.
    Cited by: 6,868
  6. Reinforcement Learning: An Introduction

    Jeffrey D. Johnson, Jinghong Li, Zengshi Chen - Neurocomputing - 2000
    No abstract is available for this record; see the source link for details.
    Cited by: 8,674
  7. Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper)

    John Schulman, Filip Wolski, Prafulla Dhariwal - arXiv (Cornell University) - 2017
    Due to the safety risks and training sample inefficiency, it is often preferred to develop controllers in simulation. However, minor differences between the simulation and the real world can cause a significant sim-to-real gap. This gap can reduce the effectiveness of the developed controller. In this paper, we examine a case study of transferring an octorotor reinforcement learning controller from simulation to the…
    Cited by: 11,214
  8. Simple statistical gradient-following algorithms for connectionist reinforcement learning

    Ronald J. Williams - Machine Learning - 1992
    No abstract is available for this record; see the source link for details.
    Cited by: 7,358
  9. Continuous control with deep reinforcement learning

    Timothy Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess - arXiv (Cornell University) - 2016
    Abstract: We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems suc…
    Cited by: 6,769
  10. Playing Atari with Deep Reinforcement Learning

    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves - arXiv (Cornell University) - 2013
    We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no…
    Cited by: 5,111