Welcome to P K Kelkar Library, Online Public Access Catalogue (OPAC)

Algorithms for reinforcement learning

By: Szepesvári, Csaba.
Material type: Book
Series: Synthesis digital library of engineering and computer science; Synthesis lectures on artificial intelligence and machine learning, # 9
Publisher: San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool, c2010
Description: 1 electronic text (xii, 89 p. : ill.) : digital file
ISBN: 9781608454938 (electronic bk.)
Subject(s): Reinforcement learning -- Mathematical models | Reinforcement learning | Markov Decision Processes | Temporal difference learning | Stochastic approximation | Two-timescale stochastic approximation | Monte-Carlo methods | Simulation optimization | Function approximation | Stochastic gradient methods | Least-squares methods | Overfitting | Bias-variance tradeoff | Online learning | Active learning | Planning | Simulation | PAC-learning | Q-learning | Actor-critic methods | Policy gradient | Natural gradient
DDC classification: 006.31
Online resources: Abstract with links to resource
Also available in print.
Contents:
1. Markov decision processes -- Preliminaries -- Markov decision processes -- Value functions -- Dynamic programming algorithms for solving MDPs --
2. Value prediction problems -- Temporal difference learning in finite state spaces -- Tabular TD(0) -- Every-visit Monte-Carlo -- TD([lambda]): unifying Monte-Carlo and TD(0) -- Algorithms for large state spaces -- TD([lambda]) with function approximation -- Gradient temporal difference learning -- Least-squares methods -- The choice of the function space --
3. Control -- A catalog of learning problems -- Closed-loop interactive learning -- Online learning in bandits -- Active learning in bandits -- Active learning in Markov decision processes -- Online learning in Markov decision processes -- Direct methods -- Q-learning in finite MDPs -- Q-learning with function approximation -- Actor-critic methods -- Implementing a critic -- Implementing an actor --
4. For further exploration -- Further reading -- Applications -- Software --
A. The theory of discounted Markovian decision processes -- A.1. Contractions and Banach's fixed-point theorem -- A.2. Application to MDPs -- Bibliography -- Author's biography.
Abstract: Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, note a large number of state of the art algorithms, followed by the discussion of their theoretical properties and limitations.
Item type: E books
Current location: E books, PK Kelkar Library, IIT Kanpur
Status: Available
Barcode: EBKE265

Mode of access: World Wide Web.

System requirements: Adobe Acrobat Reader.

Part of: Synthesis digital library of engineering and computer science.

Series from website.

Includes bibliographical references (p. 73-88).

Abstract freely available; full-text restricted to subscribers or individual document purchasers.

Indexed by: Compendex; INSPEC; Google scholar; Google book search.

Title from PDF t.p. (viewed on July 13, 2010).
