Maximum entropy inverse reinforcement learning

Authors

Brian D Ziebart, Andrew Maas, J Andrew Bagnell, Anind K Dey

Publication date

2008/7/13

Journal

Proc. AAAI

Pages

1433-1438

Description

Recent research has shown the benefit of framing problems of imitation learning as solutions to Markov Decision Problems. This approach reduces learning to the problem of recovering a utility function that makes the behavior induced by a near-optimal policy closely mimic demonstrated behavior. In this work, we develop a probabilistic approach based on the principle of maximum entropy. Our approach provides a well-defined, globally normalized distribution over decision sequences, while providing the same performance guarantees as existing methods.

We develop our technique in the context of modeling real-world navigation and driving behaviors where collected data is inherently noisy and imperfect. Our probabilistic approach enables modeling of route preferences as well as a powerful new approach to inferring destinations and routes based on partial trajectories.
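
As an illustration of the "globally normalized distribution over decision sequences" mentioned in the description, the sketch below assigns each candidate route a probability proportional to the exponential of a weighted sum of its features, then normalizes over all routes. This is a minimal toy example, not the authors' implementation: the route names, feature vectors, weights `theta`, and the function `path_distribution` are all hypothetical, and it assumes a small, fully enumerable set of routes with deterministic transitions.

```python
import math

# Hypothetical toy data: three candidate routes, each summarized by a
# feature-count vector, e.g. [highway segments, local-road segments].
paths = {
    "A": [2.0, 1.0],
    "B": [1.0, 3.0],
    "C": [0.0, 5.0],
}

# Assumed reward weights (negative values act like per-feature costs).
theta = [-0.5, -1.0]


def path_distribution(paths, theta):
    """Return P(path) proportional to exp(theta . features), normalized over all paths."""
    # Linear score of each path under the current weights.
    scores = {
        name: sum(t * f for t, f in zip(theta, feats))
        for name, feats in paths.items()
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores.values())
    weights = {name: math.exp(s - max_score) for name, s in scores.items()}
    # Global normalizer (partition function) over the enumerated paths.
    z = sum(weights.values())
    return {name: w / z for name, w in weights.items()}


probs = path_distribution(paths, theta)
for name, p in sorted(probs.items()):
    print(f"route {name}: {p:.3f}")
```

Under these assumed weights, lower-cost routes receive higher probability, but costlier routes keep nonzero mass, which is one way such a model can accommodate noisy, imperfect demonstrations.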

Total citations

Cited by 3365

Citations per year: 2009: 16, 2010: 18, 2011: 27, 2012: 44, 2013: 59, 2014: 86, 2015: 73, 2016: 126, 2017: 145, 2018: 273, 2019: 330, 2020: 407, 2021: 515, 2022: 497, 2023: 550, 2024: 184
