Learning Heuristics via Imitation of Optimal Planners


2017_learningheuristics

A heuristic is a policy that takes as input the state of the search, i.e all decisions undertaken and outcomes observed, and decides which vertex to expand. The perfect heuristic is one that selects exactly the vertices that belong to the optimal path, thus being the most computationally efficient. However, the optimal path itself depends on the underlying world map which is partially known, revealed through the decisions made by the search. Hence it is important for a heuristic to update its strategy as it knows more about the world. We explore efficient methods to train such heuristics by imitating optimal planners.


Related Publications

Data-driven Planning via Imitation Learning
Sanjiban Choudhury, Mohak Bhardwaj, Sankalp Arora, Ashish Kapoor, Gireeja Ranade, Sebastian Scherer, Debadeepta Dey
The International Journal of Robotics Research, July, 2018
pdf   /   arXiv
Learning Heuristic Search via Imitation
Mohak Bhardwaj, Sanjiban Choudhury, Sebastian Scherer
Conference on Robot Learning (CoRL), 2017
pdf  /   talk  /   poster  /   arXiv