Programmatic policy extraction by iterative local search

Rasmus Larsen, Mikkel N. Schmidt

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Reinforcement learning policies are often represented by neural networks, but programmatic policies are preferred in some cases because they are more interpretable, amenable to formal verification, or generalize better. While efficient algorithms for learning neural policies exist, learning programmatic policies is challenging. Combining imitation-projection and dataset aggregation with a local search heuristic, we present a simple and direct approach to extracting a programmatic policy from a pretrained neural policy. After examining our local search heuristic on a programming-by-example problem, we demonstrate our programmatic policy extraction method on a pendulum swing-up problem. When trained using either a hand-crafted expert policy or a learned neural policy, our method discovers simple and interpretable policies that perform almost as well as the original.
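The combination named in the abstract (imitating an expert, aggregating the states visited by the current policy, and improving the program by local search) can be illustrated with a toy sketch. This is not the authors' implementation: the double-integrator dynamics, the hand-written `expert`, the two-coefficient policy family, and the +/-1 neighborhood are all assumptions chosen to keep the example small and deterministic.

```python
# Toy sketch: DAgger-style data aggregation plus hill-climbing local search.
# Dynamics, policy family, and neighborhood are illustrative assumptions,
# not the setup used in the paper.

def clip(u, lo=-1.0, hi=1.0):
    return max(lo, min(hi, u))

def expert(state):
    """Stand-in for a pretrained policy (hypothetical hand-written controller)."""
    x, v = state
    return clip(-2.0 * x - 1.0 * v)

def make_policy(program):
    """A 'program' is a pair of integer coefficients (a, b): u = clip(a*x + b*v)."""
    a, b = program
    return lambda state: clip(a * state[0] + b * state[1])

def rollout(policy, start, steps=30, dt=0.1):
    """Simulate a double integrator and return the visited states."""
    x, v = start
    states = []
    for _ in range(steps):
        states.append((x, v))
        u = policy((x, v))
        v += dt * u
        x += dt * v
    return states

def imitation_loss(program, dataset):
    """Mean squared difference between program actions and expert labels."""
    policy = make_policy(program)
    return sum((policy(s) - u) ** 2 for s, u in dataset) / len(dataset)

def neighbors(program):
    """Programs that differ by +/-1 in exactly one coefficient."""
    a, b = program
    return [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]

def local_search(program, dataset, max_steps=100):
    """Hill climbing: move to the best neighbor until none improves the loss."""
    best, best_loss = program, imitation_loss(program, dataset)
    for _ in range(max_steps):
        cand = min(neighbors(best), key=lambda p: imitation_loss(p, dataset))
        cand_loss = imitation_loss(cand, dataset)
        if cand_loss >= best_loss:
            break
        best, best_loss = cand, cand_loss
    return best

def extract(starts, iterations=3, init=(0, 0)):
    """Alternate between collecting expert-labeled states and local search."""
    dataset = [(s, expert(s)) for st in starts for s in rollout(expert, st)]
    program = local_search(init, dataset)
    for _ in range(iterations):
        visited = [s for st in starts for s in rollout(make_policy(program), st)]
        dataset += [(s, expert(s)) for s in visited]  # aggregate, never discard
        program = local_search(program, dataset)
    return program, dataset

program, dataset = extract(starts=[(1.0, 0.0), (-0.5, 0.5), (0.3, -1.0)])
print(program, imitation_loss(program, dataset))
```

Each round labels the states reached by the current program with the expert's actions, so the program is corrected on the states it actually visits instead of only on the expert's own trajectories. A richer program space, such as typed expression trees, would need a different `neighbors` function, but the outer loop would keep the same shape.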
Original language: English
Title of host publication: Inductive Logic Programming
Publisher: Springer
Publication date: 2022
Pages: 156–166
ISBN (Print): 978-3-030-97453-4
Publication status: Published - 2022
Event: 30th International Conference on Inductive Logic Programming - Virtual Event, Athens, Greece
Duration: 25 Oct 2021 to 27 Oct 2021
http://lr2020.iit.demokritos.gr/ilp/

Conference

Conference: 30th International Conference on Inductive Logic Programming
Location: Virtual Event
Country/Territory: Greece
City: Athens
Period: 25/10/2021 to 27/10/2021
Series: Lecture Notes in Computer Science
Volume: 13191
ISSN: 0302-9743

Keywords

  • Program synthesis
  • Reinforcement learning
  • Hindley-Milner type system
  • Neighborhood search
