NeurIPS2020

Structural Abstraction in RL

Papers


Learning Graph Structure with FSA Layer

Daniel D. Johnson, Hugo Larochelle, Daniel Tarlow

Motivation; Deciding connections in graphs is an important part of relational representations ****

Summary: A Layer which suggests extra edges to add to a graph to improve performance on downstream tasks

$L$-Paths

In graph $G = ( N, E )$ , let $L$ be a formal language over $\sum = N \cup E$ $( L \subseteq \sum^*)$

An $L$-Path exists between $n_0$ and $n_t$ if the word corresponding to the $n_0$ to $n_1$ path is in $L$

Regular languages $L$ can be constructed to correspond to specific relations

More generally, the existence of L-paths summarizes the presence of longer chains of relationships with a single pairwise relation. We may not always know which language L would be useful for a task of interest; our approach makes it possible to jointly learn L and use the L-paths as an abstraction for the downstream task.

Graphs as Rewardless POMDPS

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled.png

Method

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%201.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%202.png

Experiments

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%203.png


Background for Long Horizon Tasks


Long Horizon Visual Planning with Goal Conditioned Hierarchical Predictors

Karl Pertsch, Oleh Rybkin, Frederik Ebert, Shenghao Zhou, Dinesh Jayaraman, Chelsea Finn, Sergey Levine

Motivation: Current methods don't scale well to long horizon planning. Many sequential planners suffer from accumulation of error.

Background:

Summary: Use a hierarchical goal conditioned approach to sample trajectories that will

Method:

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%204.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%205.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%206.png

Experiments:

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%207.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%208.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%209.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2010.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2011.png


Sparse Graphical Memory for Robust Planning

Scott Emmons, Ajay Jain, Misha Laskin, Thanard Kurutach, Pieter Abbeel, Deepak Pathak

Motivation: Current Deep RL methods don't scale to long-horizon tasks unlike classical methods such as A* search.

Background

Summary: Combine long horizon ability of classical planning with the flexibility of Deep RL by incorporating graph-based component which is actively kept sparse during exploration.

Method

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2012.png

High level policy - Djikstra's algorithm

Low level policy - D3PG

Sparse Graphical Memory is constructed from a buffer of exploration trajectories by retaining the minimal set of states that fail our two-way consistency aggregation criterion

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2013.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2014.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2015.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2016.png

Experiments

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2017.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2018.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2019.png


Neurosymbolic RL with Formally Verified Exploration

Greg Anderson, Abhinav Verma, Isil Dillig, Swarat Chaudhuri

Motivation: Safe RL where agent follows some contraints

Background

Summary: Uses neurosymbolic policy which can be updated through gradient descent and can be easily verified

Problem Formulation

Policy Structure

Policy $\pi = (g, \phi, f)$ where action is selected according to

$h(s) = \textbf{if} \ (P^{#}(s, f(s)) ⊆ \phi) \ \textbf{then} \ f(s) \ \textbf{else} \ g(s)$

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2020.png

Learning Algorithm

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2021.png

Note: The combined policy stays safe throughout the procedure

Regret Bound

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2022.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2023.png

Experiments

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2024.png

Cost vs episode plots

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2025.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2026.png


Boolean Task Algebra for Reinforcement Learning

Geraud Nangue Tasse, Steven James, Benjamin Rosman

Motivation: Be able to compose previously learned skills to solve novel tasks

Background:

Summary: Theory devloping boolean algebra for MDPs under some restrictions

Theory

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2027.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2028.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2029.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2030.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2031.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2032.png

Experiments:

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2033.png

Structural%20Abstraction%20in%20RL%20f6eaae97eaee49c6b939b07d3c31a710/Untitled%2034.png


Discussion