(NeurIPS '21 Spotlight) IQ-Learn: Inverse Q-Learning for Imitation

Last update: Dec 20, 2022

Overview

Inverse Q-Learning (IQ-Learn)

Official code base for IQ-Learn: Inverse soft-Q Learning for Imitation, NeurIPS '21 Spotlight

IQ-Learn is an easy-to-use algorithm that's a drop-in replacement for methods like Behavior Cloning and GAIL, to boost your imitation learning pipelines!
Update: IQ-Learn was recently used to create the best AI agent for playing Minecraft, placing #1 in the NeurIPS MineRL Basalt Challenge using only human demos (overall leaderboard rank #2).

[Project Page]

We introduce Inverse Q-Learning (IQ-Learn), a novel state-of-the-art framework for Imitation Learning (IL) that directly learns soft-Q functions from expert data. IQ-Learn enables non-adversarial imitation learning and works in both offline and online IL settings. It is performant even with very sparse expert data, and scales to complex image-based environments, surpassing prior methods by more than 3x. It is very simple to implement, requiring ~15 lines of code on top of existing RL methods.
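To make the phrase "soft-Q function" concrete, here is a minimal sketch of how a soft state value and a policy follow from Q-values in the standard maximum-entropy RL setting with discrete actions. This is an illustration only, not code from this repository: the function names `soft_value` and `soft_policy` and the temperature parameter `alpha` are assumptions chosen for the example.

```python
import math

def soft_value(q_values, alpha=1.0):
    """Soft state value V(s) = alpha * log(sum_a exp(Q(s, a) / alpha)).

    `q_values` holds Q(s, a) for every action a in a single state s.
    `alpha` is an assumed entropy temperature (hypothetical parameter name).
    """
    m = max(q_values)  # subtract the max for numerical stability
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))

def soft_policy(q_values, alpha=1.0):
    """Soft-optimal policy pi(a|s) = exp((Q(s, a) - V(s)) / alpha)."""
    v = soft_value(q_values, alpha)
    return [math.exp((q - v) / alpha) for q in q_values]

# Example Q-values for one state with three actions (made-up numbers).
q_s = [1.0, 2.0, 0.5]
probs = soft_policy(q_s)
print(probs)  # action probabilities; the highest-Q action gets the most mass
```

Because the policy is a closed-form function of Q in this discrete setting, learning Q alone is enough to act, which is one way to see why no separate adversarially trained policy is needed.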

Inverse Q-Learning is theoretically equivalent to Inverse Reinforcement Learning, i.e. learning rewards from expert data. However, it is much more powerful in practice. It admits very simple non-adversarial training and works in completely offline IL settings (without any access to the environment), greatly exceeding Behavior Cloning.
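The link between a learned soft-Q function and a reward can be sketched by inverting the soft Bellman equation: r(s, a) = Q(s, a) - gamma * V(s'). The snippet below is a toy tabular illustration under that assumption, using a single sampled next state in place of the expectation. The names `recover_reward` and `q_table`, the `done` flag, and the numbers are hypothetical and not taken from this repository.

```python
import math

def soft_value(q_values):
    """Soft state value V(s) = log(sum_a exp(Q(s, a))), computed stably."""
    m = max(q_values)
    return m + math.log(sum(math.exp(q - m) for q in q_values))

def recover_reward(q_table, state, action, next_state, gamma=0.99, done=False):
    """Reward implied by a soft-Q table for one observed transition.

    Assumes r(s, a) = Q(s, a) - gamma * V(s'), with V(s') = 0 when the
    episode terminates at s' (an assumption made for this sketch).
    """
    next_v = 0.0 if done else soft_value(q_table[next_state])
    return q_table[state][action] - gamma * next_v

# Toy Q-table: two states, two actions each (made-up values).
q_table = {"s0": [1.0, 0.0], "s1": [0.0, 0.0]}
r = recover_reward(q_table, "s0", 0, "s1", gamma=0.5)
print(r)  # 1.0 - 0.5 * log(2)
```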

IQ-Learn is the successor to Adversarial Imitation Learning methods like GAIL (coming from the same lab).
It extends the theoretical framework for Inverse RL to non-adversarial and scalable learning, for the first time showing guaranteed convergence.

Citation

@inproceedings{garg2021iqlearn,
title={IQ-Learn: Inverse soft-Q Learning for Imitation},
author={Divyansh Garg and Shuvam Chakraborty and Chris Cundy and Jiaming Song and Stefano Ermon},
booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
year={2021},
url={https://openreview.net/forum?id=Aeo-xqtb5p}
}

Key Advantages

✅ Drop-in replacement for Behavior Cloning
✅ Non-adversarial online IL (Successor to GAIL & AIRL)
✅ Simple to implement
✅ Performant with very sparse data (single expert demo)
✅ Scales to Complex Image Envs (SOTA on Atari and playing Minecraft)
✅ Recover rewards from envs

Usage

To install and use IQ-Learn, follow the instructions provided in the iq_learn folder.

Imitation

Reaching human-level performance on Atari with pure imitation:

Rewards

Recovering environment rewards on GridWorld:

Questions

Please feel free to email us if you have any questions.

Div Garg ([email protected])

Owner

Divyansh Garg
