Disagreement-Regularized Imitation Learning

Overview

Due to a normalization bug the expert trajectories have lower performance than the rl_baseline_zoo reported experts. Please see the following link in codebase for where the bug was fixed at. [link]

Disagreement-Regularized Imitation Learning

Code to train the models described in the paper "Disagreement-Regularized Imitation Learning", by Kianté Brantley, Wen Sun and Mikael Henaff.

Usage:

Install using pip

Install the DRIL package

pip install -e .

Software Dependencies

"stable-baselines", "rl-baselines-zoo", "baselines", "gym", "pytorch", "pybullet"

Data

We provide a python script to generate expert data from per-trained models using the "rl-baselines-zoo" repository. Click "Here" to see all of the pre-trained agents available and their respective perfromance. Replace <name-of-environment> with the name of the pre-trained agent environment you would like to collect expert data for.

python -u generate_demonstration_data.py --seed <seed-number> --env-name <name-of-environment> --rl_baseline_zoo_dir <location-to-top-level-directory>

Training

DRIL requires a per-trained ensemble model and a per-trained behavior-cloning model.

Note that <location-to-rl-baseline-zoo-directory> is the full-path to the top-level directory to the rl_baseline_zoo repository.

To train only a behavior-cloning model run:

python -u main.py --env-name <name-of-environment> --num-trajs <number-of-trajectories> --behavior_cloning --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number>'

To train only a ensemble model run:

python -u main.py --env-name <name-of-environment> --num-trajs <number-of-trajectories> --pretrain_ensemble_only --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number>'

To train a DRIL model run the command below. Note that command below first checks that both the behavior cloning model and the ensemble model are trained, if they are not the script will automatically train both the ensemble and behavior-cloning model.

python -u main.py --env-name <name-of-environment> --default_experiment_params <type-of-env>  --num-trajs <number-of-trajectories> --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number>  --dril 

--default_experiment_params are the default parameters we use in the DRIL experiments and has two options: atari and continous-control

Visualization

After training the models, the results are stored in a folder called trained_results. Run the command below to reproduce the plots in our paper. If you change any of the hyperparameters, you will need to change the hyperparameters in the plot file naming convention.

python -u plot.py -env <name-of-environment>

Empirical evaluation

Atari

Results on Atari environments. Empirical evaluation

Continous Control

Results on continuous control tasks. Empirical evaluation

Acknowledgement:

We would like to thank Ilya Kostrikov for creating this "repo" that our codebase builds on.

Owner
Kianté Brantley
PhD student at University of Maryland | Member of @umdclip, @coralumbc and @CILVRatNYU | Fitness enthusiast | (He/Him)
Kianté Brantley
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP

CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2

168 Dec 29, 2022
Contains modeling practice materials and homework for the Computational Neuroscience course at Okinawa Institute of Science and Technology

A310 Computational Neuroscience - Okinawa Institute of Science and Technology, 2022 This repository contains modeling practice materials and homework

Sungho Hong 1 Jan 24, 2022
The Body Part Regression (BPR) model translates the anatomy in a radiologic volume into a machine-interpretable form.

Copyright © German Cancer Research Center (DKFZ), Division of Medical Image Computing (MIC). Please make sure that your usage of this code is in compl

MIC-DKFZ 40 Dec 18, 2022
Official repository for Few-shot Image Generation via Cross-domain Correspondence (CVPR '21)

Few-shot Image Generation via Cross-domain Correspondence Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zh

Utkarsh Ojha 251 Dec 11, 2022
Web service for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation based on OpenFace 2.0

OpenGaze: Web Service for OpenFace Facial Behaviour Analysis Toolkit Overview OpenFace is a fantastic tool intended for computer vision and machine le

Sayom Shakib 4 Nov 03, 2022
Several simple examples for popular neural network toolkits calling custom CUDA operators.

Neural Network CUDA Example Several simple examples for neural network toolkits (PyTorch, TensorFlow, etc.) calling custom CUDA operators. We provide

WeiYang 798 Jan 01, 2023
A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"

SlowFast A PyTorch implementation of SlowFast based on ICCV 2019 paper SlowFast Networks for Video Recognition. Requirements Anaconda PyTorch conda in

Hao Ren 8 Dec 23, 2022
Self-Learned Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence

In this paper, we address the problem of rain streaks removal in video by developing a self-learned rain streak removal method, which does not require any clean groundtruth images in the training pro

Yang Wenhan 44 Dec 06, 2022
Official repository for Natural Image Matting via Guided Contextual Attention

GCA-Matting: Natural Image Matting via Guided Contextual Attention The source codes and models of Natural Image Matting via Guided Contextual Attentio

Li Yaoyi 349 Dec 26, 2022
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.

BitPack is a practical tool that can efficiently save quantized neural network models with mixed bitwidth.

Zhen Dong 36 Dec 02, 2022
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

zyrant丶 14 Dec 29, 2021
[ICCV 2021 Oral] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer

This repository contains the source code for the paper SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer (ICCV 2021 Oral). The project page is here.

AllenXiang 65 Dec 26, 2022
Face Alignment using python

Face Alignment Face Alignment using python Input Image Aligned Face Aligned Face Aligned Face Input Image Aligned Face Input Image Aligned Face Instal

Sajjad Aemmi 28 Nov 23, 2022
PyTorch Implementation of Small Lesion Segmentation in Brain MRIs with Subpixel Embedding (ORAL, MICCAIW 2021)

Small Lesion Segmentation in Brain MRIs with Subpixel Embedding PyTorch implementation of Small Lesion Segmentation in Brain MRIs with Subpixel Embedd

22 Oct 21, 2022
(AAAI2022) Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Semantic Segmentation

SM-PPM This is a Pytorch implementation of our paper "Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Seman

W-zx-Y 10 Dec 07, 2022
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

mlpack 4.2k Jan 09, 2023
BaseCls BaseCls 是一个基于 MegEngine 的预训练模型库,帮助大家挑选或训练出更适合自己科研或者业务的模型结构

BaseCls BaseCls 是一个基于 MegEngine 的预训练模型库,帮助大家挑选或训练出更适合自己科研或者业务的模型结构。 文档地址:https://basecls.readthedocs.io 安装 安装环境 BaseCls 需要 Python = 3.6。 BaseCls 依赖 M

MEGVII Research 28 Dec 23, 2022
DeepFaceLab fork which provides IPython Notebook to use DFL with Google Colab

DFL-Colab — DeepFaceLab fork for Google Colab This project provides you IPython Notebook to use DeepFaceLab with Google Colaboratory. You can create y

779 Jan 05, 2023
Official implementation of SIGIR'2021 paper: "Sequential Recommendation with Graph Neural Networks".

SURGE: Sequential Recommendation with Graph Neural Networks This is our TensorFlow implementation for the paper: Sequential Recommendation with Graph

FIB LAB, Tsinghua University 53 Dec 26, 2022
SASM - simple crossplatform IDE for NASM, MASM, GAS and FASM assembly languages

SASM (SimpleASM) - простая кроссплатформенная среда разработки для языков ассемблера NASM, MASM, GAS, FASM с подсветкой синтаксиса и отладчиком. В SA

Dmitriy Manushin 5.6k Jan 06, 2023