ProMP: Proximal Meta-Policy Search

Overview

Build Status Docs

ProMP: Proximal Meta-Policy Search

Implementations corresponding to ProMP (Rothfuss et al., 2018). Overall this repository consists of two branches:

  1. master: lightweight branch that provides the necessary code to run Meta-RL algorithms such as ProMP, E-MAML, MAML. This branch is meant to provide an easy start with Meta-RL and can be integrated into other projects and setups.
  2. full-code: branch that provides the comprehensive code that was used to produce the experimental results in Rothfuss et al. (2018). This includes experiment scripts and plotting scripts that can be used to reproduce the experimental results in the paper.

The code is written in Python 3 and builds on Tensorflow. Many of the provided reinforcement learning environments require the Mujoco physics engine. Overall the code was developed under consideration of modularity and computational efficiency. Many components of the Meta-RL algorithm are parallelized either using either MPI or Tensorflow in order to ensure efficient use of all CPU cores.

Documentation

An API specification and explanation of the code components can be found here. Also the documentation can be build locally by running the following commands

# ensure that you are in the root folder of the project
cd docs
# install the sphinx documentaiton tool dependencies
pip install requirements.txt
# build the documentaiton
make clean && make html
# now the html documentation can be found under docs/build/html/index.html

Installation / Dependencies

The provided code can be either run in A) docker container provided by us or B) using python on your local machine. The latter requires multiple installation steps in order to setup dependencies.

A. Docker

If not installed yet, set up docker on your machine. Pull our docker container jonasrothfuss/promp from docker-hub:

docker pull jonasrothfuss/promp

All the necessary dependencies are already installed inside the docker container.

B. Anaconda or Virtualenv

B.1. Installing MPI

Ensure that you have a working MPI implementation (see here for more instructions).

For Ubuntu you can install MPI through the package manager:

sudo apt-get install libopenmpi-dev
B.2. Create either venv or conda environment and activate it
Virtualenv
pip install --upgrade virtualenv
virtualenv 
   
    
source 
    
     /bin/activate

    
   
Anaconda

If not done yet, install anaconda by following the instructions here. Then reate a anaconda environment, activate it and install the requirements in requirements.txt.

conda create -n 
   
     python=3.6
source activate 
    

    
   
B.3. Install the required python dependencies
pip install -r requirements.txt
B.4. Set up the Mujoco physics engine and mujoco-py

For running the majority of the provided Meta-RL environments, the Mujoco physics engine as well as a corresponding python wrapper are required. For setting up Mujoco and mujoco-py, please follow the instructions here.

Running ProMP

In order to run the ProMP algorithm point environment (no Mujoco needed) with default configurations execute:

python run_scripts/pro-mp_run_point_mass.py 

To run the ProMP algorithm in a Mujoco environment with default configurations:

python run_scripts/pro-mp_run_mujoco.py 

The run configuration can be change either in the run script directly or by providing a JSON configuration file with all the necessary hyperparameters. A JSON configuration file can be provided through the flag. Additionally the dump path can be specified through the dump_path flag:

python run_scripts/pro-mp_run.py --config_file 
   
     --dump_path 
    

    
   

Additionally, in order to run the the gradient-based meta-learning methods MAML and E-MAML (Finn et. al., 2017 and Stadie et. al., 2018) in a Mujoco environment with the default configuration execute, respectively:

python run_scripts/maml_run_mujoco.py 
python run_scripts/e-maml_run_mujoco.py 

Cite

To cite ProMP please use

@article{rothfuss2018promp,
  title={ProMP: Proximal Meta-Policy Search},
  author={Rothfuss, Jonas and Lee, Dennis and Clavera, Ignasi and Asfour, Tamim and Abbeel, Pieter},
  journal={arXiv preprint arXiv:1810.06784},
  year={2018}
}

Acknowledgements

This repository includes environments introduced in (Duan et al., 2016, Finn et al., 2017).

Owner
Jonas Rothfuss
Doctoral researcher - Institute of Machine Learning (ETH Zurich) Research emphasis on meta-learning and reinforcement learning
Jonas Rothfuss
Adversarial Attacks are Reversible via Natural Supervision

Adversarial Attacks are Reversible via Natural Supervision ICCV2021 Citation @InProceedings{Mao_2021_ICCV, author = {Mao, Chengzhi and Chiquier

Computer Vision Lab at Columbia University 20 May 22, 2022
[RSS 2021] An End-to-End Differentiable Framework for Contact-Aware Robot Design

DiffHand This repository contains the implementation for the paper An End-to-End Differentiable Framework for Contact-Aware Robot Design (RSS 2021). I

Jie Xu 60 Jan 04, 2023
The source code of "SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation", accepted to WACV 2022.

SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation The source code of our work "SIDE: Center-based Stereo 3D Detecto

10 Dec 18, 2022
Mixup for Supervision, Semi- and Self-Supervision Learning Toolbox and Benchmark

OpenSelfSup News Downstream tasks now support more methods(Mask RCNN-FPN, RetinaNet, Keypoints RCNN) and more datasets(Cityscapes). 'GaussianBlur' is

AI Lab, Westlake University 332 Jan 03, 2023
Keras implementations of Generative Adversarial Networks.

This repository has gone stale as I unfortunately do not have the time to maintain it anymore. If you would like to continue the development of it as

Erik Linder-Norén 8.9k Jan 04, 2023
Politecnico of Turin Thesis: "Implementation and Evaluation of an Educational Chatbot based on NLP Techniques"

THESIS_CAIRONE_FIORENTINO Politecnico of Turin Thesis: "Implementation and Evaluation of an Educational Chatbot based on NLP Techniques" GENERATE TOKE

cairone_fiorentino97 1 Dec 10, 2021
Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance

Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance Project Page | Paper | Data This repository contains an implementatio

Lior Yariv 521 Dec 30, 2022
Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation'

OD-Rec Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation' Paper, saved teacher models and Andro

Xin Xia 11 Nov 22, 2022
Near-Duplicate Video Retrieval with Deep Metric Learning

Near-Duplicate Video Retrieval with Deep Metric Learning This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retr

2 Jan 24, 2022
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

Akshita Gupta 54 Nov 21, 2022
MINERVA: An out-of-the-box GUI tool for offline deep reinforcement learning

MINERVA is an out-of-the-box GUI tool for offline deep reinforcement learning, designed for everyone including non-programmers to do reinforcement learning as a tool.

Takuma Seno 80 Nov 06, 2022
FCN (Fully Convolutional Network) is deep fully convolutional neural network architecture for semantic pixel-wise segmentation

FCN_via_Keras FCN FCN (Fully Convolutional Network) is deep fully convolutional neural network architecture for semantic pixel-wise segmentation. This

Kento Watanabe 48 Aug 30, 2022
Tiny-NewsRec: Efficient and Effective PLM-based News Recommendation

Tiny-NewsRec The source codes for our paper "Tiny-NewsRec: Efficient and Effective PLM-based News Recommendation". Requirements PyTorch == 1.6.0 Tensor

Yang Yu 3 Dec 07, 2022
Official Implementation (PyTorch) of "Point Cloud Augmentation with Weighted Local Transformations", ICCV 2021

PointWOLF: Point Cloud Augmentation with Weighted Local Transformations This repository is the implementation of PointWOLF(To appear). Sihyeon Kim1*,

MLV Lab (Machine Learning and Vision Lab at Korea University) 16 Nov 03, 2022
Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking (CVPR 2021) Pytorch implementation of the ArTIST motion model. In this repo

Fatemeh 38 Dec 12, 2022
Relative Uncertainty Learning for Facial Expression Recognition

Relative Uncertainty Learning for Facial Expression Recognition The official implementation of the following paper at NeurIPS2021: Title: Relative Unc

35 Dec 28, 2022
The Instructed Glacier Model (IGM)

The Instructed Glacier Model (IGM) Overview The Instructed Glacier Model (IGM) simulates the ice dynamics, surface mass balance, and its coupling thro

27 Dec 16, 2022
Deep learning model for EEG artifact removal

DeepSeparator Introduction Electroencephalogram (EEG) recordings are often contaminated with artifacts. Various methods have been developed to elimina

23 Dec 21, 2022
darija <-> english dictionary

darija-dictionary Having advanced IT solutions that are well adapted to the Moroccan context passes inevitably through understanding Moroccan dialect.

DODa 102 Jan 01, 2023
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 0 Dec 15, 2022