Implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Environments.

Last update: Dec 23, 2021

Related tags

Overview

ALPHAMEPOL

This repository contains the implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Environments.

Installation

In order to use this codebase you need to work with a Python version >= 3.6. Moreover, you need to have a working setup of Mujoco with a valid Mujco license. To setup Mujoco, have a look here. To avoid any conflict with your existing Python setup, and to keep this project self-contained, it is suggested to work in a virtual environment with virtualenv. To install virtualenv:

pip install --upgrade virtualenv

Create a virtual environment, activate it and install the requirements:

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

Usage

Unsupervised Pre-Training

To reproduce the Unsupervised Pre-Training experiments in the paper, run:

./scripts/exploration/[gridworld_with_slope.sh | multigrid.sh | ant.sh | minigrid.sh]

Supervised Fine-Tuning

To reproduce the Supervised Fine-Tuning experiments, run:

./scripts/goal_rl/[gridworld_with_slope.sh | multigrid.sh | ant.sh | minigrid.sh]

By default, this will launch TRPO with ALPHAMEPOL initialization. To launch TRPO with a random initialization, simply omit the policy_init argument in the scripts.

Moreover, note that the scripts for the GridWorld with Slope and MultiGrid experiments have the argument num_goals = 50, meaning that the training will be performed with one goal at a time. If you want to speed up the process, you can use several processes (ideally one for each goal), by passing as argument num_goals = 1 and changing incrementally the seed. As regards the Ant and MiniGrid experiments, since the goals are predefined, you can also set the goal_index argument to specify a goal (from 0 to 7 and from 0 to 12 respectively).

Results Visualization

Once launched, each experiment will log statistics in the results folder. You can visualize everything by launching tensorboard targeting that directory:

python -m tensorboard.main --logdir=./results --port 8080

and visiting the board at http://localhost:8080.

Implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Environments.

Related tags

Overview

ALPHAMEPOL

Installation

Usage

Unsupervised Pre-Training

Supervised Fine-Tuning

Results Visualization

Owner

Code for reproducing experiments in "Improved Training of Wasserstein GANs"

An implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019).

Objective of the repository is to learn and build machine learning models using Pytorch. 30DaysofML Using Pytorch

A novel method to tune language models. Codes and datasets for paper ``GPT understands, too''.

A developer interface for creating Chat AIs for the Chai app.

A font family with a great monospaced variant for programmers.

Collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

[CVPR 2021] MiVOS - Scribble to Mask module

A small library for doing fluid simulation with neural networks.

[EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

The deployment framework aims to provide a simple, lightweight, fast integrated, pipelined deployment framework that ensures reliability, high concurrency and scalability of services.

《DeepViT: Towards Deeper Vision Transformer》(2021)

NeurIPS workshop paper 'Counter-Strike Deathmatch with Large-Scale Behavioural Cloning'

Image super-resolution (SR) is a fast-moving field with novel architectures attracting the spotlight

A Comparative Review of Recent Kinect-Based Action Recognition Algorithms (TIP2020, Matlab codes)

An implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (CoRL 2020)

Finetuning Pipeline

OrienMask: Real-time Instance Segmentation with Discriminative Orientation Maps

Group project for MFIN7036. Our goal is to predict firm profitability with text-based competition measures.