PantheonRL is a package for training and testing multi-agent reinforcement learning environments.

Overview

PantheonRL

PantheonRL is a package for training and testing multi-agent reinforcement learning environments. The goal of PantheonRL is to provide a modular and extensible framework for training agent policies, fine-tuning agent policies, ad-hoc pairing of agents, and more. PantheonRL also provides a web user interface suitable for lightweight experimentation and prototyping.

PantheonRL is built on top of StableBaselines3 (SB3), allowing direct access to many of SB3's standard RL training algorithms such as PPO. PantheonRL currently follows a decentralized training paradigm -- each agent is equipped with its own replay buffer and update algorithm. The agents objects are designed to be easily manipulable. They can be saved, loaded and plugged into different training procedures such as self-play, ad-hoc / cross-play, round-robin training, or finetuning.

This package will be presented as a demo at the AAAI-22 Demonstrations Program.

Demo Paper

Demo Video

"PantheonRL: A MARL Library for Dynamic Training Interactions"
Bidipta Sarkar*, Aditi Talati*, Andy Shih*, Dorsa Sadigh
In Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track), 2022

@inproceedings{sarkar2021pantheonRL,
  title={PantheonRL: A MARL Library for Dynamic Training Interactions},
  author={Sarkar, Bidipta and Talati, Aditi and Shih, Andy and Sadigh Dorsa},
  booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track)},
  year={2022}
}

Installation

# Optionally create conda environments
conda create -n PantheonRL python=3.7
conda activate PantheonRL

# Clone and install PantheonRL
git clone https://github.com/Stanford-ILIAD/PantheonRL.git
cd PantheonRL
pip install -e .

Overcooked Installation

# Optionally install Overcooked environment
git submodule update --init --recursive
pip install -e overcookedgym/human_aware_rl/overcooked_ai

PettingZoo Installation

# Optionally install PettingZoo environments
pip install pettingzoo

# to install a group of pettingzoo environments
pip install "pettingzoo[classic]"

Command Line Invocation

Example

python3 trainer.py LiarsDice-v0 PPO PPO --seed 10 --preset 1
# requires Overcooked installation (see above instructions)
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 10 --preset 1

For examples on round-robin training followed by partner adaptation, check out these instructions.

For more examples, check out the examples/ directory.

Web User Interface

The first time the web interface is being run in a new location, the database must be initialized. After that, the init-db command should not be called again, because this will clear all user account data.

Set environment variables and (re)inititalize the database

export FLASK_APP=website
export FLASK_ENV=development
flask init-db

Start the web user interface. Make sure that ports 5000 and 5001 (used for Tensorboard) are not taken.

flask run --host=0.0.0.0 --port=5000


Agent selection screen. Users can customize the ego and partner agents.


Training screen. Users can view basic information, or spawn a Tensorboard tab for full monitoring.

Features

General Features PantheonRL
Documentation ✔️
Web user interface ✔️
Built on top of SB3 ✔️
Supports PettingZoo Envs ✔️
Environment Features PantheonRL
Frame stacking (recurrence) ✔️
Simultaneous multiagent envs ✔️
Turn-based multiagent envs ✔️
2-player envs ✔️
N-player envs ✔️
Custom environments ✔️
Training Features PantheonRL
Self-play ✔️
Ad-hoc / cross-play ✔️
Round-robin training ✔️
Finetune / adapt to new partners ✔️
Custom policies ✔️

Current Environments

Name Environment Type Reward Type Players Visualization
Rock Paper Scissors SimultaneousEnv Competitive 2
Liar's Dice TurnBasedEnv Competitive 2
Block World [1] TurnBasedEnv Cooperative 2 ✔️
Overcooked [2] SimultaneousEnv Cooperative 2 ✔️
PettingZoo [3] Mixed Mixed N ✔️

[1] Adapted from the block construction task from https://github.com/cogtoolslab/compositional-abstractions

[2] Adapted from the Human_Aware_Rl / Overcooked AI package from https://github.com/HumanCompatibleAI/human_aware_rl

[3] PettingZoo environments from https://github.com/Farama-Foundation/PettingZoo

Owner
Stanford Intelligent and Interactive Autonomous Systems Group
Stanford Intelligent and Interactive Autonomous Systems Group
Stanford Intelligent and Interactive Autonomous Systems Group
Official code for paper "Optimization for Oriented Object Detection via Representation Invariance Loss".

Optimization for Oriented Object Detection via Representation Invariance Loss By Qi Ming, Zhiqiang Zhou, Lingjuan Miao, Xue Yang, and Yunpeng Dong. Th

ming71 56 Nov 28, 2022
Code for the paper "On the Power of Edge Independent Graph Models"

Edge Independent Graph Models Code for the paper: "On the Power of Edge Independent Graph Models" Sudhanshu Chanpuriya, Cameron Musco, Konstantinos So

Konstantinos Sotiropoulos 0 Oct 26, 2021
Selective Wavelet Attention Learning for Single Image Deraining

SWAL Code for Paper "Selective Wavelet Attention Learning for Single Image Deraining" Prerequisites Python 3 PyTorch Models We provide the models trai

Bobo 9 Jun 17, 2022
This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEECH" submitted to ICASSP 2022

CPC_DeepCluster This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEEC

LEAP Lab 2 Sep 15, 2022
A Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Training Data》

RangeLoss Pytorch This is a Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Trai

Youzhi Gu 7 Nov 27, 2021
Pose estimation with MoveNet Lightning

Pose Estimation With MoveNet Lightning MoveNet is the TensorFlow pre-trained model that identifies 17 different key points of the human body. It is th

Yash Vora 2 Jan 04, 2022
Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.

Sarus published models Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are

Sarus Technologies 39 Aug 19, 2022
tmm_fast is a lightweight package to speed up optical planar multilayer thin-film device computation.

tmm_fast tmm_fast or transfer-matrix-method_fast is a lightweight package to speed up optical planar multilayer thin-film device computation. It is es

26 Dec 11, 2022
It's final year project of Diploma Engineering. This project is based on Computer Vision.

Face-Recognition-Based-Attendance-System It's final year project of Diploma Engineering. This project is based on Computer Vision. Brief idea about ou

Neel 10 Nov 02, 2022
Bottleneck Transformers for Visual Recognition

Bottleneck Transformers for Visual Recognition Experiments Model Params (M) Acc (%) ResNet50 baseline (ref) 23.5M 93.62 BoTNet-50 18.8M 95.11% BoTNet-

Myeongjun Kim 236 Jan 03, 2023
Python implementation of Wu et al (2018)'s registration fusion

reg-fusion Projection of a central sulcus probability map using the RF-ANTs approach (right hemisphere shown). This is a Python implementation of Wu e

Dan Gale 26 Nov 12, 2021
An SE(3)-invariant autoencoder for generating the periodic structure of materials

Crystal Diffusion Variational AutoEncoder This software implementes Crystal Diffusion Variational AutoEncoder (CDVAE), which generates the periodic st

Tian Xie 94 Dec 10, 2022
DeepLab resnet v2 model in pytorch

pytorch-deeplab-resnet DeepLab resnet v2 model implementation in pytorch. The architecture of deepLab-ResNet has been replicated exactly as it is from

Isht Dwivedi 601 Dec 22, 2022
Distributing reference energies for SMIRNOFF implementations

Warning: This code is currently experimental and under active development. Is it not yet suitable for distribution or use as reference implementation.

Open Force Field Initiative 1 Dec 07, 2021
Detector for Log4Shell exploitation attempts

log4shell-detector Detector for Log4Shell exploitation attempts Idea The problem with the log4j CVE-2021-44228 exploitation is that the string can be

Florian Roth 729 Dec 25, 2022
Unofficial pytorch-lightning implement of Mip-NeRF

mipnerf_pl Unofficial pytorch-lightning implement of Mip-NeRF, Here are some results generated by this repository (pre-trained models are provided bel

Jianxin Huang 159 Dec 23, 2022
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

Transparency-by-Design networks (TbD-nets) This repository contains code for replicating the experiments and visualizations from the paper Transparenc

David Mascharka 351 Nov 18, 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

CLIP-GEN [简体中文][English] 本项目在萤火二号集群上用 PyTorch 实现了论文 《CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP》。 CLIP-GEN 是一个 Language-F

75 Dec 29, 2022
Cluttered MNIST Dataset

Cluttered MNIST Dataset A setup script will download MNIST and produce mnist/*.t7 files: luajit download_mnist.lua Example usage: local mnist_clutter

DeepMind 50 Jul 12, 2022
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation This the repository for this paper. Find extensions of this w

Zhuoyuan Mao 14 Oct 26, 2022