PantheonRL is a package for training and testing multi-agent reinforcement learning environments.

Overview

PantheonRL

PantheonRL is a package for training and testing multi-agent reinforcement learning environments. The goal of PantheonRL is to provide a modular and extensible framework for training agent policies, fine-tuning agent policies, ad-hoc pairing of agents, and more. PantheonRL also provides a web user interface suitable for lightweight experimentation and prototyping.

PantheonRL is built on top of StableBaselines3 (SB3), allowing direct access to many of SB3's standard RL training algorithms such as PPO. PantheonRL currently follows a decentralized training paradigm -- each agent is equipped with its own replay buffer and update algorithm. The agents objects are designed to be easily manipulable. They can be saved, loaded and plugged into different training procedures such as self-play, ad-hoc / cross-play, round-robin training, or finetuning.

This package will be presented as a demo at the AAAI-22 Demonstrations Program.

Demo Paper

Demo Video

"PantheonRL: A MARL Library for Dynamic Training Interactions"
Bidipta Sarkar*, Aditi Talati*, Andy Shih*, Dorsa Sadigh
In Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track), 2022

@inproceedings{sarkar2021pantheonRL,
  title={PantheonRL: A MARL Library for Dynamic Training Interactions},
  author={Sarkar, Bidipta and Talati, Aditi and Shih, Andy and Sadigh Dorsa},
  booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track)},
  year={2022}
}

Installation

# Optionally create conda environments
conda create -n PantheonRL python=3.7
conda activate PantheonRL

# Clone and install PantheonRL
git clone https://github.com/Stanford-ILIAD/PantheonRL.git
cd PantheonRL
pip install -e .

Overcooked Installation

# Optionally install Overcooked environment
git submodule update --init --recursive
pip install -e overcookedgym/human_aware_rl/overcooked_ai

PettingZoo Installation

# Optionally install PettingZoo environments
pip install pettingzoo

# to install a group of pettingzoo environments
pip install "pettingzoo[classic]"

Command Line Invocation

Example

python3 trainer.py LiarsDice-v0 PPO PPO --seed 10 --preset 1
# requires Overcooked installation (see above instructions)
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 10 --preset 1

For examples on round-robin training followed by partner adaptation, check out these instructions.

For more examples, check out the examples/ directory.

Web User Interface

The first time the web interface is being run in a new location, the database must be initialized. After that, the init-db command should not be called again, because this will clear all user account data.

Set environment variables and (re)inititalize the database

export FLASK_APP=website
export FLASK_ENV=development
flask init-db

Start the web user interface. Make sure that ports 5000 and 5001 (used for Tensorboard) are not taken.

flask run --host=0.0.0.0 --port=5000


Agent selection screen. Users can customize the ego and partner agents.


Training screen. Users can view basic information, or spawn a Tensorboard tab for full monitoring.

Features

General Features PantheonRL
Documentation ✔️
Web user interface ✔️
Built on top of SB3 ✔️
Supports PettingZoo Envs ✔️
Environment Features PantheonRL
Frame stacking (recurrence) ✔️
Simultaneous multiagent envs ✔️
Turn-based multiagent envs ✔️
2-player envs ✔️
N-player envs ✔️
Custom environments ✔️
Training Features PantheonRL
Self-play ✔️
Ad-hoc / cross-play ✔️
Round-robin training ✔️
Finetune / adapt to new partners ✔️
Custom policies ✔️

Current Environments

Name Environment Type Reward Type Players Visualization
Rock Paper Scissors SimultaneousEnv Competitive 2
Liar's Dice TurnBasedEnv Competitive 2
Block World [1] TurnBasedEnv Cooperative 2 ✔️
Overcooked [2] SimultaneousEnv Cooperative 2 ✔️
PettingZoo [3] Mixed Mixed N ✔️

[1] Adapted from the block construction task from https://github.com/cogtoolslab/compositional-abstractions

[2] Adapted from the Human_Aware_Rl / Overcooked AI package from https://github.com/HumanCompatibleAI/human_aware_rl

[3] PettingZoo environments from https://github.com/Farama-Foundation/PettingZoo

Owner
Stanford Intelligent and Interactive Autonomous Systems Group
Stanford Intelligent and Interactive Autonomous Systems Group
Stanford Intelligent and Interactive Autonomous Systems Group
Permeability Prediction Via Multi Scale 3D CNN

Permeability-Prediction-Via-Multi-Scale-3D-CNN Data: The raw CT rock cores are obtained from the Imperial Colloge portal. The CT rock cores are sub-sa

Mohamed Elmorsy 2 Jul 06, 2022
Standalone pre-training recipe with JAX+Flax

Sabertooth Sabertooth is standalone pre-training recipe based on JAX+Flax, with data pipelines implemented in Rust. It runs on CPU, GPU, and/or TPU, b

Nikita Kitaev 26 Nov 28, 2022
My solution for the 7th place / 245 in the Umoja Hack 2022 challenge

Umoja Hack 2022 : Insurance Claim Challenge My solution for the 7th place / 245 in the Umoja Hack 2022 challenge Umoja Hack Africa is a yearly hackath

Souames Annis 17 Jun 03, 2022
i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery

i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery This is a public code repository for the publication: i-SpaSP: Structured Neural Pruning

Cameron Ronald Wolfe 5 Nov 04, 2022
An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models.

DeepNER An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models. This repository contains complex Deep

Derrick 9 May 30, 2022
Location-Sensitive Visual Recognition with Cross-IOU Loss

The trained models are temporarily unavailable, but you can train the code using reasonable computational resource. Location-Sensitive Visual Recognit

Kaiwen Duan 146 Dec 25, 2022
Segmentation vgg16 fcn - cityscapes

VGGSegmentation Segmentation vgg16 fcn - cityscapes Priprema skupa skripta prepare_dataset_downsampled.py Iz slika cityscapesa izrezuje haubu automobi

6 Oct 24, 2020
Using PyTorch Perform intent classification using three different models to see which one is better for this task

Using PyTorch Perform intent classification using three different models to see which one is better for this task

Yoel Graumann 1 Feb 14, 2022
This repository contains datasets and baselines for benchmarking Chinese text recognition.

Benchmarking-Chinese-Text-Recognition This repository contains datasets and baselines for benchmarking Chinese text recognition. Please see the corres

FudanVI Lab 254 Dec 30, 2022
Source for the paper "Universal Activation Function for machine learning"

Universal Activation Function Tensorflow and Pytorch source code for the paper Yuen, Brosnan, Minh Tu Hoang, Xiaodai Dong, and Tao Lu. "Universal acti

4 Dec 03, 2022
SpinalNet: Deep Neural Network with Gradual Input

SpinalNet: Deep Neural Network with Gradual Input This repository contains scripts for training different variations of the SpinalNet and its counterp

H M Dipu Kabir 142 Dec 30, 2022
MTCNN face detection implementation for TensorFlow, as a PIP package.

MTCNN Implementation of the MTCNN face detector for Keras in Python3.4+. It is written from scratch, using as a reference the implementation of MTCNN

Iván de Paz Centeno 1.9k Dec 30, 2022
DecoupledNet is semantic segmentation system which using heterogeneous annotations

DecoupledNet: Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation Created by Seunghoon Hong, Hyeonwoo Noh and Bohyung Han at POSTE

Hyeonwoo Noh 74 Sep 22, 2021
DuBE: Duple-balanced Ensemble Learning from Skewed Data

DuBE: Duple-balanced Ensemble Learning from Skewed Data "Towards Inter-class and Intra-class Imbalance in Class-imbalanced Learning" (IEEE ICDE 2022 S

6 Nov 12, 2022
TAPEX: Table Pre-training via Learning a Neural SQL Executor

TAPEX: Table Pre-training via Learning a Neural SQL Executor The official repository which contains the code and pre-trained models for our paper TAPE

Microsoft 157 Dec 28, 2022
[ICCV'21] Pri3D: Can 3D Priors Help 2D Representation Learning?

Pri3D: Can 3D Priors Help 2D Representation Learning? [ICCV 2021] Pri3D leverages 3D priors for downstream 2D image understanding tasks: during pre-tr

Ji Hou 124 Jan 06, 2023
Garbage Detection system which will detect objects based on whether it is plastic waste or plastics or just garbage.

Garbage Detection using Yolov5 on Jetson Nano 2gb Developer Kit. Garbage detection system which will detect objects based on whether it is plastic was

Rishikesh A. Bondade 2 May 13, 2022
MVS2D: Efficient Multi-view Stereo via Attention-Driven 2D Convolutions

MVS2D: Efficient Multi-view Stereo via Attention-Driven 2D Convolutions Project Page | Paper If you find our work useful for your research, please con

96 Jan 04, 2023
Official repository for the ISBI 2021 paper Transformer Assisted Convolutional Neural Network for Cell Instance Segmentation

SegPC-2021 This is the official repository for the ISBI 2021 paper Transformer Assisted Convolutional Neural Network for Cell Instance Segmentation by

Datascience IIT-ISM 13 Dec 14, 2022
Off-policy continuous control in PyTorch, with RDPG, RTD3 & RSAC

arXiv technical report soon available. we are updating the readme to be as comprehensive as possible Please ask any questions in Issues, thanks. Intro

Zhihan 31 Dec 30, 2022