Scalable Multi-Agent Reinforcement Learning


1. Featured algorithms:

  • Value Function Factorization with Variable Agent Sub-Teams (VAST) [1]

2. Implemented domains

All available domains are listed in the table below. The labels are used for the commands below (in 5. and 6.).

| Domain | Label | Description |
|---|---|---|
| Warehouse[4] | Warehouse-4 | Warehouse domain with 4 agents in a 5x3 grid. |
| Warehouse[8] | Warehouse-8 | Warehouse domain with 8 agents in a 5x5 grid. |
| Warehouse[16] | Warehouse-16 | Warehouse domain with 16 agents in a 9x13 grid. |
| Battle[20] | Battle-20 | Battle domain with armies of 20 agents each in a 10x10 grid. |
| Battle[40] | Battle-40 | Battle domain with armies of 40 agents each in a 14x14 grid. |
| Battle[80] | Battle-80 | Battle domain with armies of 80 agents each in an 18x18 grid. |
| GaussianSqueeze[200] | GaussianSqueeze-200 | Gaussian squeeze domain with 200 agents. |
| GaussianSqueeze[400] | GaussianSqueeze-400 | Gaussian squeeze domain with 400 agents. |
| GaussianSqueeze[800] | GaussianSqueeze-800 | Gaussian squeeze domain with 800 agents. |

3. Implemented MARL algorithms

The reported MARL algorithms are listed in the tables below. The labels are used for the commands below (in 5. and 6.).

| Baseline | Label |
|---|---|
| IL | IL |
| QMIX | QMIX |
| QTRAN | QTRAN |

| VAST(VFF operator) | Label |
|---|---|
| VAST(IL) | VAST-IL |
| VAST(VDN) | VAST-VDN |
| VAST(QMIX) | VAST-QMIX |
| VAST(QTRAN) | VAST-QTRAN |

| VAST(assignment strategy) | Label |
|---|---|
| VAST(Random) | VAST-QTRAN-RANDOM |
| VAST(Fixed) | VAST-QTRAN-FIXED |
| VAST(Spatial) | VAST-QTRAN-SPATIAL |
| VAST(MetaGrad) | VAST-QTRAN |

4. Experiment parameters

Experiment parameters such as the learning rate for training (params["learning_rate"]) or the number of episodes per epoch (params["episodes_per_epoch"]) are specified in settings.py. All other hyperparameters are set in the corresponding Python modules in the package vast/controllers, where the final values listed in the technical appendix are used as defaults.

All hyperparameters can be adjusted by setting their values via the params dictionary in settings.py.
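
As a minimal sketch of what such an override might look like, assuming params is a plain Python dictionary: the default values and the with_overrides helper below are illustrative placeholders, not taken from the repository or the paper.

```python
# Illustrative defaults only; the real values live in settings.py and vast/controllers.
params = {
    "learning_rate": 0.001,     # placeholder, not the value used in the paper
    "episodes_per_epoch": 10,   # placeholder, not the value used in the paper
    "render_pygame": False,
}


def with_overrides(base, **overrides):
    """Return a copy of `base` with the given keys replaced.

    Hypothetical helper: unknown keys are rejected so that a typo in a
    hyperparameter name fails loudly instead of being silently ignored.
    """
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    updated = dict(base)
    updated.update(overrides)
    return updated


custom = with_overrides(params, learning_rate=0.0005, render_pygame=True)
print(custom["learning_rate"])
```

Copying the dictionary rather than mutating it keeps the defaults intact, which makes it easier to compare several configurations in one session.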

5. Training

To train a MARL algorithm M (see tables in 3.) in domain D (see table in 2.) with compactness factor eta, run the following command:

python train.py M D eta

This command will create a folder with the name pattern output/N-agents_domain-D_subteams-S_M_datetime which contains the trained models (depending on the MARL algorithm).

train.sh is an example script for running all settings as specified in the paper.
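
For sweeps that differ from train.sh, the command lines can also be generated programmatically. The sketch below only builds the argument lists from the labels in sections 2 and 3 and does not start any training. The eta values are hypothetical examples, and build_train_commands is not part of the repository.

```python
import itertools
import shlex

# Labels copied from the tables in sections 2 and 3.
ALGORITHMS = ["IL", "QMIX", "QTRAN", "VAST-IL", "VAST-VDN", "VAST-QMIX", "VAST-QTRAN"]
DOMAINS = ["Warehouse-4", "Warehouse-8", "Warehouse-16"]
ETAS = [0.25, 0.5]  # hypothetical compactness factors, chosen for illustration


def build_train_commands(algorithms, domains, etas):
    """Return one `python train.py M D eta` argument list per combination."""
    commands = []
    for algorithm, domain, eta in itertools.product(algorithms, domains, etas):
        commands.append(["python", "train.py", algorithm, domain, str(eta)])
    return commands


commands = build_train_commands(ALGORITHMS, DOMAINS, ETAS)
for command in commands[:3]:
    # shlex.join renders the list as a shell-safe command line (Python 3.8+).
    print(shlex.join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) from the repository root to launch the corresponding run.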

6. Plotting

To generate plots for a particular domain D and evaluation mode E as presented in the paper, run the following command:

python plot.py M E

The command will load and display all the data of completed training runs that are stored in the folder which is specified in params["output_folder"] (see settings.py).

The evaluation modes E are listed in the table below:

| Evaluation mode | Label |
|---|---|
| VFF operator comparison | F |
| State-of-the-art comparison | S |
| Assignment strategy comparison | A |
| Division diversity comparison | D |
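
The evaluation mode table can be mirrored in a small lookup to check a label before invoking plot.py. This is a hypothetical helper for scripting around the command, not code from the repository.

```python
# Labels and descriptions copied from the evaluation mode table above.
EVALUATION_MODES = {
    "F": "VFF operator comparison",
    "S": "State-of-the-art comparison",
    "A": "Assignment strategy comparison",
    "D": "Division diversity comparison",
}


def describe_mode(label):
    """Return the description for an evaluation mode label, or raise ValueError."""
    try:
        return EVALUATION_MODES[label]
    except KeyError:
        valid = ", ".join(sorted(EVALUATION_MODES))
        raise ValueError(
            f"unknown evaluation mode {label!r}; expected one of: {valid}"
        ) from None


print(describe_mode("A"))
```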

7. Rendering

To render episodes of the Warehouse[N] or Battle[N] domain, set params["render_pygame"]=True in settings.py.

8. References

  • [1] T. Phan et al., "VAST: Value Function Factorization with Variable Agent Sub-Teams", in NeurIPS 2021