Multi-Agent Reinforcement Learning (MARL) method to learn scalable control policies for multi-agent target tracking.

Overview

scalableMARL

Scalable Reinforcement Learning Policies for Multi-Agent Control

C. D. Hsu, H. Jeong, G. J. Pappas, P. Chaudhari. "Scalable Reinforcement Learning Policies for Multi-Agent Control". IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 2021.

  • Author: Christopher Hsu
  • Email: [email protected]
  • Affiliation:
    • Department of Electrical and Systems Engineering
    • GRASP Laboratory
    • @ University of Pennsylvania

Currently supports Python 3.8 and is developed on Ubuntu 20.04.

scalableMARL file structure

Within scalableMARL (highlighting the important files):

scalableMARL
    |___algos
        |___maTT                          #RL alg folder for the target tracking environment
            |___core                      #Self-Attention-based Model Architecture
            |___core_behavior             #Used for further evaluation (Ablation D.2.)
            |___dql                       #Soft Double Q-Learning
            |___evaluation                #Evaluation for Main Results
            |___evaluation_behavior       #Used for further evaluation (Ablation D.2.)
            |___modules                   #Self-Attention blocks
            |___replay_buffer             #RL replay buffer for sets
            |___run_script                #**Main run script to do training and evaluation
    |___envs
        |___maTTenv                       #multi-agent target tracking
            |___env
                |___setTracking_v0        #Standard environment (i.e. 4a4t tasks)
                |___setTracking_vGreedy   #Baseline Greedy Heuristic
                |___setTracking_vGru      #Experiment with Gru (Ablation D.3)
                |___setTracking_vkGreedy  #Experiment with Scalability and Heuristic Mask k=4 (Ablation D.1)
        |___run_ma_tracking               #Example script to run environment
    |___setup                             #set PYTHONPATH (run: source setup)
  • To set up scalableMARL, follow the instructions below.

Set up python environment for the scalableMARL repository

Install python3.8 (if it is not already installed)

#to check python version
python3 -V

sudo apt-get update
sudo apt-get install python3.8-dev
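
The version check above can also be done from inside Python, which is convenient once several interpreters are installed. The following is a minimal sketch; the helper name `is_supported` is hypothetical and not part of scalableMARL, and it only compares the major and minor version against the 3.8 release this README names.

```python
import sys

# Hypothetical helper (not part of scalableMARL): report whether an
# interpreter version matches the Python 3.8 release named in this README.
def is_supported(version=None, supported=(3, 8)):
    # Default to the running interpreter; only major.minor is compared.
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == tuple(supported)

if __name__ == "__main__":
    running = sys.version.split()[0]
    status = "matches" if is_supported() else "differs from"
    print(f"Python {running} {status} the 3.8 release this README targets")
```

Other Python 3 versions may still work; the check only flags a difference from the version the repository was developed with.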

Set up virtualenv

Python virtual environments isolate package installation from the system.

Replace 'virtualenv name' with a folder name of your choice.

sudo apt-get install python3-venv 

python3 -m venv --system-site-packages ./'virtualenv name'
# Activate the environment for use, any commands now will be done within this venv
source ./'virtualenv name'/bin/activate

# To deactivate (in terminal, exit out of venv), do not use during setup
deactivate
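
To confirm that a shell is really running inside the venv, the interpreter's prefixes can be compared. This is a small sketch that relies on standard `venv` behavior (inside a venv, `sys.prefix` points at the venv folder while `sys.base_prefix` points at the system installation); the function name `in_virtualenv` is a hypothetical choice for illustration.

```python
import sys

# Hypothetical helper: True when the interpreter runs inside a venv.
# Inside a venv created with `python3 -m venv`, sys.prefix differs from
# sys.base_prefix; outside of one, the two are identical.
def in_virtualenv(prefix=None, base_prefix=None):
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix

if __name__ == "__main__":
    print("inside a venv" if in_virtualenv() else "using the system interpreter")
```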

Now that the virtualenv is activated, you can install packages that are isolated from your system and run scripts inside it.

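Install system packages and isolated packages in your venv

# System packages (installed system-wide through apt, not inside the venv)
sudo apt-get install -y eog python3-tk python3-yaml python3-pip ssh git

# With the venv activated, this command installs the Python packages listed in requirements.txt
pip3 install --trusted-host pypi.python.org -r requirements.txt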

Current workflow

Set up the repo

# activate virtualenv
source ./'virtualenv name'/bin/activate
# change directory to scalableMARL
cd ./scalableMARL
# setup repo  ***important in order to set PYTHONPATH***
source setup

scalableMARL repo is ready to go
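
Since sourcing setup exists to set PYTHONPATH, a quick check that a directory actually ended up on it can save debugging time when imports fail. The sketch below assumes that the repository root is the directory that should appear on PYTHONPATH, which this README does not state explicitly; `on_pythonpath` is a hypothetical helper name.

```python
import os

# Hypothetical helper: check whether `directory` is listed in PYTHONPATH.
# Which directory the setup file adds is an assumption; pass the one you expect.
def on_pythonpath(directory, pythonpath=None):
    raw = os.environ.get("PYTHONPATH", "") if pythonpath is None else pythonpath
    target = os.path.realpath(directory)
    # PYTHONPATH entries are separated by os.pathsep (":" on Linux).
    entries = [entry for entry in raw.split(os.pathsep) if entry]
    return any(os.path.realpath(entry) == target for entry in entries)

if __name__ == "__main__":
    # Run from the scalableMARL folder after `source setup`.
    print("current folder on PYTHONPATH:", on_pythonpath(os.getcwd()))
```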

Running an algorithm (for example maTT)

# it's best to run from the scalableMARL folder so that logging and saving are consistent
cd ./scalableMARL
# run the alg
python3 algos/maTT/run_script.py

# you can run the alg with different argument parameters. See within run_script for more options.
# for example
python3 algos/maTT/run_script.py --seed 0 --log_dir ./results/maTT --epochs 40
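
For readers unfamiliar with how such command-line options are usually handled, here is a minimal `argparse` sketch. It is not the parser from run_script.py: the flag names are taken from the commands in this README, while the types, defaults, the `train` mode name, and the `build_parser` function are all assumptions made for illustration.

```python
import argparse

def build_parser():
    # Hypothetical parser: flag names come from the commands in this README,
    # but every type, default, and choice below is an assumption.
    parser = argparse.ArgumentParser(description="Sketch of run_script-style options")
    parser.add_argument("--mode", choices=["train", "test"], default="train")
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--epochs", type=int, default=40)
    parser.add_argument("--render", type=int, default=0)
    parser.add_argument("--nb_test_eps", type=int, default=50)
    # Accepting --logdir as an alias of --log_dir is a choice of this sketch.
    parser.add_argument("--log_dir", "--logdir", dest="log_dir", default="./results/maTT")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.mode, args.seed, args.epochs, args.log_dir)
```

Check run_script.py itself for the authoritative list of options and their defaults.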

To test, evaluate, and render()

# for a general example 
python3 algos/maTT/run_script.py --mode test --render 1 --log_dir ./results/maTT/setTracking-v0_123456789/seed_0/ --nb_test_eps 50
# for a saved policy in saved_results
python3 algos/maTT/run_script.py --mode test --render 1 --log_dir ./saved_results/maTT/setTracking-v0_123456789/seed_0/

To see training curves

tensorboard --logdir ./results/maTT/setTracking-v0_123456789/
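
The example paths above suggest that each run is stored as a folder per environment and timestamp, with one seed_<n> subfolder per seed. Assuming that layout (it is inferred from the example paths, not documented here), a short helper can list the run folders to pass to --log_dir or tensorboard. The name `find_seed_dirs` is hypothetical.

```python
from pathlib import Path

# Hypothetical helper: list run folders of the form <results_root>/<run>/seed_<n>.
# The directory layout is inferred from the example paths in this README.
def find_seed_dirs(results_root):
    root = Path(results_root)
    # Sorted lexicographically by path, so seed_10 sorts before seed_2.
    return sorted(p for p in root.glob("*/seed_*") if p.is_dir())

if __name__ == "__main__":
    for run_dir in find_seed_dirs("./results/maTT"):
        print(run_dir)
```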

Citing scalableMARL

If you reference or use scalableMARL in your research, please cite:

@misc{hsu2021scalable,
      title={Scalable Reinforcement Learning Policies for Multi-Agent Control}, 
      author={Christopher D. Hsu and Heejin Jeong and George J. Pappas and Pratik Chaudhari},
      year={2021},
      eprint={2011.08055},
      archivePrefix={arXiv},
      primaryClass={cs.MA}
}
