OpenAi's gym environment wrapper to vectorize them with Ray

Overview

Ray Vector Environment Wrapper

You would like to use Ray to vectorize your environment but you don't want to use RLLib ?
You came to the right place !

This package allows you to parallelize your environment using Ray
Not only does it allows to run environments in parallel, but it also permits to run multiple sequential environments on each worker
For example, you can run 80 workers in parallel, each running 10 sequential environments for a total of 80 * 10 environments
This can be useful if your environment is fast and simply running 1 environment per worker leads to too much communication overhead between workers

Installation

pip install RayEnvWrapper

If something went wrong, it most certainly is because of Ray
For example, you might have issue installing Ray on Apple Silicon (i.e., M1) laptop. See Ray's documentation for a simple fix
At the moment Ray does not support Python 3.10. This package has been tested with Python 3.9.

How does it work?

You first need to define a function that seed and return your environment:

Here is an example for CartPole:

import gym

def make_and_seed(seed: int) -> gym.Env:
    env = gym.make('CartPole-v0')
    env = gym.wrappers.RecordEpisodeStatistics(env) # you can put extra wrapper to your original environment
    env.seed(seed)
    return env

Note: If you don't want to seed your environment, simply return it without using the seed, but the function you define needs to take a number as an input

Then, call the wrapper to create and wrap all the vectorized environment:

from RayEnvWrapper import WrapperRayVecEnv

number_of_workers = 4 # Usually, this is set to the number of CPUs in your machine
envs_per_worker = 2

vec_env = WrapperRayVecEnv(make_and_seed, number_of_workers, envs_per_worker)

You can then use your environment. All the output for each of the environments are stacked in a numpy array

Reset:

vec_env.reset()

Output

[[ 0.03073904  0.00145001 -0.03088818 -0.03131252]
 [ 0.03073904  0.00145001 -0.03088818 -0.03131252]
 [ 0.02281231 -0.02475473  0.02306162  0.02072129]
 [ 0.02281231 -0.02475473  0.02306162  0.02072129]
 [-0.03742824 -0.02316945  0.0148571   0.0296055 ]
 [-0.03742824 -0.02316945  0.0148571   0.0296055 ]
 [-0.0224773   0.04186813 -0.01038048  0.03759079]
 [-0.0224773   0.04186813 -0.01038048  0.03759079]]

The i-th entry represent the initial observation of the i-th environment
Note: As environments are vectorized, you don't need explicitly to reset the environment at the end of the episode, it is done automatically However, you need to do it once at the beginning

Take a random action:

vec_env.step([vec_env.action_space.sample() for _ in range(number_of_workers * envs_per_worker)])

Notice how the actions are passed. We pass an array containing an action for each of the environments
Thus, the array is of size number_of_workers * envs_per_worker (i.e., the total number of environments)

Output

(array([[ 0.03076804, -0.19321568, -0.03151444,  0.25146705],
       [ 0.03076804, -0.19321568, -0.03151444,  0.25146705],
       [ 0.02231721, -0.22019969,  0.02347605,  0.3205903 ],
       [ 0.02231721, -0.22019969,  0.02347605,  0.3205903 ],
       [-0.03789163, -0.21850128,  0.01544921,  0.32693872],
       [-0.03789163, -0.21850128,  0.01544921,  0.32693872],
       [-0.02163994, -0.15310344, -0.00962866,  0.3269806 ],
       [-0.02163994, -0.15310344, -0.00962866,  0.3269806 ]],
      dtype=float32), 
 array([1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32), 
 array([False, False, False, False, False, False, False, False]), 
 [{}, {}, {}, {}, {}, {}, {}, {}])

As usual, the step method returns a tuple, except that here both the observation, reward, dones and infos are concatenated
In this specific example, we have 2 environments per worker.
Index 0 and 1 are environments from worker 1; index 1 and 2 are environments from worker 2, etc.

License

Apache License 2.0

You might also like...
A
A "gym" style toolkit for building lightweight Neural Architecture Search systems

A "gym" style toolkit for building lightweight Neural Architecture Search systems

Customizable RecSys Simulator for OpenAI Gym
Customizable RecSys Simulator for OpenAI Gym

gym-recsys: Customizable RecSys Simulator for OpenAI Gym Installation | How to use | Examples | Citation This package describes an OpenAI Gym interfac

Robot Servers and Server Manager software for robo-gym

robo-gym-server-modules Robot Servers and Server Manager software for robo-gym. For info on how to use this package please visit the robo-gym website

Deep Q Learning with OpenAI Gym and Pokemon Showdown

pokemon-deep-learning An openAI gym project for pokemon involving deep q learning. Made by myself, Sam Little, and Layton Webber. This code captures g

Manipulation OpenAI Gym environments to simulate robots at the STARS lab

Manipulator Learning This repository contains a set of manipulation environments that are compatible with OpenAI Gym and simulated in pybullet. In par

AI virtual gym is an AI program which can be used to exercise and can be used to see if we are doing the exercises

AI virtual gym is an AI program which can be used to exercise and can be used to see if we are doing the exercises

Multi-objective gym environments for reinforcement learning.
Multi-objective gym environments for reinforcement learning.

MO-Gym: Multi-Objective Reinforcement Learning Environments Gym environments for multi-objective reinforcement learning (MORL). The environments follo

Pytorch Lightning Distributed Accelerators using Ray

Distributed PyTorch Lightning Training on Ray This library adds new PyTorch Lightning accelerators for distributed training using the Ray distributed

Pytorch Lightning Distributed Accelerators using Ray

Distributed PyTorch Lightning Training on Ray This library adds new PyTorch Lightning plugins for distributed training using the Ray distributed compu

Comments
  • envs_per_worker

    envs_per_worker

    Hi!@ingambe. Thank you very much for your work! I have some questions. What does the "worker and envs" mean here? My understanding is as follows:

    • Worker represents a process. Two env in a worker belong to two threads.

    I don't know if I understand this correctly. Thanks! image

    opened by Meta-YZ 2
  • how to wrap two DIFFERENT environments?

    how to wrap two DIFFERENT environments?

    Thank you for upload the package. My question is is there a way to stack different environments together? For example I have ten or hundreds different race track environments and I want to train an agent simultaneously drive through this vectorized environment. In stable baseline I can stack them together and train a vectorized environment. Now I want to move to ray and try to speed up the training by using multiple gpu...but so far didn't figure out how to do this. Thanks in advance

    enhancement 
    opened by superfan123 1
Releases(v1.0)
Owner
Pierre TASSEL
Pierre TASSEL
A simple implementation of Kalman filter in Multi Object Tracking

kalman Filter in Multi-object Tracking A simple implementation of Kalman filter in Multi Object Tracking 本实现是在https://github.com/liuchangji/kalman-fil

124 Dec 29, 2022
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Action-Based Conversations Dataset (ABCD) This respository contains the code and data for ABCD (Chen et al., 2021) Introduction Whereas existing goal-

ASAPP Research 49 Oct 09, 2022
EssentialMC2 Video Understanding

EssentialMC2 Introduction EssentialMC2 is a complete system to solve video understanding tasks including MHRL(representation learning), MECR2( relatio

Alibaba 106 Dec 11, 2022
Code for generating the figures in the paper "Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views?"

Code for running simulations for the paper "Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Lin

Matthew Farrell 1 Nov 22, 2022
OpenIPDM is a MATLAB open-source platform that stands for infrastructures probabilistic deterioration model

Open-Source Toolbox for Infrastructures Probabilistic Deterioration Modelling OpenIPDM is a MATLAB open-source platform that stands for infrastructure

CIVML 0 Jan 20, 2022
Vector Quantization, in Pytorch

Vector Quantization - Pytorch A vector quantization library originally transcribed from Deepmind's tensorflow implementation, made conveniently into a

Phil Wang 665 Jan 08, 2023
torchbearer: A model fitting library for PyTorch

Note: We're moving to PyTorch Lightning! Read about the move here. From the end of February, torchbearer will no longer be actively maintained. We'll

632 Dec 13, 2022
Code to run experiments in SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression.

Code to run experiments in SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression. Not an official Google product. Me

Google Research 27 Dec 12, 2022
Multiple Object Tracking with Yolov5!

Tracking with yolov5 This implementation is for who need to tracking multi-object only with detector. You can easily track mult-object with your well

9 Nov 08, 2022
SIMULEVAL A General Evaluation Toolkit for Simultaneous Translation

SimulEval SimulEval is a general evaluation framework for simultaneous translation on text and speech. Requirement python = 3.7.0 Installation git cl

Facebook Research 48 Dec 28, 2022
A parallel framework for population-based multi-agent reinforcement learning.

MALib: A parallel framework for population-based multi-agent reinforcement learning MALib is a parallel framework of population-based learning nested

MARL @ SJTU 348 Jan 08, 2023
ComputerVision - This repository aims at realized easy network architecture

ComputerVision This repository aims at realized easy network architecture Colori

DongDong 4 Dec 14, 2022
This repository contains an implementation of the Permutohedral Attention Module in Pytorch

Permutohedral_attention_module This repository contains an implementation of the Permutohedral Attention Module

Samuel JOUTARD 26 Nov 27, 2022
Code for "Layered Neural Rendering for Retiming People in Video."

Layered Neural Rendering in PyTorch This repository contains training code for the examples in the SIGGRAPH Asia 2020 paper "Layered Neural Rendering

Google 154 Dec 16, 2022
University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN

Music-Sentiment-Transfer University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN Poster: Music Sentiment Transfer

Miles Sigel 2 Jan 24, 2022
Tensorflow2 Keras-based Semantic Segmentation Models Implementation

Tensorflow2 Keras-based Semantic Segmentation Models Implementation

Hah Min Lew 1 Feb 08, 2022
Teaching end to end workflow of deep learning

Deep-Education This repository is now available for public use for teaching end to end workflow of deep learning. This implies that learners/researche

Data Lab at College of William and Mary 2 Sep 26, 2022
WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

Chair for Sys­tems Se­cu­ri­ty 27 Dec 22, 2022
Code for our WACV 2022 paper "Hyper-Convolution Networks for Biomedical Image Segmentation"

Hyper-Convolution Networks for Biomedical Image Segmentation Code for our WACV 2022 paper "Hyper-Convolution Networks for Biomedical Image Segmentatio

Tianyu Ma 17 Nov 02, 2022
Experiments on Flood Segmentation on Sentinel-1 SAR Imagery with Cyclical Pseudo Labeling and Noisy Student Training

Flood Detection Challenge This repository contains code for our submission to the ETCI 2021 Competition on Flood Detection (Winning Solution #2). Acco

Siddha Ganju 108 Dec 28, 2022