DrQ-v2: Improved Data-Augmented Reinforcement Learning

Last update: Jan 01, 2023

Overview

DrQ-v2: Improved Data-Augmented RL Agent

Method

DrQ-v2 is a model-free off-policy algorithm for image-based continuous control. DrQ-v2 builds on DrQ, an actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements including:

Switch the base RL learner from SAC to DDPG.
Incorporate n-step returns to estimate TD error.
Introduce a decaying schedule for exploration noise.
Make the implementation 3.5 times faster.
Find better hyper-parameters.
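
As an illustration of the n-step return item above, here is a minimal sketch of how an n-step TD target could be computed. It is not taken from this repository: the function names `n_step_return` and `td_error`, their arguments, and the example numbers are all hypothetical, and the bootstrap value stands in for whatever the critic predicts at the state reached after n steps.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float, discount: float) -> float:
    """Discounted sum of n rewards plus a discounted bootstrap value.

    G = r_0 + d * r_1 + ... + d^(n-1) * r_(n-1) + d^n * bootstrap_value
    where d is the discount factor and n = len(rewards).
    """
    target = bootstrap_value
    # Fold from the last reward backwards so that each step applies
    # exactly one more factor of the discount.
    for reward in reversed(rewards):
        target = reward + discount * target
    return target


def td_error(q_estimate: float, rewards: Sequence[float],
             bootstrap_value: float, discount: float) -> float:
    """TD error: n-step target minus the current Q estimate."""
    return n_step_return(rewards, bootstrap_value, discount) - q_estimate


if __name__ == "__main__":
    # Hypothetical numbers: three rewards of 1.0, a bootstrap value of 10.0,
    # and a discount of 0.5 (chosen so the arithmetic is easy to check).
    print(n_step_return([1.0, 1.0, 1.0], 10.0, 0.5))  # 3.0
    print(td_error(2.0, [1.0, 1.0, 1.0], 10.0, 0.5))  # 1.0
```

With a single reward this reduces to the usual one-step target; longer reward sequences rely more on observed rewards and less on the bootstrap estimate.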

These changes allow us to significantly improve sample efficiency and wall-clock training time on a set of challenging tasks from the DeepMind Control Suite compared to prior methods. Furthermore, DrQ-v2 is able to solve complex humanoid locomotion tasks directly from pixel observations, previously unattained by model-free RL.
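
Of the changes listed above, the decaying exploration noise schedule is the simplest to sketch in isolation. The example below assumes a linear decay and Gaussian action noise clipped to a fixed range. The text above does not specify the shape of the schedule, so the names `linear_decay` and `noisy_action`, and all numeric values, are hypothetical.

```python
import random
from typing import List, Sequence


def linear_decay(step: int, initial: float, final: float, duration: int) -> float:
    """Interpolate linearly from `initial` to `final` over `duration` steps, then hold `final`."""
    fraction = min(max(step / duration, 0.0), 1.0)
    return (1.0 - fraction) * initial + fraction * final


def noisy_action(action: Sequence[float], stddev: float, rng: random.Random,
                 low: float = -1.0, high: float = 1.0) -> List[float]:
    """Add zero-mean Gaussian noise to each action dimension and clip to [low, high]."""
    return [min(max(a + rng.gauss(0.0, stddev), low), high) for a in action]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical schedule: standard deviation decays from 1.0 to 0.1 over 100 steps.
    for step in (0, 50, 100, 200):
        stddev = linear_decay(step, initial=1.0, final=0.1, duration=100)
        print(step, stddev, noisy_action([0.0, 0.5], stddev, rng))
```

Early in training the larger standard deviation encourages broad exploration; once the schedule reaches its final value, actions stay close to the policy output.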

Citation

If you use this repo in your research, please consider citing the paper as follows:

@article{yarats2021drqv2,
  title={Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning},
  author={Denis Yarats and Rob Fergus and Alessandro Lazaric and Lerrel Pinto},
  journal={arXiv preprint arXiv:},
  year={2021}
}

Instructions

Install dependencies:

conda env create -f conda_env.yml
conda activate drqv2

Train the agent:

python train.py task=quadruped_walk

Monitor results:

tensorboard --logdir exp_local

License

The majority of DrQ-v2 is licensed under the MIT license; however, portions of the project are available under separate license terms: DeepMind is licensed under the Apache 2.0 license.

Owner

Facebook Research
