Multi-Objective Reinforced Active Learning

Last update: Nov 19, 2022

Related tags

Deep Learning moral_rl

Overview

Dependencies

wandb
tqdm
pytorch >= 1.7.0
numpy >= 1.20.0
scipy >= 1.1.0
pycolab == 1.2

Weights and Biases

Our code depends on Weights and Biases (wandb) for visualizing and logging results during training. As a result, we call wandb.init(), which prompts for an API key to link the training runs with your personal wandb account. You can provide it by pasting your WANDB_API_KEY into the prompt when running the code for the first time.

Environments

Our gridworlds (Emergency: randomized_v2.py, Delivery: randomized_v3.py) build on the pycolab game engine, with a custom wrapper that provides functionality similar to that of gym. The engine comes with a user interface, and any environment can be played in the console by running python environment.py, with the arrow keys and w, a, s, d as controls.
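To illustrate what a gym-style wrapper around a game engine can look like, here is a minimal sketch. It does not use pycolab and is not the wrapper from this repository: ToyEngine, GymStyleWrapper, and the action mapping are hypothetical stand-ins chosen only to show the reset()/step() pattern.

```python
class ToyEngine:
    """Hypothetical stand-in for a game engine: an agent on a 1-D strip."""

    def __init__(self, width=5):
        self.width = width
        self.pos = 0

    def start(self):
        # Begin a new episode with the agent in the leftmost cell.
        self.pos = 0
        return self.pos

    def play(self, move):
        # Move left (-1) or right (+1), clipped to the strip boundaries.
        self.pos = min(max(self.pos + move, 0), self.width - 1)
        game_over = self.pos == self.width - 1
        reward = 1.0 if game_over else 0.0
        return self.pos, reward, game_over


class GymStyleWrapper:
    """Exposes the engine through reset() and step(action), as gym does."""

    ACTIONS = {0: -1, 1: +1}  # assumed mapping: 0 = left, 1 = right

    def __init__(self, engine):
        self.engine = engine

    def reset(self):
        return self.engine.start()

    def step(self, action):
        obs, reward, done = self.engine.play(self.ACTIONS[action])
        return obs, reward, done, {}  # empty info dict, following the gym convention


env = GymStyleWrapper(ToyEngine(width=3))
obs = env.reset()
obs, reward, done, info = env.step(1)
obs, reward, done, info = env.step(1)
```

The benefit of this pattern is that training code written against reset() and step() does not need to know which engine runs underneath.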

Training

There are four training scripts for

manually training a PPO agent on custom rewards (ppo_train.py),
training AIRL on a single expert dataset (airl_train.py),
active MORL with custom/automatic preferences (moral_train.py) and
training DRLHP with custom/automatic preferences (drlhp_train.py).

When using automatic preferences, a desired ratio can be passed as an argument. For example,

python moral_train.py --ratio a b c

will run MORAL using a (real-valued) ratio of a:b:c among the three explicit objectives in Delivery.
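As a rough sketch of how a three-valued --ratio argument could be read and turned into preference weights, consider the following. The helper names parse_ratio and normalize are hypothetical, and normalizing the ratio to sum to one is an assumption made for illustration, not necessarily what moral_train.py does.

```python
import argparse


def parse_ratio(argv=None):
    """Read three real-valued numbers from --ratio (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Sketch of --ratio parsing")
    parser.add_argument(
        "--ratio",
        nargs=3,
        type=float,
        metavar=("A", "B", "C"),
        required=True,
        help="desired ratio a:b:c among the three objectives",
    )
    return parser.parse_args(argv).ratio


def normalize(ratio):
    """Scale a non-negative ratio so that its entries sum to one (assumption)."""
    total = sum(ratio)
    if total <= 0 or any(r < 0 for r in ratio):
        raise ValueError("ratio entries must be non-negative and not all zero")
    return [r / total for r in ratio]


# Equivalent to passing "--ratio 3 1 1" on the command line.
weights = normalize(parse_ratio(["--ratio", "3", "1", "1"]))
print(weights)
```

Under this assumption, only the relative sizes matter: a ratio of 3:1:1 and a ratio of 6:2:2 yield the same weights.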

Hyperparameters

Hyperparameters are passed as arguments to wandb.init() and can be changed by modifying the respective training files.
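The pattern described above might look roughly like the sketch below. The hyperparameter names and values, the project name, and the helpers make_config and start_run are hypothetical and not taken from the training files; only wandb.init(project=..., config=...) and wandb.config are actual wandb calls.

```python
# Hypothetical defaults; the real training files define their own values.
DEFAULTS = {"lr": 3e-4, "gamma": 0.99, "n_workers": 12}


def make_config(**overrides):
    """Return the defaults with selected hyperparameters replaced."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown hyperparameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def start_run(config, project="moral-sketch"):
    """Start a wandb run that records the given hyperparameters."""
    import wandb  # imported lazily so make_config works without wandb installed

    wandb.init(project=project, config=config)
    return wandb.config


config = make_config(lr=1e-4)
print(config)
```

Calling start_run(config) would then log the chosen values with the run, so each result on the wandb dashboard can be traced back to its hyperparameters.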


Owner

Markus Peschl
