Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Last update: Nov 22, 2022

This is the official repository for Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning. We provide the commands to run the PETS and PlaNet experiments included in the paper. This repository is made minimal for ease of experimentation.

Installation

This repository requires Python 3.6 and PyTorch 1.3 or above. Run the following command to create a conda environment (tested with CUDA 10.2):

conda env create -f environment.yml

Experiments

To run the PETS experiments on the HalfCheetah environment used in our ablation study, run:

cd cap-pets

CAP

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --learn_kappa --seed 1

CAP with fixed kappa

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --kappa 1.0 --seed 1

CCEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --seed 1

CEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--seed 1
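
The four PETS variants above share the same base command and differ only in the trailing flags. As a minimal sketch, assuming a POSIX shell, the mapping can be written down once and the commands printed for inspection. The helper names `pets_flags` and `pets_cmd` are hypothetical and not part of this repository, and the script only echoes the commands rather than running them.

```shell
# Hypothetical helper (not part of the repository): map a variant name to the
# flags that distinguish it in the commands above.
pets_flags() {
  case "$1" in
    cap)       echo "--cost_constrained --penalize_uncertainty --learn_kappa" ;;
    cap-fixed) echo "--cost_constrained --penalize_uncertainty --kappa 1.0" ;;
    ccem)      echo "--cost_constrained" ;;
    cem)       echo "" ;;
    *)         echo "unknown variant: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the full command for a given variant and seed.
pets_cmd() {
  flags=$(pets_flags "$1") || return 1
  # ${flags:+ $flags} inserts a leading space only when flags is non-empty.
  echo "python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152${flags:+ $flags} --seed $2"
}

for variant in cap cap-fixed ccem cem; do
  pets_cmd "$variant" 1
done
```

Piping the output to `sh` would execute the printed commands in sequence; reviewing them first makes it easy to confirm that each variant carries the intended flags.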

The commands for the PlaNet experiment on the CarRacing environment are:

CAP

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--learn-kappa --penalty-kappa 0.1 \
--id CarRacing-cap --seed 1

CAP with fixed kappa

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--penalty-kappa 1.0 \
--id CarRacing-kappa1 --seed 1

CCEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained \
--id CarRacing-ccem --seed 1

CEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--id CarRacing-cem --seed 1
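
To repeat a PlaNet run over several seeds, the CAP command above can be wrapped in a small loop. The following is a sketch under stated assumptions: the helper `planet_cap_cmd` is hypothetical, seeds 2 and 3 are illustrative, and the per-seed suffix on `--id` assumes that `--id` only labels the run. The script prints the commands instead of launching them.

```shell
# Hypothetical helper (not part of the repository): print the PlaNet CAP
# command for one seed.
# Assumption: --id only labels the run, so a per-seed suffix keeps runs apart.
planet_cap_cmd() {
  echo "python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0" \
       "--cost-limit 0 --binary-cost --cost-constrained --penalize-uncertainty" \
       "--learn-kappa --penalty-kappa 0.1 --id CarRacing-cap-s$1 --seed $1"
}

# Seeds other than 1 are illustrative; only seed 1 appears in the commands above.
for seed in 1 2 3; do
  planet_cap_cmd "$seed"
done
```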

Contact

If you have any questions regarding the code or paper, feel free to contact [email protected] or open an issue on this repository.

Acknowledgement

This repository contains code adapted from the following repositories: PETS and PlaNet. We thank the authors and contributors for open-sourcing their code.
