Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of Offline Reinforcement Learning with Implicit Q-Learning by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

See instructions for CUDA.

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py

Misc

The implementation is based on JAXRL.

Offline Reinforcement Learning with Implicit Q-Learning

Related tags

Overview

Offline Reinforcement Learning with Implicit Q-Learning

How to run the code

Install dependencies

Run training

Misc

Owner

Ilya Kostrikov

Repository features UNet inspired architecture used for segmenting lungs on chest X-Ray images

This script runs neural style transfer against the provided content image.

CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network (ICCV 2021).

This code is 3d-CNN model that can predict environmental value

Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption

Source code for our paper "Empathetic Response Generation with State Management"

CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).

Official code repository for the EMNLP 2021 paper

This repo is duplication of jwyang/faster-rcnn.pytorch

Text-Based Ideal Points

[ WSDM '22 ] On Sampling Collaborative Filtering Datasets

A Python package to create, run, and post-process MODFLOW-based models.

Repository for the paper : Meta-FDMixup: Cross-Domain Few-Shot Learning Guided byLabeled Target Data

Weakly Supervised Segmentation with Tensorflow. Implements instance segmentation as described in Simple Does It: Weakly Supervised Instance and Semantic Segmentation, by Khoreva et al. (CVPR 2017).

Code for the paper "On the Power of Edge Independent Graph Models"

📚 Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

Source code for TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation".