Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of Offline Reinforcement Learning with Implicit Q-Learning by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

For GPU support, see the JAX installation instructions for CUDA.

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
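
The three commands above differ only in the environment name, the config file, and (for AntMaze) two evaluation flags. As a minimal sketch, assuming nothing beyond what those commands show, the pairing can be written down as a small table and turned back into command lines. The helper `build_command` and the `EXAMPLES` table are hypothetical and are not part of this repository; only the flags and paths that appear in the commands above are used.

```python
import shlex

# Task family -> (example environment, config file, extra flags).
# Copied from the three commands in this README; nothing is read from the
# repository itself, so treat the table as illustrative.
EXAMPLES = {
    "locomotion": ("halfcheetah-medium-expert-v2", "configs/mujoco_config.py", {}),
    "antmaze": (
        "antmaze-large-play-v0",
        "configs/antmaze_config.py",
        {"eval_episodes": 100, "eval_interval": 100000},
    ),
    "kitchen_adroit": ("pen-human-v0", "configs/kitchen_config.py", {}),
}


def build_command(family, env_name=None):
    """Return the argv list for one training run (hypothetical helper)."""
    default_env, config, extra = EXAMPLES[family]
    argv = [
        "python",
        "train_offline.py",
        f"--env_name={env_name or default_env}",
        f"--config={config}",
    ]
    # Append the family-specific flags in the order they are listed above.
    argv += [f"--{key}={value}" for key, value in extra.items()]
    return argv


if __name__ == "__main__":
    # Print one shell-ready command per task family.
    for family in EXAMPLES:
        print(shlex.join(build_command(family)))
```

Passing a different `env_name` reuses the family's config file; whether a given dataset name works with that config is an assumption to check against the repository.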

Misc

The implementation is based on JAXRL.
