Active Offline Policy Selection With Python

Overview

Active Offline Policy Selection

This is supporting example code for NeurIPS 2021 paper Active Offline Policy Selection by Ksenia Konyushkova*, Yutian Chen*, Tom Le Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas.

To simulate the active offline policy selection for a set of policies, one needs to provide a number of files. We provide the files for 76 policies on cartpole_swingup environemnt.

  1. Sampled episodic returns for all policies on a number of evalauation episodes (full-reward-samples-dict.pkl), or a way of sampling a new episode of evaluation upon request for any policy. The file full-reward-samples-dict.pkl contains a dictionary that maps a policy by its string representation to a numpy.ndarray of of shape (5000,) (number of reward samples).

  2. Off-policy evaluation score, such as fitted Q-evaluation (FQE) for all policies (ope_values.pkl). The file ope_values.pkl contains dictionary that maps policy info into OPE estimates. We provide FQE scores for the policies.

  3. Actions that policies take on 1000 randomly sampled states from the offline dataset (actions.pkl). The file actions.pkl contains a dictionary with keys actions and policy_keys. actions is a list of 1000 ( number of states used to compute the kernel) elements of numpy.ndarray type of dimensionality 76x1 (number of policies by the dimensionality of the actions). policy_keys contains a dictionary mapping from string representation of a policy to the index of that policy in actions.

Installation

To set up the virtual environment, run the following commands. From within the active_ops directory:

python3 -m venv active_ops_env
source active_ops_env/bin/activate

pip install --upgrade pip
pip install -r requirements.txt

To run the demo with colab, enable the jupyter_http_over_ws extension:

jupyter serverextension enable --py jupyter_http_over_ws

Finally, start a server:

jupyter notebook \
  --NotebookApp.allow_origin='https://colab.research.google.com' \
  --port=8888 \
  --NotebookApp.port_retries=0

Usage

To run the code refer to Active_ops_experiment.ipynb colab notebook. Execute blocks of code one by one to reproduce the final plot. You can modify various parameters maked by @param to test various baselines in modified settings. This code loads the example of data for cartpole_environment provided in the data folder. Using this data, we reproduce the results of Figure 14 of the paper.

Citing this work

@inproceedings{konyushkovachen2021aops,
    title = "Active Offline Policy Selection",
    author = "Ksenia Konyushkova, Yutian Chen, Tom Le Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas",
    booktitle = NeurIPS,
    year = 2021
}

Disclaimer

This is not an official Google product.

The datasets in this work are licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit [http://creativecommons.org/licenses/by/4.0/] (http://creativecommons.org/licenses/by/4.0/).

Owner
DeepMind
DeepMind
Baseline inference Algorithm for the STOIC2021 challenge.

STOIC2021 Baseline Algorithm This codebase contains an example submission for the STOIC2021 COVID-19 AI Challenge. As a baseline algorithm, it impleme

Luuk Boulogne 10 Aug 08, 2022
Trained on Simulated Data, Tested in the Real World

Trained on Simulated Data, Tested in the Real World

livox 43 Nov 18, 2022
PAWS 🐾 Predicting View-Assignments with Support Samples

This repo provides a PyTorch implementation of PAWS (predicting view assignments with support samples), as described in the paper Semi-Supervised Learning of Visual Features by Non-Parametrically Pre

Facebook Research 437 Dec 23, 2022
A PyTorch Implementation of FaceBoxes

FaceBoxes in PyTorch By Zisian Wong, Shifeng Zhang A PyTorch implementation of FaceBoxes: A CPU Real-time Face Detector with High Accuracy. The offici

Zi Sian Wong 797 Dec 17, 2022
[ECCV 2020] XingGAN for Person Image Generation

Contents XingGAN or CrossingGAN Installation Dataset Preparation Generating Images Using Pretrained Model Train and Test New Models Evaluation Acknowl

Hao Tang 218 Oct 29, 2022
The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".

Deep Exemplar-based Video Colorization (Pytorch Implementation) Paper | Pretrained Model | Youtube video šŸ”„ | Colab demo Deep Exemplar-based Video Col

Bo Zhang 253 Dec 27, 2022
Shitty gaze mouse controller

demo.mp4 shitty_gaze_mouse_cotroller install tensofflow, cv2 run the main.py and as it starts it will collect data so first raise your left eyebrow(bo

16 Aug 30, 2022
Code-free deep segmentation for computational pathology

NoCodeSeg: Deep segmentation made easy! This is the official repository for the manuscript "Code-free development and deployment of deep segmentation

AndrƩ Pedersen 26 Nov 23, 2022
Simulations for Turring patterns on an apically expanding domain. T

Turing patterns on expanding domain Simulations for Turring patterns on an apically expanding domain. The details about the models and numerical imple

Yue Liu 0 Aug 03, 2021
PyTorch implementation for MINE: Continuous-Depth MPI with Neural Radiance Fields

MINE: Continuous-Depth MPI with Neural Radiance Fields Project Page | Video PyTorch implementation for our ICCV 2021 paper. MINE: Towards Continuous D

Zijian Feng 325 Dec 29, 2022
Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL)

Scribble-Supervised LiDAR Semantic Segmentation Dataset and code release for the paper Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORA

102 Dec 25, 2022
Official repo for QHack—the quantum machine learning hackathon

Note: This repository has been frozen while we consider the submissions for the QHack Open Hackathon. We hope you enjoyed the event! Welcome to QHack,

Xanadu 118 Jan 05, 2023
Ansible Automation Example: JSNAPY PRE/POST Upgrade Validation

Ansible Automation Example: JSNAPY PRE/POST Upgrade Validation Overview This example will show how to validate the status of our firewall before and a

Calvin Remsburg 1 Jan 07, 2022
Visualize Camera's Pose Using Extrinsic Parameter by Plotting Pyramid Model on 3D Space

extrinsic2pyramid Visualize Camera's Pose Using Extrinsic Parameter by Plotting Pyramid Model on 3D Space Intro A very simple and straightforward modu

JEONG HYEONJIN 106 Dec 28, 2022
The versatile ocean simulator, in pure Python, powered by JAX.

Veros is the versatile ocean simulator -- it aims to be a powerful tool that makes high-performance ocean modeling approachable and fun. Because Veros

TeamOcean 245 Dec 20, 2022
Continual Learning of Long Topic Sequences in Neural Information Retrieval

ContinualPassageRanking Repository for the paper "Continual Learning of Long Topic Sequences in Neural Information Retrieval". In this repository you

0 Apr 12, 2022
PyTorch implementation of Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation.

ALiBi PyTorch implementation of Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation. Quickstart Clone this reposit

Jake Tae 4 Jul 27, 2022
The project of phase's key role in complex and real NN

Phase-in-NN This is the code for our project at Princeton (co-authors: Yuqi Nie, Hui Yuan). The paper title is: "Neural Network is heterogeneous: Phas

YuqiNie-lab 1 Nov 04, 2021
School of Artificial Intelligence at the Nanjing University (NJU)School of Artificial Intelligence at the Nanjing University (NJU)

F-Principle This is an exercise problem of the digital signal processing (DSP) course at School of Artificial Intelligence at the Nanjing University (

Thyrix 5 Nov 23, 2022
The repo contains the code to train and evaluate a system which extracts relations and explanations from dialogue.

The repo contains the code to train and evaluate a system which extracts relations and explanations from dialogue. How do I cite D-REX? For now, cite

Alon Albalak 6 Mar 31, 2022