Multi-Stage Episodic Control for Strategic Exploration in Text Games

Last update: May 24, 2022

Overview

XTX: eXploit - Then - eXplore

Requirements

First clone this repo using git clone https://github.com/princeton-nlp/XTX.git

Please create two conda environments as follows:

conda env create -f yml_envs/jericho-wt.yml
a. conda activate jericho-wt
b. pip install git+https://github.com/jens321/[email protected]
conda env create -f yml_envs/jericho-no-wt.yml

The first set of commands will create a conda environment called jericho-wt which has added actions to the game grammar for specific games (see games with * in the paper). The second command will create another conda environment called jericho-no-wt which installs an unmodified version of the Jericho library.

Training

All code can be run from the root folder of this project. Please follow the commands below for each specific model:

XTX: sh scripts/run_xtx.sh
XTX (no-mix): sh scripts/run_xtx_no_mix.sh
XTX (uniform): sh scrtips/run_xtx_uniform.sh
XTX ($\lambda$ = 0, 0.5, or 1): sh scripts/run_xtx_ablation.sh
INV DY: sh scripts/run_inv_dy.sh
DRRN: sh scripts/run_drrn.sh

Notes

You can use analysis/sample_env.py for quickly playing around with a sample Jericho environment. Run it using python3 -m analysis.sample_env.
You can use analysis/augment_wt.py for generating the missing action candidates that can be added to the game grammar (games with * in the paper). Run it using python3 -m analysis.augment_wt.
Note that all models should finish within a day or two given 1 gpu and 8 cpus, except for games where Jericho's valid action handicap is slow (e.g. Library, Dragon). Since Jericho's valid action handicap heavily relies on parallelization, increasing the number of cpus also results in good speedups (e.g. 8 -> 16).

Acknowledgements

We used Weights & Biases for experiment tracking and visualizations to develop insights for this paper.

Some of the code borrows from the TDQN repo.

For any questions please contact Jens Tuyls ([email protected]).

Multi-Stage Episodic Control for Strategic Exploration in Text Games

Related tags

Overview

XTX: eXploit - Then - eXplore

Requirements

Training

Notes

Acknowledgements

Owner

Princeton Natural Language Processing

This repository stores the code to reproduce the results published in "TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML scenarios"

Implementation of TimeSformer, a pure attention-based solution for video classification

Implementation of popular bandit algorithms in batch environments.

For encoding a text longer than 512 tokens, for example 800. Set max_pos to 800 during both preprocessing and training.

salabim - discrete event simulation in Python

Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Probabilistic Cross-Modal Embedding (PCME) CVPR 2021

For IBM Quantum Challenge Africa 2021, 9 September (07:00 UTC) - 20 September (23:00 UTC).

The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

Reinfore learning tool box, contains trpo, a3c algorithm for continous action space

The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!

This is the implementation of the paper "Self-supervised Outdoor Scene Relighting"

Repo for code associated with Modeling the Mitral Valve.

make ASCII Art by Deep Learning

[SIGIR22] Official PyTorch implementation for "CORE: Simple and Effective Session-based Recommendation within Consistent Representation Space".

Code for weakly supervised segmentation of a single class

[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting

CR-Fill: Generative Image Inpainting with Auxiliary Contextual Reconstruction. ICCV 2021

Code for "SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in Deep Latent Space"