This repo contains the implementation of the algorithm proposed in Off-Belief Learning, ICML 2021.

Overview

Off-Belief Learning

Introduction

This repo contains the implementation of the algorithm proposed in Off-Belief Learning, ICML 2021.

Environment Setup

We have been using pytorch-1.5.1, cuda-10.1, and cudnn-v7.6.5 in our development environment. Other settings may also work but we have not tested it extensively under different configurations. We also use conda/miniconda to manage environments.

There are known issues when using this repo with newer versions of pytorch, such as this illegal move issue.

conda create -n hanabi python=3.7
conda activate hanabi

# install pytorch 1.5.1
# note that newer versions may cause compilation issues
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html

# install other dependencies
pip install psutil

# install a newer cmake if the current version is < 3.15
conda install -c conda-forge cmake

To help cmake find the proper libraries (e.g. libtorch), please either add the following lines to your .bashrc, or add it to a separate file and source it before you start working on the project.

# activate the conda environment
conda activate hanabi

# set path
CONDA_PREFIX=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
export CPATH=${CONDA_PREFIX}/include:${CPATH}
export LIBRARY_PATH=${CONDA_PREFIX}/lib:${LIBRARY_PATH}
export LD_LIBRARY_PATH=${CONDA_PREFIX}/lib:${LD_LIBRARY_PATH}

# avoid tensor operation using all cpu cores
export OMP_NUM_THREADS=1

Finally, to compile this repo:

# under project root
mkdir build
cd build
cmake ..
make -j10

Code Structure

For an overview of how the training infrastructure, please refer to Figure 5 of the Off-Belief Learning paper.

hanabi-learning-environment is a modified version of the original HLE from Deepmind.

Notable modifications includes:

  1. Card knowledge part of the observation encoding is changed to v0-belief, i.e. card knowledge normalized by the remaining public card count.

  2. Functions to reset the game state with sampled hands.

rela (REinforcement Learning Assemly) is a set of tools for efficient batched neural network inference written in C++ with multi-threading.

rlcc implements the core of various algorithms. For example, the logic of fictitious transitions are implemented in r2d2_actor.cc. It also contains implementations of baselines such as other-play, VDN and IQL.

pyhanabi is the main entry point of the repo. It contains implementations for Q-network, recurrent DQN training, belief network and training, as well as some tools to analyze trained models.

Run the Code

Please refer to the README in pyhanabi for detailed instruction on how to train a model.

Download Models

To download the trained models used in the paper, go to models folder and run

sh download.sh

Due to agreement with BoardGameArena and Facebook policies, we are unable to release the "Clone Bot" models trained on the game data nor the datasets themselves.

Copyright

Copyright (c) Facebook, Inc. and its affiliates. All rights reserved.

This source code is licensed under the license found in the LICENSE file in the root directory of this source tree.

Owner
Facebook Research
Facebook Research
A library for efficient similarity search and clustering of dense vectors.

Faiss Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any

Meta Research 18.8k Jan 08, 2023
🄈78th place in Riiid Answer Correctness Prediction competition

Riiid Answer Correctness Prediction Introduction This repository is the code that placed 78th in Riiid Answer Correctness Prediction competition. Requ

Jungwoo Park 10 Jul 14, 2022
sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code

sequitur sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code. It implements three differ

Jonathan Shobrook 305 Dec 21, 2022
3rd place solution for the Weather4cast 2021 Stage 1 Challenge

weather4cast2021_Stage1 3rd place solution for the Weather4cast 2021 Stage 1 Challenge Dependencies The code can be executed from a fresh environment

5 Aug 14, 2022
A large-scale benchmark for co-optimizing the design and control of soft robots, as seen in NeurIPS 2021.

Evolution Gym A large-scale benchmark for co-optimizing the design and control of soft robots. As seen in Evolution Gym: A Large-Scale Benchmark for E

121 Dec 14, 2022
Emulation and Feedback Fuzzing of Firmware with Memory Sanitization

BaseSAFE This repository contains the BaseSAFE Rust APIs, introduced by "BaseSAFE: Baseband SAnitized Fuzzing through Emulation". The example/ directo

Security in Telecommunications 138 Dec 16, 2022
IntelĀ® Neural Compressor is an open-source Python library running on Intel CPUs and GPUs

IntelĀ® Neural Compressor targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep l

Intel Corporation 846 Jan 04, 2023
An example of Scatterbrain implementation (combining local attention and Performer)

An example of Scatterbrain implementation (combining local attention and Performer)

HazyResearch 97 Jan 02, 2023
SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab

CORNELLSASLAB SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab Instructions: This python code can be used to convert SAS out

2 Jan 26, 2022
Colossal-AI: A Unified Deep Learning System for Large-Scale Parallel Training

ColossalAI An integrated large-scale model training system with efficient parallelization techniques. arXiv: Colossal-AI: A Unified Deep Learning Syst

HPC-AI Tech 7.9k Jan 08, 2023
A machine learning project which can detect and predict the skin disease through image recognition.

ML-Project-2021 A machine learning project which can detect and predict the skin disease through image recognition. The dataset used for this is the H

Debshishu Ghosh 1 Jan 13, 2022
An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

170.1k Jan 04, 2023
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
CVPR2022 paper "Dense Learning based Semi-Supervised Object Detection"

[CVPR2022] DSL: Dense Learning based Semi-Supervised Object Detection DSL is the first work on Anchor-Free detector for Semi-Supervised Object Detecti

Bhchen 69 Dec 08, 2022
[CVPR 2021] 'Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator'

[CVPR2021] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator Overview This is the entire codebase for the paper

35 Dec 01, 2022
Compact Bidirectional Transformer for Image Captioning

Compact Bidirectional Transformer for Image Captioning Requirements Python 3.8 Pytorch 1.6 lmdb h5py tensorboardX Prepare Data Please use git clone --

YE Zhou 19 Dec 12, 2022
Inhomogeneous Social Recommendation with Hypergraph Convolutional Networks

Inhomogeneous Social Recommendation with Hypergraph Convolutional Networks This is our Pytorch implementation for the paper: Zirui Zhu, Chen Gao, Xu C

Zirui Zhu 3 Dec 30, 2022
[BMVC 2021] Official PyTorch Implementation of Self-supervised learning of Image Scale and Orientation Estimation

Self-Supervised Learning of Image Scale and Orientation Estimation (BMVC 2021) This is the official implementation of the paper "Self-Supervised Learn

Jongmin Lee 17 Nov 10, 2022
[ICCV 2021] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation

ADDS-DepthNet This is the official implementation of the paper Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation I

LIU_LINA 52 Nov 24, 2022
This is the official code for the paper "Learning with Nested Scene Modeling and Cooperative Architecture Search for Low-Light Vision"

RUAS This is the official code for the paper "Learning with Nested Scene Modeling and Cooperative Architecture Search for Low-Light Vision" A prelimin

Vision & Optimization Group (VOG) 2 May 05, 2022