Planning from Pixels in Environments with Combinatorially Hard Search Spaces -- NeurIPS 2021

PPGS: Planning from Pixels in Environments with Combinatorially Hard Search Spaces

Environment Setup

  • We recommend pipenv for creating and managing virtual environments (dependencies for other environment managers can be found in Pipfile)
git clone https://github.com/martius-lab/PPGS
cd PPGS
pipenv install
pipenv shell
  • For simplicity, this codebase is ready for training on two of the three environments (IceSlider and DigitJump). Both are part of the puzzlegen package (https://github.com/martius-lab/puzzlegen), which can be installed with
pip install git+https://github.com/martius-lab/puzzlegen
  • Offline datasets can be generated for training and validation. In the case of IceSlider we can use
python -m puzzlegen.extract_trajectories --record-dir /path/to/train_data --env-name ice_slider --start-level 0 --number-levels 1000 --max-steps 20 --n-repeat 20 --random 1
python -m puzzlegen.extract_trajectories --record-dir /path/to/test_data --env-name ice_slider --start-level 1000 --number-levels 1000 --max-steps 20 --n-repeat 5 --random 1
  • Finally, we can add the paths to the extracted datasets in default_params.json as data_params.train_path and data_params.test_path. We should also set the name of the environment for validation in data_params.env_name ("ice_slider" for IceSlider or "digit_jump" for DigitJump).

  • Training and evaluation are performed sequentially by running

python main.py
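
The two dataset-extraction commands in the setup steps differ only in the output directory, the starting level, and the number of repeats. As a rough sketch, they could be assembled from Python as shown here. The helper name `extraction_command` is hypothetical and not part of puzzlegen; the flags are the ones used in the commands above, and the defaults simply mirror the IceSlider training example.

```python
import shlex
import sys


def extraction_command(record_dir, env_name="ice_slider", start_level=0,
                       number_levels=1000, max_steps=20, n_repeat=20, random=1):
    """Build the argument list for one `puzzlegen.extract_trajectories` call.

    Hypothetical helper: it only formats the flags shown in the setup steps.
    """
    return [
        sys.executable, "-m", "puzzlegen.extract_trajectories",
        "--record-dir", str(record_dir),
        "--env-name", env_name,
        "--start-level", str(start_level),
        "--number-levels", str(number_levels),
        "--max-steps", str(max_steps),
        "--n-repeat", str(n_repeat),
        "--random", str(random),
    ]


# Training levels 0..999 and validation levels 1000..1999, as in the commands above.
train_cmd = extraction_command("/path/to/train_data", start_level=0, n_repeat=20)
test_cmd = extraction_command("/path/to/test_data", start_level=1000, n_repeat=5)

# Print the commands for inspection rather than executing them.
for cmd in (train_cmd, test_cmd):
    print(shlex.join(cmd))
```

To actually run a command, the list can be passed to `subprocess.run(cmd, check=True)` from inside the activated environment. Using `env_name="digit_jump"` for DigitJump is an assumption based on the value given for data_params.env_name.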

Configuration

All settings can be handled by editing default_config.json.

| Param | Default | Info |
|---|---|---|
| optimizer_params.eps | 1e-05 | epsilon for Adam |
| train_params.seed | null | seed for training |
| train_params.epochs | 40 | # of training epochs |
| train_params.batch_size | 128 | batch size for training |
| train_params.save_every_n_epochs | 5 | how often to save models |
| train_params.val_every_n_epochs | 2 | how often to perform validation |
| train_params.lr_dict | - | dictionary of learning rates for each component |
| train_params.loss_weight_dict | - | dictionary of weights for the three loss functions |
| train_params.margin | 0.1 | latent margin epsilon |
| train_params.hinge_params | - | hyperparameters for margin loss |
| train_params.schedule | [] | learning rate schedule |
| model_params.name | 'ppgs' | name of the model to train in ['ppgs', 'latent'] |
| model_params.load_model | true | whether to load saved model if present |
| model_params.filters | [64, 128, 256, 512] | encoder filters |
| model_params.embedding_size | 16 | dimensionality of latent space |
| model_params.normalize | true | whether to normalize embeddings |
| model_params.forward_layers | 3 | layers in MLP forward model for 'latent' world model |
| model_params.forward_units | 256 | units in MLP forward model for 'latent' world model |
| model_params.forward_ln | true | layer normalization in MLP forward model for 'latent' world model |
| model_params.inverse_layers | 1 | layers in MLP inverse model |
| model_params.inverse_units | 32 | units in MLP inverse model |
| model_params.inverse_ln | true | layer normalization in MLP inverse model |
| data_params.train_path | '' | path to training dataset |
| data_params.test_path | '' | path to validation dataset |
| data_params.env_name | 'ice_slider' | name of environment ('ice_slider' for IceSlider, 'digit_jump' for DigitJump) |
| data_params.seq_len | 2 | number of steps for multi-step loss |
| data_params.shuffle | true | whether to shuffle datasets |
| data_params.normalize | true | whether to normalize observations |
| data_params.encode_position | false | enables positional encoding |
| data_params.env_params | {} | params to pass to environment |
| eval_params.evaluate_losses | true | whether to compute evaluation losses |
| eval_params.evaluate_rollouts | true | whether to compute solution rates |
| eval_params.eval_at | [1,3,4] | # of steps to evaluate at |
| eval_params.latent_eval_at | [1,5,10] | K for latent metrics |
| eval_params.seeds | [2000] | starting seed for evaluation levels |
| eval_params.num_levels | 100 | # evaluation levels |
| eval_params.batch_size | 128 | batch size for latent metrics evaluation |
| eval_params.planner_params.batch_size | 256 | cutoff for graph search |
| eval_params.planner_params.margin | 0.1 | latent margin for reidentification |
| eval_params.planner_params.early_stop | true | whether to stop when goal is found |
| eval_params.planner_params.backtrack | false | enables backtracking algorithm |
| eval_params.planner_params.penalize_visited | false | penalizes visited vertices in graph search |
| eval_params.planner_params.eps | 0 | enables epsilon greedy action selection |
| eval_params.planner_params.max_steps | 256 | maximal solution length |
| eval_params.planner_params.replan_horizon | 10 | T_max for full planner |
| eval_params.planner_params.snap | false | snaps new vertices to visited ones |
| working_dir | "results/ppgs" | directory for checkpoints and results |
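
The parameter names above are written with dots, which suggests a nested JSON layout (for example, a `data_params` object containing `train_path`). Assuming that layout, the following sketch shows one way to override individual settings by their dotted name. The helpers `set_param`, `get_param`, and `update_config_file` are hypothetical and not part of the PPGS codebase, and the path of the JSON file is left as an argument.

```python
import json
from pathlib import Path


def set_param(config, dotted_key, value):
    """Set a nested entry such as 'a.b.c', creating intermediate dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


def get_param(config, dotted_key):
    """Read a nested entry such as 'a.b.c'; raises KeyError if it is missing."""
    node = config
    for name in dotted_key.split("."):
        node = node[name]
    return node


def update_config_file(path, overrides):
    """Apply {dotted_key: value} overrides to a JSON config file in place."""
    path = Path(path)
    config = json.loads(path.read_text())
    for key, value in overrides.items():
        set_param(config, key, value)
    path.write_text(json.dumps(config, indent=4))
    return config


# In-memory example using a small fragment of the settings listed above.
config = {
    "data_params": {"env_name": "ice_slider", "train_path": "", "test_path": ""},
    "eval_params": {"planner_params": {"max_steps": 256}},
}
set_param(config, "data_params.env_name", "digit_jump")
set_param(config, "data_params.train_path", "/path/to/train_data")
set_param(config, "data_params.test_path", "/path/to/test_data")
set_param(config, "eval_params.planner_params.max_steps", 128)
```

Calling `update_config_file` with the path of the JSON settings file and a dictionary of overrides would write the same changes to disk. Keeping a copy of the original file is advisable, since the helper rewrites it.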
Owner
Autonomous Learning Group