Experiments with differentiable stacks and queues in PyTorch

Related tags

Deep LearningStackNN
Overview

Please use stacknn-core instead!


StackNN

This project implements differentiable stacks and queues in PyTorch. The data structures are implemented in such a way that it should be easy to integrate them into your own models. For example, to construct a differentiable stack and perform a push:

from StackNN.structs import Stack
stack = Stack(BATCH_SIZE, STACK_VECTOR_SIZE)
read_vectors = stack(value_vectors, pop_strengths, push_strengths)

For examples of more complex use cases of this library, refer to the industrial-stacknns repository.

All the code in this repository is associated with the paper Context-Free Transductions with Neural Stacks, which appeared at the Analyzing and Interpreting Neural Networks for NLP workshop at EMNLP 2018. Refer to our paper for more theoretical background on differentiable data structures.

Running a demo

Check example.ipynb for the most up-to-date demo code.

There are several experiment configurations pre-defined in configs.py. To train a model on one of these configs, do:

python run.py CONFIG_NAME

For example, to train a model on the string reversal task:

python run.py final_reverse_config

In addition to the experiment configuration argument, run.py takes several flags:

  • --model: Model type (BufferedModel or VanillaModel)
  • --controller: Controller type (LinearSimpleStructController, LSTMSimpleStructController, etc.)
  • --struct: Struct type (Stack, NullStruct, etc.)
  • --savepath: Path for saving a trained model
  • --loadpath: Path for loading a model

Documentation

You can find auto-generated documentation here.

Contributing

This project is managed by Computational Linguistics at Yale. We welcome contributions from outside in the form of pull requests. Please report any bugs in the GitHub issues tracker. If you are a Yale student interested in joining our lab, please contact Bob Frank.

Citations

If you use this codebase in your research, please cite the associated paper:

@inproceedings{hao-etal-2018-context,
    title = "Context-Free Transductions with Neural Stacks",
    author = "Hao, Yiding  and
      Merrill, William  and
      Angluin, Dana  and
      Frank, Robert  and
      Amsel, Noah  and
      Benz, Andrew  and
      Mendelsohn, Simon",
    booktitle = "Proceedings of the 2018 {EMNLP} Workshop {B}lackbox{NLP}: Analyzing and Interpreting Neural Networks for {NLP}",
    month = nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/W18-5433",
    pages = "306--315",
    abstract = "This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modelling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex stack-augmented networks often find approximate solutions by using the stack as unstructured memory.",
}

Dependencies

The core implementation of the data structures is stable in Python 2 and 3. The specific tasks that we have implemented require Python 2.7. We use PyTorch version 0.4.1, with the following additional dependencies:

  • numpy
  • scipy (for data processing)
  • matplotlib (for visualization)
  • nltk

Using pip or conda should suffice for installing most of these dependencies. To get the right command for installing PyTorch, refer to the installation widget on the PyTorch website.

Models

A model is a pairing of a controller network with a neural data structure. There are two kinds of models:

  • models.VanillaModel is a simple controller-data structure network. This means there will be one step of computation per input.
  • models.BufferedModel adds input and output buffers to the vanilla model. This allows the network to run for extra computation steps.

To use a model, call model.forward() on every input and model.init_controller() whenever you want to reset the stack between inputs. You can find example training logic in the tasks package.

Data structures

  • structs.Stack implements the differentiable stack data structure.
  • structs.Queue implements the differentiable queue data structure.

The buffered models use read-only and write-only versions of the differentiable queue for their input and output buffers.

Tasks

The Task class defines specific tasks that models can be trained on. Below are some formal language tasks that we have explored using stack models.

String reversal

The ReverseTask trains a feed-forward controller network to do string reversal. The code generates 800 random binary strings which the network must reverse in a sequence-to-sequence fashion:

Input:   1 1 0 1 # # # #
Label:   # # # # 1 0 1 1

By 10 epochs, the model tends to achieve 100% accuracy. The config for this task is called final_reverse_config.

Context-free language modelling

CFGTask can be used to train a context-free language model. Many interesting questions probing linguistic structure can be reduced to special cases of this general task. For example, the task can be used to model a language of balanced parentheses. The configuration for the parentheses task is final_dyck_config.

Evaluation tasks

We also have a class for evaluation tasks. These are tasks where output i can be succintly expressed as some function of inputs 0, .., i. Some applications of this are evaluation of parity and reverse polish boolean formulae.

Real datasets

The data folder contains several real datasets that the stack can be trained on. We should implement a task for reading in these datasets.

Owner
Will Merrill
NLP x linguistics x theory w/ AllenNLP.
Will Merrill
A synthetic texture-invariant dataset for object detection of UAVs

A synthetic dataset for object detection of UAVs This repository contains a synthetic datasets accompanying the paper Sim2Air - Synthetic aerial datas

LARICS Lab 10 Aug 13, 2022
A Runtime method overload decorator which should behave like a compiled language

strongtyping-pyoverload A Runtime method overload decorator which should behave like a compiled language there is a override decorator from typing whi

20 Oct 31, 2022
An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym

gym-idsgame An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym gym-idsgame is a reinforcement learning environment for simulating at

Kim Hammar 29 Dec 03, 2022
RL and distillation in CARLA using a factorized world model

World on Rails Learning to drive from a world on rails Dian Chen, Vladlen Koltun, Philipp Krähenbühl, arXiv techical report (arXiv 2105.00636) This re

Dian Chen 131 Dec 16, 2022
The implementation of "Bootstrapping Semantic Segmentation with Regional Contrast".

ReCo - Regional Contrast This repository contains the source code of ReCo and baselines from the paper, Bootstrapping Semantic Segmentation with Regio

Shikun Liu 128 Dec 30, 2022
This is an early in-development version of training CLIP models with hivemind.

A transformer that does not hog your GPU memory This is an early in-development codebase: if you want a stable and documented hivemind codebase, look

<a href=[email protected]"> 4 Nov 06, 2022
Tgbox-bench - Simple TGBOX upload speed benchmark

TGBOX Benchmark This script will benchmark upload speed to TGBOX storage. Build

Non 1 Jan 09, 2022
Video-face-extractor - Video face extractor with Python

Python face extractor Setup Create the srcvideos and faces directories Put your

2 Feb 03, 2022
Code for the paper: Hierarchical Reinforcement Learning With Timed Subgoals, published at NeurIPS 2021

Hierarchical reinforcement learning with Timed Subgoals (HiTS) This repository contains code for reproducing experiments from our paper "Hierarchical

Autonomous Learning Group 21 Dec 03, 2022
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection

Deep learning for time series forecasting Flow forecast is an open-source deep learning for time series forecasting framework. It provides all the lat

AIStream 1.2k Jan 04, 2023
Source code of article "Towards Toxic and Narcotic Medication Detection with Rotated Object Detector"

Towards Toxic and Narcotic Medication Detection with Rotated Object Detector Introduction This is the source code of article: Towards Toxic and Narcot

Woody. Wang 3 Oct 29, 2022
Implementation of FitVid video prediction model in JAX/Flax.

FitVid Video Prediction Model Implementation of FitVid video prediction model in JAX/Flax. If you find this code useful, please cite it in your paper:

Google Research 62 Nov 25, 2022
This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code for training a DPR model then continuing training with RAG.

KGI (Knowledge Graph Induction) for slot filling This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code fo

International Business Machines 72 Jan 06, 2023
A scikit-learn compatible neural network library that wraps PyTorch

A scikit-learn compatible neural network library that wraps PyTorch. Resources Documentation Source Code Examples To see more elaborate examples, look

4.9k Jan 03, 2023
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-quality pre-trained models from torchvision, MMLabs, and soon Pytorch Image Models. It or

airctic 789 Dec 29, 2022
Namish Khanna 40 Oct 11, 2022
PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

Yulun Zhang 1.2k Dec 26, 2022
A collection of pre-trained StyleGAN2 models trained on different datasets at different resolution.

Awesome Pretrained StyleGAN2 A collection of pre-trained StyleGAN2 models trained on different datasets at different resolution. Note the readme is a

Justin 1.1k Dec 24, 2022
Autonomous Robots Kalman Filters

Autonomous Robots Kalman Filters The Kalman Filter is an easy topic. However, ma

20 Jul 18, 2022
An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Neural Attention Distillation This is an implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep

Yige-Li 84 Jan 04, 2023