Differentiable Neural Computers, Sparse Access Memory and Sparse Differentiable Neural Computers, for Pytorch

Overview

Differentiable Neural Computers and family, for Pytorch

Includes:

  1. Differentiable Neural Computers (DNC)
  2. Sparse Access Memory (SAM)
  3. Sparse Differentiable Neural Computers (SDNC)


This is an implementation of Differentiable Neural Computers (DNC), described in the paper Hybrid computing using a neural network with dynamic external memory (Graves et al.), as well as Sparse DNCs (SDNCs) and Sparse Access Memory (SAM), described in Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes.

Install

pip install dnc

From source

git clone https://github.com/ixaxaar/pytorch-dnc
cd pytorch-dnc
pip install -r ./requirements.txt
pip install -e .

To use fully GPU-based SDNCs or SAMs, install FAISS:

conda install faiss-gpu -c pytorch

pytest is required to run the tests.
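For example, from the repository root (assuming pytest's default test discovery picks up the bundled test files):

pip install pytest
pytest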

Architecture

Usage

DNC

Constructor Parameters:

Following are the constructor parameters:

| Argument | Default | Description |
| --- | --- | --- |
| input_size | None | Size of the input vectors |
| hidden_size | None | Size of hidden units |
| rnn_type | 'lstm' | Type of recurrent cells used in the controller |
| num_layers | 1 | Number of layers of recurrent units in the controller |
| num_hidden_layers | 2 | Number of hidden layers per layer of the controller |
| bias | True | Whether the controller layers use bias |
| batch_first | True | Whether data is fed batch first |
| dropout | 0 | Dropout between layers in the controller |
| bidirectional | False | Whether the controller is bidirectional (not yet implemented) |
| nr_cells | 5 | Number of memory cells |
| read_heads | 2 | Number of read heads |
| cell_size | 10 | Size of each memory cell |
| nonlinearity | 'tanh' | If using 'rnn' as rnn_type, the non-linearity of the RNNs |
| gpu_id | -1 | ID of the GPU, -1 for CPU |
| independent_linears | False | Whether to use independent linear units to derive the interface vector |
| share_memory | True | Whether to share memory between controller layers |

Following are the forward pass parameters:

| Argument | Default | Description |
| --- | --- | --- |
| input | - | The input vector (B*T*X) or (T*B*X) |
| hidden | (None, None, None) | Hidden states (controller hidden, memory hidden, read vectors) |
| reset_experience | False | Whether to reset memory |
| pass_through_memory | True | Whether to pass through memory |

Example usage

import torch
from dnc import DNC

rnn = DNC(
  input_size=64,
  hidden_size=128,
  rnn_type='lstm',
  num_layers=4,
  nr_cells=100,
  cell_size=32,
  read_heads=4,
  batch_first=True,
  gpu_id=0
)

(controller_hidden, memory, read_vectors) = (None, None, None)

output, (controller_hidden, memory, read_vectors) = \
  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)
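
On subsequent calls, the returned state can be passed back in so that the controller and memory persist across sequences; a minimal continuation of the example above (reset_experience=False keeps the memory contents):

output, (controller_hidden, memory, read_vectors) = \
  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=False)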

Debugging

When the debug option is enabled, the network additionally returns its memory hidden vectors (as numpy ndarrays) for the first batch of each forward step. These vectors can be analyzed or visualized, for example with visdom.

import torch
from dnc import DNC

rnn = DNC(
  input_size=64,
  hidden_size=128,
  rnn_type='lstm',
  num_layers=4,
  nr_cells=100,
  cell_size=32,
  read_heads=4,
  batch_first=True,
  gpu_id=0,
  debug=True
)

(controller_hidden, memory, read_vectors) = (None, None, None)

output, (controller_hidden, memory, read_vectors), debug_memory = \
  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)

Memory vectors returned by forward pass (np.ndarray):

| Key | Y axis (dimensions) | X axis (dimensions) |
| --- | --- | --- |
| debug_memory['memory'] | layer * time | nr_cells * cell_size |
| debug_memory['link_matrix'] | layer * time | nr_cells * nr_cells |
| debug_memory['precedence'] | layer * time | nr_cells |
| debug_memory['read_weights'] | layer * time | read_heads * nr_cells |
| debug_memory['write_weights'] | layer * time | nr_cells |
| debug_memory['usage_vector'] | layer * time | nr_cells |
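
As a minimal sketch of such a visualization (assuming a running visdom server and the debug example above; the shapes are as listed in the table), the memory matrix can be drawn as a heatmap:

import visdom

viz = visdom.Visdom()  # assumes `python -m visdom.server` is already running

# debug_memory['memory'] is an np.ndarray of shape (layer * time, nr_cells * cell_size)
viz.heatmap(
  debug_memory['memory'],
  opts=dict(title='DNC memory', xlabel='nr_cells * cell_size', ylabel='layer * time')
)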

SDNC

Constructor Parameters:

Following are the constructor parameters:

| Argument | Default | Description |
| --- | --- | --- |
| input_size | None | Size of the input vectors |
| hidden_size | None | Size of hidden units |
| rnn_type | 'lstm' | Type of recurrent cells used in the controller |
| num_layers | 1 | Number of layers of recurrent units in the controller |
| num_hidden_layers | 2 | Number of hidden layers per layer of the controller |
| bias | True | Whether the controller layers use bias |
| batch_first | True | Whether data is fed batch first |
| dropout | 0 | Dropout between layers in the controller |
| bidirectional | False | Whether the controller is bidirectional (not yet implemented) |
| nr_cells | 5000 | Number of memory cells |
| read_heads | 4 | Number of read heads |
| sparse_reads | 4 | Number of sparse memory reads per read head |
| temporal_reads | 4 | Number of temporal reads |
| cell_size | 10 | Size of each memory cell |
| nonlinearity | 'tanh' | If using 'rnn' as rnn_type, the non-linearity of the RNNs |
| gpu_id | -1 | ID of the GPU, -1 for CPU |
| independent_linears | False | Whether to use independent linear units to derive the interface vector |
| share_memory | True | Whether to share memory between controller layers |

Following are the forward pass parameters:

| Argument | Default | Description |
| --- | --- | --- |
| input | - | The input vector (B*T*X) or (T*B*X) |
| hidden | (None, None, None) | Hidden states (controller hidden, memory hidden, read vectors) |
| reset_experience | False | Whether to reset memory |
| pass_through_memory | True | Whether to pass through memory |

Example usage

import torch
from dnc import SDNC

rnn = SDNC(
  input_size=64,
  hidden_size=128,
  rnn_type='lstm',
  num_layers=4,
  nr_cells=100,
  cell_size=32,
  read_heads=4,
  sparse_reads=4,
  batch_first=True,
  gpu_id=0
)

(controller_hidden, memory, read_vectors) = (None, None, None)

output, (controller_hidden, memory, read_vectors) = \
  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)

Debugging

When the debug option is enabled, the network additionally returns its memory hidden vectors (as numpy ndarrays) for the first batch of each forward step. These vectors can be analyzed or visualized, for example with visdom.

import torch
from dnc import SDNC

rnn = SDNC(
  input_size=64,
  hidden_size=128,
  rnn_type='lstm',
  num_layers=4,
  nr_cells=100,
  cell_size=32,
  read_heads=4,
  batch_first=True,
  sparse_reads=4,
  temporal_reads=4,
  gpu_id=0,
  debug=True
)

(controller_hidden, memory, read_vectors) = (None, None, None)

output, (controller_hidden, memory, read_vectors), debug_memory = \
  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)

Memory vectors returned by forward pass (np.ndarray):

| Key | Y axis (dimensions) | X axis (dimensions) |
| --- | --- | --- |
| debug_memory['memory'] | layer * time | nr_cells * cell_size |
| debug_memory['visible_memory'] | layer * time | sparse_reads+2*temporal_reads+1 * nr_cells |
| debug_memory['read_positions'] | layer * time | sparse_reads+2*temporal_reads+1 |
| debug_memory['link_matrix'] | layer * time | sparse_reads+2*temporal_reads+1 * sparse_reads+2*temporal_reads+1 |
| debug_memory['rev_link_matrix'] | layer * time | sparse_reads+2*temporal_reads+1 * sparse_reads+2*temporal_reads+1 |
| debug_memory['precedence'] | layer * time | nr_cells |
| debug_memory['read_weights'] | layer * time | read_heads * nr_cells |
| debug_memory['write_weights'] | layer * time | nr_cells |
| debug_memory['usage'] | layer * time | nr_cells |

SAM

Constructor Parameters:

Following are the constructor parameters:

| Argument | Default | Description |
| --- | --- | --- |
| input_size | None | Size of the input vectors |
| hidden_size | None | Size of hidden units |
| rnn_type | 'lstm' | Type of recurrent cells used in the controller |
| num_layers | 1 | Number of layers of recurrent units in the controller |
| num_hidden_layers | 2 | Number of hidden layers per layer of the controller |
| bias | True | Whether the controller layers use bias |
| batch_first | True | Whether data is fed batch first |
| dropout | 0 | Dropout between layers in the controller |
| bidirectional | False | Whether the controller is bidirectional (not yet implemented) |
| nr_cells | 5000 | Number of memory cells |
| read_heads | 4 | Number of read heads |
| sparse_reads | 4 | Number of sparse memory reads per read head |
| cell_size | 10 | Size of each memory cell |
| nonlinearity | 'tanh' | If using 'rnn' as rnn_type, the non-linearity of the RNNs |
| gpu_id | -1 | ID of the GPU, -1 for CPU |
| independent_linears | False | Whether to use independent linear units to derive the interface vector |
| share_memory | True | Whether to share memory between controller layers |

Following are the forward pass parameters:

| Argument | Default | Description |
| --- | --- | --- |
| input | - | The input vector (B*T*X) or (T*B*X) |
| hidden | (None, None, None) | Hidden states (controller hidden, memory hidden, read vectors) |
| reset_experience | False | Whether to reset memory |
| pass_through_memory | True | Whether to pass through memory |

Example usage

import torch
from dnc import SAM

rnn = SAM(
  input_size=64,
  hidden_size=128,
  rnn_type='lstm',
  num_layers=4,
  nr_cells=100,
  cell_size=32,
  read_heads=4,
  sparse_reads=4,
  batch_first=True,
  gpu_id=0
)

(controller_hidden, memory, read_vectors) = (None, None, None)

output, (controller_hidden, memory, read_vectors) = \
  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)

Debugging

When the debug option is enabled, the network additionally returns its memory hidden vectors (as numpy ndarrays) for the first batch of each forward step. These vectors can be analyzed or visualized, for example with visdom.

import torch
from dnc import SAM

rnn = SAM(
  input_size=64,
  hidden_size=128,
  rnn_type='lstm',
  num_layers=4,
  nr_cells=100,
  cell_size=32,
  read_heads=4,
  batch_first=True,
  sparse_reads=4,
  gpu_id=0,
  debug=True
)

(controller_hidden, memory, read_vectors) = (None, None, None)

output, (controller_hidden, memory, read_vectors), debug_memory = \
  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)

Memory vectors returned by forward pass (np.ndarray):

| Key | Y axis (dimensions) | X axis (dimensions) |
| --- | --- | --- |
| debug_memory['memory'] | layer * time | nr_cells * cell_size |
| debug_memory['visible_memory'] | layer * time | sparse_reads+2*temporal_reads+1 * nr_cells |
| debug_memory['read_positions'] | layer * time | sparse_reads+2*temporal_reads+1 |
| debug_memory['read_weights'] | layer * time | read_heads * nr_cells |
| debug_memory['write_weights'] | layer * time | nr_cells |
| debug_memory['usage'] | layer * time | nr_cells |

Tasks

Copy task (with curriculum and generalization)

The copy task, as described in the original paper, is included in the repo.

From the project root:

python ./tasks/copy_task.py -cuda 0 -optim rmsprop -batch_size 32 -mem_slot 64 # (like original implementation)

python ./tasks/copy_task.py -cuda 0 -lr 0.001 -rnn_type lstm -nlayer 1 -nhlayer 2 -dropout 0 -mem_slot 32 -batch_size 1000 -optim adam -sequence_max_length 8 # (faster convergence)

For SDNCs:
python ./tasks/copy_task.py -cuda 0 -lr 0.001 -rnn_type lstm -memory_type sdnc -nlayer 1 -nhlayer 2 -dropout 0 -mem_slot 100 -mem_size 10  -read_heads 1 -sparse_reads 10 -batch_size 20 -optim adam -sequence_max_length 10

and for curriculum learning for SDNCs:
python ./tasks/copy_task.py -cuda 0 -lr 0.001 -rnn_type lstm -memory_type sdnc -nlayer 1 -nhlayer 2 -dropout 0 -mem_slot 100 -mem_size 10  -read_heads 1 -sparse_reads 4 -temporal_reads 4 -batch_size 20 -optim adam -sequence_max_length 4 -curriculum_increment 2 -curriculum_freq 10000

For the full set of options, see:

python ./tasks/copy_task.py --help

The copy task can be used to debug memory using Visdom.

Additional steps required:

pip install visdom
python -m visdom.server

Open http://localhost:8097/ on your browser, and execute the copy task:

python ./tasks/copy_task.py -cuda 0

The visdom dashboard shows memory as a heatmap for batch 0 every -summarize_freq iterations:

Visdom dashboard

Generalizing Addition task

The adding task is as described in this github pull request. This task

  • creates one-hot vectors of size input_size, each representing a number
  • feeds a sequence of them to the network
  • sums the decoded network outputs to obtain the final result

The task first trains the network on sequences of length ~100 and then tests whether the network generalizes to lengths of ~1000.

python ./tasks/adding_task.py -cuda 0 -lr 0.0001 -rnn_type lstm -memory_type sam -nlayer 1 -nhlayer 1 -nhid 100 -dropout 0 -mem_slot 1000 -mem_size 32 -read_heads 1 -sparse_reads 4 -batch_size 20 -optim rmsprop -input_size 3 -sequence_max_length 100
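
For illustration only, here is a hypothetical sketch of the kind of data this task consumes (the actual generator lives in tasks/adding_task.py and may differ in its details):

import torch

def make_adding_batch(seq_len, input_size, batch_size=1):
  # each time step is a one-hot vector of size input_size encoding a random number
  numbers = torch.randint(0, input_size, (batch_size, seq_len))
  inputs = torch.nn.functional.one_hot(numbers, num_classes=input_size).float()
  targets = numbers.sum(dim=1).float()  # expected sum of the encoded numbers
  return inputs, targets

x, y = make_adding_batch(seq_len=100, input_size=3)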

Generalizing Argmax task

The second adding task is similar to the first one, except that the network's output at the last time step is expected to be the argmax of the input.

python ./tasks/argmax_task.py -cuda 0 -lr 0.0001 -rnn_type lstm -memory_type dnc -nlayer 1 -nhlayer 1 -nhid 100 -dropout 0 -mem_slot 100 -mem_size 10 -read_heads 2 -batch_size 1 -optim rmsprop -sequence_max_length 15 -input_size 10 -iterations 10000

Code Structure

  1. DNCs:
  2. SDNCs:
  3. SAMs:
  4. Tests:

General noteworthy stuff

  1. SDNCs use the FLANN approximate nearest neighbour library, with its python binding pyflann3, and FAISS.

FLANN can be installed either from pip (automatically as a dependency), or from source (e.g. for multithreading via OpenMP):

# install openmp first: e.g. `sudo pacman -S openmp` for Arch.
git clone git://github.com/mariusmuja/flann.git
cd flann
mkdir build
cd build
cmake ..
make -j 4
sudo make install

FAISS can be installed using:

conda install faiss-gpu -c pytorch

FAISS is much faster, has a GPU implementation, and is interoperable with PyTorch tensors. We try to use FAISS by default, and fall back to FLANN if it is not available.
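
A minimal sketch of that kind of fallback (illustrative only; the package's own selection logic may differ):

try:
  import faiss  # GPU-capable approximate nearest neighbour search
  HAS_FAISS = True
except ImportError:
  from pyflann import FLANN  # CPU-only fallback
  HAS_FAISS = False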

  2. NaNs in the gradients are common; try different batch sizes.

Repos referred to for creation of this repo:

Comments
  • copy_task.py sample fails.

    testing with command line:

    python copy_task.py -cuda 0 -lr 0.001 -rnn_type lstm -nlayer 1 -nhlayer 2 -dropout 0 -mem_slot 32 -batch_size 1000 -optim adam -sequence_max_length 8 -iterations 100

    I get multiple errors when it finishes, first on the generate_data call which has undefined parameters:

    input_data, target_output, loss_weights = generate_data(random_length, input_size)
    

    NameError: name 'input_size' is not defined

    And then after fixing that I get: output = output[:, -1, :].sum().data.cpu().numpy()[0] IndexError: too many indices for array

    Looks like that bit of code hasn't been used. I have tried to fix it but I'm unclear of the solution for the second issue as I'm new to pytorch, thanks in advance for any fixes.

    ChrisP.

    opened by chrispugmire 3
  • Problem of the Softmax on Read Mode

    https://github.com/ixaxaar/pytorch-dnc/blob/1db78511fe5622ade1c554d265a5d9d729c8801d/dnc/memory.py#L235

    Should the softmax be applied on the last dimension? (i.e. the dimension of the read mode)

    Currently, each read mode would always return 1 if the model has only one read head.

    opened by yat011 3
  • Question about the running speed of Pyflann and Faiss  for the SAM model

    I can't install the Faiss environment because of circumstances beyond my control, so I wonder how using Faiss-gpu or Pyflann will influence the actual training speed of the SAM model. For example, in the copy task, what are the rough epoch times when using these two methods? Can you give me a rough reference?

    opened by zoharli 3
  • PySide dependency error

    I followed your instructions to run and visualize copy_task.py in visdom, but am encountering some dependency errors. I am using Python 3.6.

    First error when running python ./tasks/copy_task.py -cuda 0:

    File "C:\Users\alexander.d.payne\AppData\Local\Programs\Python\Python36\lib\site-packages\pyflann\bindings\flann_ctypes.py", line 171, in <module> raise ImportError('Cannot load dynamic library. Did you compile FLANN?') ImportError: Cannot load dynamic library. Did you compile FLANN?

    pip installed pyflann, and ran again:

    File "C:\Users\alexander.d.payne\AppData\Local\Programs\Python\Python36\lib\site-packages\pyflann\__init__.py", line 27, in <module> from index import * ModuleNotFoundError: No module named 'index'

    pip installed index, and ran again:

    C:\Users\alexander.d.payne\Documents\pytorch-dnc-master>pip install index Collecting index Downloading ...files.pythonhosted.org/packages/7f/59/65da893e04f3eb49f73e6770e0999c57230669a484b14ca574154e9b75d3/index-0.2.tar.gz Collecting PySide (from index) Downloading ...files.pythonhosted.org/packages/36/ac/ca31db6f2225844d37a41b10615c3d371587677efd074db29855e7035de6/PySide-1.2.4.tar.gz (9.3MB) 100% |████████████████████████████████| 9.3MB 3.2MB/s Complete output from command python setup.py egg_info: only these python versions are supported: [(2, 6), (2, 7), (3, 2), (3, 3), (3, 4)] Command "python setup.py egg_info" failed with error code 1 in C:\Users\ALEXAN~1.PAY\AppData\Local\Temp\pip-install-aph21f59\PySide\

    Even if I switched to Python 3.4 I don't see a torch option for that version at their website https://pytorch.org/, is there anyway around this? Thank you.

    opened by apayne19 3
  • Issues when using pytorch 0.4

    I get errors when trying to run both DNC and SDNC examples with pytorch 0.4.0. For DNC:

    (py36) [[email protected] test]$ python test_dnc.py 
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
      from ._conv import register_converters as _register_converters
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py:118: UserWarning: nn.init.orthogonal is now deprecated in favor of nn.init.orthogonal_.
      orthogonal(self.output.weight)
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py:133: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
      xavier_uniform(h)
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/util.py:95: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      soft_max_2d = F.softmax(input_2d)
    Traceback (most recent call last):
      File "test_dnc.py", line 19, in <module>
        rnn(torch.randn(10, 4, 64).cuda(), (controller_hidden, memory, read_vectors), True)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py", line 265, in forward
        inputs = [self.output(i) for i in inputs]
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py", line 265, in <listcomp>
        inputs = [self.output(i) for i in inputs]
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 55, in forward
        return F.linear(input, self.weight, self.bias)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/functional.py", line 992, in linear
        return torch.addmm(bias, input, weight.t())
    RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'mat1'
    

    For SDNC:

    (py36) [[email protected] test]$ python test_dnc.py 
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
      from ._conv import register_converters as _register_converters
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py:118: UserWarning: nn.init.orthogonal is now deprecated in favor of nn.init.orthogonal_.
      orthogonal(self.output.weight)
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/sparse_temporal_memory.py:65: UserWarning: nn.init.orthogonal is now deprecated in favor of nn.init.orthogonal_.
      T.nn.init.orthogonal(self.interface_weights.weight)
    /amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py:133: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
      xavier_uniform(h)
    Traceback (most recent call last):
      File "test_dnc.py", line 20, in <module>
        rnn(torch.randn(10, 4, 64).cuda(), (controller_hidden, memory, read_vectors), True)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py", line 219, in forward
        controller_hidden, mem_hidden, last_read = self._init_hidden(hx, batch_size, reset_experience)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/dnc.py", line 144, in _init_hidden
        mhx = self.memories[0].reset(batch_size, erase=reset_experience)
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/sparse_temporal_memory.py", line 126, in reset
        'read_positions': cuda(T.arange(0, c).expand(b, c), gpu_id=self.gpu_id).long()
      File "/amd/home/mammadli/tools/anaconda2/envs/py36/lib/python3.6/site-packages/dnc/util.py", line 30, in cuda
        return var(x.pin_memory(), requires_grad=grad).cuda(gpu_id, async=True)
    RuntimeError: invalid argument 3: Source tensor must be contiguous at /opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/generic/THCTensorCopy.c:114
    
    opened by Rahim16 3
  • TypeError: cat received an invalid combination of arguments - got (list, int), but expected one of:

    I'm trying to run your example for SAM, but I'm running into the following error:

    Traceback (most recent call last):
      File "dnc_test.py", line 20, in <module>
        rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)
      File "/home/xxx/miniconda3/envs/my_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/xxx/git_repos/pytorch-dnc/dnc/dnc.py", line 222, in forward
        inputs = [T.cat([input[:, x, :], last_read], 1) for x in range(max_length)]
      File "/home/xxx/git_repos/pytorch-dnc/dnc/dnc.py", line 222, in <listcomp>
        inputs = [T.cat([input[:, x, :], last_read], 1) for x in range(max_length)]
    TypeError: cat received an invalid combination of arguments - got (list, int), but expected one of:
     * (sequence[torch.cuda.FloatTensor] seq)
     * (sequence[torch.cuda.FloatTensor] seq, int dim)
          didn't match because some of the arguments have invalid types: (list, int)
    

    I tried loading the input tensor onto the gpu with .cuda() and transforming last_read to a Tensor with .data, but that led to other issues.

    There's also a typo in your documentation:

    output, (controller_hidden, memory, read_vectors) = \
      rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors, reset_experience=True))
    

    should be

    output, (controller_hidden, memory, read_vectors) = \
      rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)
    
    opened by kierkegaard13 3
  • bug in cosine distance?

    I believe there's a bug in the function from dnc.util for computing cosine distance. First, I think you are trying to compute cosine similarity, not distance (sim = 1 - dist). Second, I think the current function implements neither cosine similarity nor distance. Here's a modified variant that returns the correct output for cosine similarity.

    def bcos(a, b, normBy=2):
        """Batchwise cosine similarity
        
        Arguments:
            a: 3D tensor of shape [b,m,w]
            b: 3D tensor of shape [b,r,w]
        Returns:
            cos: batchwise cosine similarity of shape [b,r,m]
        """
        dot = torch.bmm(a, b.transpose(1,2)) # [b,m,w] @ [b,w,r] -> [b,m,r]
        a_norm = torch.norm(a, normBy, dim=2).unsqueeze(2) # [b,m,1]
        b_norm = torch.norm(b, normBy, dim=2).unsqueeze(1) # [b,1,r]
        cos = dot / (a_norm * b_norm) # [b,m,r]
    
        return cos.transpose(1,2)  # [b,r,m]
    
    opened by rfeinman 2
  • reset_experience meaning

    Looking at the code it isn't clear to me what reset_experience does. Since the memory is an argument to the forward, is setting reset_experience equivalent to calling dnc(controller_state, None, read_vectors)?

    If this is the case, then when using dnc with inputs of shape [batch, time, feature] , reset_experience will clear the memory between batches. If we continuously call dnc with inputs of shape [batch, 1, feature], then we do not want to reset. Is this correct?

    opened by smorad 1
  • fix bug in function \theta for batchwise cosine similarity

    I wasn't able to keep the options "dimA" and "dimB" for non-default similarity dimensions, but I don't see those used anywhere in the repo.

    opened by rfeinman 1
  • Error when running copy_task.py

    I executed the copy task following the cmd line in the README: python copy_task.py -cuda 0 -optim rmsprop -batch_size 32 -mem_slot 64. I get a NameError at line 366: name 'input_size' is not defined. I can change input_size to args.input_size, but I think there are additional problems beyond that. The function generate_data requires 3 arguments, but only 2 are given. Is generate_data in line 366 a different function from the one defined in line 78?

    https://github.com/ixaxaar/pytorch-dnc/blob/016b541223bf801f3f3a617fa3942cc12ef71be9/tasks/copy_task.py#L78 https://github.com/ixaxaar/pytorch-dnc/blob/016b541223bf801f3f3a617fa3942cc12ef71be9/tasks/copy_task.py#L366

    Thank you,

    opened by jin8 1
  • When running adding task -- ModuleNotFoundError: No module named 'index'

      File "tasks/adding_task.py", line 25, in <module>
        from dnc.dnc import DNC
      File "/home/vanangamudi/projects/cloned/pytorch-dnc/dnc/__init__.py", line 5, in <module>
        from .sdnc import SDNC
      File "/home/vanangamudi/projects/cloned/pytorch-dnc/dnc/sdnc.py", line 15, in <module>
        from .sparse_temporal_memory import SparseTemporalMemory
      File "/home/vanangamudi/projects/cloned/pytorch-dnc/dnc/sparse_temporal_memory.py", line 11, in <module>
        from .flann_index import FLANNIndex
      File "/home/vanangamudi/projects/cloned/pytorch-dnc/dnc/flann_index.py", line 9, in <module>
        from pyflann import *
      File "/home/vanangamudi/env/torch/lib/python3.6/site-packages/pyflann/__init__.py", line 27, in <module>
        from index import *
    ModuleNotFoundError: No module named 'index'
    
    opened by vanangamudi 1
  • pytorch LTS support (1.8.2) or stable (1.11.1)

    Hello!

    I was wondering if someone can confirm that this package still runs under PyTorch LTS or current stable (1.11.1)?

    I'm getting a curious error. Note this is for CPU training. Maybe someone can confirm this is only broken under cpu training.

    Thank you!

    `03:44 $ python ./tasks/adding_task.py -lr 0.0001 -rnn_type lstm -memory_type sam -nlayer 1 -nhlayer 1 -nhid 100 -dropout 0 -mem_slot 1000 -mem_size 32 -read_heads 1 -sparse_reads 4 -batch_size 20 -optim rmsprop -input_size 3 -sequence_max_length 100 Namespace(batch_size=20, check_freq=100, clip=50, cuda=-1, dropout=0.0, input_size=3, iterations=2000, lr=0.0001, mem_size=32, mem_slot=1000, memory_type='sam', nhid=100, nhlayer=1, nlayer=1, optim='rmsprop', read_heads=1, rnn_type='lstm', sequence_max_length=100, sparse_reads=4, summarize_freq=100, temporal_reads=2, visdom=False) Using CPU.


    SAM(3, 100, num_hidden_layers=1, nr_cells=1000, read_heads=1, cell_size=32) SAM( (lstm_layer_0): LSTM(35, 100, batch_first=True) (rnn_layer_memory_shared): SparseMemory( (interface_weights): Linear(in_features=100, out_features=70, bias=True) ) (output): Linear(in_features=132, out_features=3, bias=True) )

    Iteration 0/2000 Falling back to FLANN (CPU). For using faster, GPU based indexes, install FAISS: "conda install faiss-gpu -c pytorch" Traceback (most recent call last): File "./tasks/adding_task.py", line 222, in loss.backward() File "/home/eziegenbalg/.conda/envs/default/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs) File "/home/eziegenbalg/.conda/envs/default/lib/python3.8/site-packages/torch/autograd/init.py", line 145, in backward Variable._execution_engine.run_backward( RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1, 1000]], which is output 0 of AsStridedBackward, is at version 70; expected version 69 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

    ^C (default) ✘-INT ~/pytorch-dnc [master|✚ 2] 03:45 $ `

    opened by ziegenbalg 5
  • Question about allocation weighting

    The paper describes the allocation weighting vector as:

    $a_t[\phi_t[j]] = (1 - u_t[\phi_t[j]]) \prod_{i=1}^{j-1} u_t[\phi_t[i]]$

    In your part of the code where you calculate the right-part product you do this:

    v = var(sorted_usage.data.new(batch_size, 1).fill_(1))
    cat_sorted_usage = T.cat((v, sorted_usage), 1)
    prod_sorted_usage = T.cumprod(cat_sorted_usage, 1)[:, :-1]
    

    Why do you create the var "v", which contains "ones", and concatenate it? This does not seem the same as in the paper.

    Thanks, Peter

    opened by PeterDeWachter1998 0
  • A question about memory initialization.

    Hi,

    I am a bit confused about how we save memory states in DNC. To be more specific, at the starting point of training, we have to initialize the memory with no doubt (fill all 0s in code). Having finished the training process, I think the memory values should be saved for testing usages. But it turns out that you reset the memory hidden states to be 0s AGAIN! (just as the erase part of dnc/memory.py,Line 69-75).

    Could you please give me some explanations about this? Thank you in advance! Really need your help.

    opened by LiUzHiAn 1
  • batch_first argument doesn’t work

    Hi, I just want to let you know that with current implementation (file dnc.py, line 76:86), the batch_first will always be True. It is trivial but sometime troublesome. Have a nice day.

    opened by Trungmaster5 1
  • Unresolved reference 'output'

    https://github.com/ixaxaar/pytorch-dnc/blob/016b541223bf801f3f3a617fa3942cc12ef71be9/dnc/dnc.py#L270

    The variable named 'output' (as can be seen above) is an unresolved reference. Kindly, please fix it.

    opened by denizetkar 1