Implementation of the Point Transformer layer, in Pytorch

Overview

Point Transformer - Pytorch

Implementation of the Point Transformer self-attention layer, in Pytorch. The simple circuit above seemed to have allowed their group to outperform all previous methods in point cloud classification and segmentation.

Install

$ pip install point-transformer-pytorch

Usage

import torch
from point_transformer_pytorch import PointTransformerLayer

attn = PointTransformerLayer(
    dim = 128,
    pos_mlp_hidden_dim = 64,
    attn_mlp_hidden_mult = 4
)

x = torch.randn(1, 16, 128)
pos = torch.randn(1, 16, 3)

attn(x, pos) # (1, 16, 128)

Citations

@misc{zhao2020point,
    title={Point Transformer}, 
    author={Hengshuang Zhao and Li Jiang and Jiaya Jia and Philip Torr and Vladlen Koltun},
    year={2020},
    eprint={2012.09164},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
Comments
  • Did You Falsify Your Experimental Results???

    Did You Falsify Your Experimental Results???

    No one can reproduce the performance reported in your original paper. Please post your pre-trained model or your original code. Otherwise, we must question your academic ethics!****

    opened by TruthIsEveryThing 1
  • Issues with my wrapper code

    Issues with my wrapper code

    I wrote some wrapper code to turn this layer into a full transformer and I can't seem to figure out what is going wrong. The following works:

    import torch
    from torch import nn, einsum
    import x_transformers
    from point_transformer_pytorch import PointTransformerLayer
    
    layer = PointTransformerLayer(
        dim = 7,
        pos_mlp_hidden_dim = 64,
        attn_mlp_hidden_mult = 4,
        num_neighbors = 16          # only the 16 nearest neighbors would be attended to for each point
    )
    
    feats = torch.randn(1, 5, 7)
    pos = torch.randn(1, 5, 3)
    mask = torch.ones(1, 5).bool()
    
    y = layer(feats, pos, mask = mask)
    

    However this doesn't work

    import torch
    from torch import nn, einsum
    import x_transformers
    from point_transformer_pytorch import PointTransformerLayer
    
    class PointTransformer(nn.Module):
        def __init__(self, feats, mask, neighbors = 16, layers=5, dimension=5):
            
            super().__init__()
            
            self.feats = feats
            self.mask = mask
            self.neighbors = neighbors
            
            self.layers = []
            
            for _ in range(layers):
                self.layers.append(PointTransformerLayer(
                    dim = dimension,
                    pos_mlp_hidden_dim = 64,
                    attn_mlp_hidden_mult = 4,
                    num_neighbors = self.neighbors
                ))
    
        def forward(self, pos):
            curr_pos = pos
            for layer in self.layers:
                print(curr_pos)
                curr_pos = layer(self.feats, pos, self.mask)
                print("----")
            return curr_pos
    
    model = PointTransformer(feats, mask)
    model(pos)
    

    The error I'm getting is mat1 and mat2 shapes cannot be multiplied (5x7 and 5x15)

    opened by StellaAthena 1
  • point clouds with different number of points

    point clouds with different number of points

    Great job! I have a question about the number of the points in the point cloud. Do you have any suggestion to deal with point clouds with different point. As I know, point cloud models are always applied in Shapenet which contains point clouds with 2048 points. So what can we do if the number of the point clouds is not constant?

    opened by 1999kevin 0
  • Scalar attention or vector attention in the multi-head variant

    Scalar attention or vector attention in the multi-head variant

    It seems that the implementation of the multi-head point transformer produces scalar attention scores for each head.

    https://github.com/lucidrains/point-transformer-pytorch/blob/99bc3958138d8c9d3b882e4ac50b1a18a86160fe/point_transformer_pytorch/multihead_point_transformer_pytorch.py#L62

    opened by ZikangZhou 2
  • The layer structure and mask

    The layer structure and mask

    Hi,

    Thanks for this contribution. In the implementation of attn_mlp the first linear layer increases the dimension. Is this a standard practice because I did not find any details about this in the paper. Also paper also does not describe use of mask, is this again some standard practice for attention layers?

    Thanks!!

    opened by ayushais 1
  • Invariant to cardinality?

    Invariant to cardinality?

    Dear Authors, In your paper you wrote: "The layer is invariant to permutation and cardinality and is thus inherently suited to point cloud processing."

    I do not understand this statement, because your PointTransformerLayer https://github.com/lucidrains/point-transformer-pytorch/blob/main/point_transformer_pytorch/point_transformer_pytorch.py#L31 requires the dim parameter in initialization. So it always expects dim elements in input. What if a point cloud has dim+1 points?

    Thank you in advance.

    opened by decadenza 0
  • Cost too much memory

    Cost too much memory

    I'm not sure whether I used the point-transformer correctly: I just implemented one block for training, and the data shape of (x, pos) in each gpu are both [16, 2048, 3], later I was informed that my gpu is running out of the memory(11.77 GB total capacity)

    opened by JLU-Neal 9
Releases(0.1.5)
Owner
Phil Wang
Working with Attention. It's all we need.
Phil Wang
A foreign language learning aid using a neural network to predict probability of translating foreign words

Langy Langy is a reading-focused foreign language learning aid orientated towards young children. Reading is an activity that every child knows. It is

Shona Lowden 6 Nov 17, 2021
U-Net: Convolutional Networks for Biomedical Image Segmentation

Deep Learning Tutorial for Kaggle Ultrasound Nerve Segmentation competition, using Keras This tutorial shows how to use Keras library to build deep ne

Yihui He 401 Nov 21, 2022
[EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

RoSTER The source code used for Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training, p

Yu Meng 60 Dec 30, 2022
Companion code for the paper Theoretical characterization of uncertainty in high-dimensional linear classification

Companion code for the paper Theoretical characterization of uncertainty in high-dimensional linear classification Usage The required packages are lis

0 Feb 07, 2022
offical implement of our Lifelong Person Re-Identification via Adaptive Knowledge Accumulation in CVPR2021

LifelongReID Offical implementation of our Lifelong Person Re-Identification via Adaptive Knowledge Accumulation in CVPR2021 by Nan Pu, Wei Chen, Yu L

PeterPu 76 Dec 08, 2022
Lite-HRNet: A Lightweight High-Resolution Network

LiteHRNet Benchmark 🔥 🔥 Based on MMsegmentation 🔥 🔥 Cityscapes FCN resize concat config mIoU last mAcc last eval last mIoU best mAcc best eval bes

16 Dec 12, 2022
An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models.

DeepNER An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models. This repository contains complex Deep

Derrick 9 May 30, 2022
Convert Mission Planner (ArduCopter) Waypoint Missions to Litchi CSV Format to execute on DJI Drones

Mission Planner to Litchi Convert Mission Planner (ArduCopter) Waypoint Surveys to Litchi CSV Format to execute on DJI Drones Litchi doesn't support S

Yaros 24 Dec 09, 2022
Code for "Searching for Efficient Multi-Stage Vision Transformers"

Searching for Efficient Multi-Stage Vision Transformers This repository contains the official Pytorch implementation of "Searching for Efficient Multi

Yi-Lun Liao 62 Oct 25, 2022
Relaxed-machines - explorations in neuro-symbolic differentiable interpreters

Relaxed Machines Explorations in neuro-symbolic differentiable interpreters. Baby steps: inc_stop Libraries JAX Haiku Optax Resources Chapter 3 (∂4: A

Nada Amin 6 Feb 02, 2022
Implementation of the HMAX model of vision in PyTorch

PyTorch implementation of HMAX PyTorch implementation of the HMAX model that closely follows that of the MATLAB implementation of The Laboratory for C

Marijn van Vliet 52 Oct 13, 2022
How to train a CNN to 99% accuracy on MNIST in less than a second on a laptop

Training a NN to 99% accuracy on MNIST in 0.76 seconds A quick study on how fast you can reach 99% accuracy on MNIST with a single laptop. Our answer

Tuomas Oikarinen 42 Dec 10, 2022
Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [CVPR 2021]

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [CVPR 2021] Abstract Analyzing complex scenes with DNN is a challenging ta

Irene Yuan 24 Jun 27, 2022
Code accompanying "Dynamic Neural Relational Inference" from CVPR 2020

Code accompanying "Dynamic Neural Relational Inference" This codebase accompanies the paper "Dynamic Neural Relational Inference" from CVPR 2020. This

Colin Graber 48 Dec 23, 2022
PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition [CVPR 2021].

Involution: Inverting the Inherence of Convolution for Visual Recognition Unofficial PyTorch reimplementation of the paper Involution: Inverting the I

Christoph Reich 100 Dec 01, 2022
Code for DeepCurrents: Learning Implicit Representations of Shapes with Boundaries

DeepCurrents | Webpage | Paper DeepCurrents: Learning Implicit Representations of Shapes with Boundaries David Palmer*, Dmitriy Smirnov*, Stephanie Wa

Dima Smirnov 36 Dec 08, 2022
Fast, general, and tested differentiable structured prediction in PyTorch

Fast, general, and tested differentiable structured prediction in PyTorch

HNLP 1.1k Dec 16, 2022
Minimal deep learning library written from scratch in Python, using NumPy/CuPy.

SmallPebble Project status: experimental, unstable. SmallPebble is a minimal/toy automatic differentiation/deep learning library written from scratch

Sidney Radcliffe 92 Dec 30, 2022
StyleSwin: Transformer-based GAN for High-resolution Image Generation

StyleSwin This repo is the official implementation of "StyleSwin: Transformer-based GAN for High-resolution Image Generation". By Bowen Zhang, Shuyang

Microsoft 349 Dec 28, 2022
OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis

OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis Overview OpenABC-D is a large-scale labeled dataset generate

NYU Machine-Learning guided Design Automation (MLDA) 31 Nov 22, 2022