Attention-based Transformation from Latent Features to Point Clouds (AAAI 2022)

Related tags

Deep LearningAXform
Overview

Attention-based Transformation from Latent Features to Point Clouds

This repository contains a PyTorch implementation of the paper:

Attention-based Transformation from Latent Features to Point Clouds
Kaiyi Zhang, Ximing Yang, Yuan Wu, Cheng Jin
AAAI 2022

Introduction

In point cloud generation and completion, previous methods for transforming latent features to point clouds are generally based on fully connected layers (FC-based) or folding operations (Folding-based). However, point clouds generated by FC-based methods are usually troubled by outliers and rough surfaces. For folding-based methods, their data flow is large, convergence speed is slow, and they are also hard to handle the generation of non-smooth surfaces. In this work, we propose AXform, an attention-based method to transform latent features to point clouds. AXform first generates points in an interim space, using a fully connected layer. These interim points are then aggregated to generate the target point cloud. AXform takes both parameter sharing and data flow into account, which makes it has fewer outliers, fewer network parameters, and a faster convergence speed. The points generated by AXform do not have the strong 2-manifold constraint, which improves the generation of non-smooth surfaces. When AXform is expanded to multiple branches for local generations, the centripetal constraint makes it has properties of self-clustering and space consistency, which further enables unsupervised semantic segmentation. We also adopt this scheme and design AXformNet for point cloud completion. Considerable experiments on different datasets show that our methods achieve state-of-the-art results.

Dependencies

  • Python 3.6
  • CUDA 10.0
  • G++ or GCC 7.5
  • PyTorch. Codes are tested with version 1.6.0
  • (Optional) Visdom for visualization of the training process

Install all the following tools based on CUDA.

cd utils/furthestPointSampling
python3 setup.py install

# https://github.com/stevenygd/PointFlow/tree/master/metrics
cd utils/metrics/pytorch_structural_losses
make

# https://github.com/sshaoshuai/Pointnet2.PyTorch
cd utils/Pointnet2.PyTorch/pointnet2
python3 setup.py install

# https://github.com/daerduoCarey/PyTorchEMD
cd utils/PyTorchEMD
python3 setup.py install

# not used
cd utils/randPartial
python3 setup.py install

Datasets

PCN dataset (Google Drive) are used for point cloud completion.

ShapeNetCore.v2.PC2048 (Google Drive) are used for the other tasks. The point clouds are uniformly sampled from the meshes in ShapeNetCore dataset (version 2). All the point clouds are centered and scaled to [-0.5, 0.5]. We follow the official split. The sample code based on PyTorch3D can be found in utils/sample_pytorch3d.py.

Please download them to the data directory.

Training

All the arguments, e.g. gpu_ids, mode, method, hparas, num_branch, class_choice, visual, can be adjusted before training. For example:

# axform, airplane category, 16 branches
python3 axform.py --mode train --num_branch 16 --class_choice ['airplane']

# fc-based, car category
python3 models/fc_folding.py --mode train --method fc-based --class_choice ['car']

# l-gan, airplane category, not use axform
python3 models/latent_3d_points/l-gan.py --mode train --method original --class_choice ['airplane'] --ae_ckpt_path path_to_ckpt_autoencoder.pth

# axformnet, all categories, integrated
python3 axformnet.py --mode train --method integrated --class_choice None

Pre-trained models

Here we provide pre-trained models (Google Drive) for point cloud completion. The following is the suggested way to evaluate the performance of the pre-trained models.

# vanilla
python3 axformnet.py --mode test --method vanilla --ckpt_path path_to_ckpt_vanilla.pth

# integrated
python3 axformnet.py --mode test --method integrated --ckpt_path path_to_ckpt_integrated.pth

Visualization

Matplotlib is used for the visualization of results in the paper. Code for reference can be seen in utils/draw.py.

Here we recommend using Mitsuba 2 for visualization. An example code can be found in Point Cloud Renderer.

Citation

Please cite our work if you find it useful:

@article{zhang2021axform,
 title={Attention-based Transformation from Latent Features to Point Clouds},
 author={Zhang, Kaiyi and Yang, Ximing, and Wu, Yuan and Jin, Cheng},
 journal={arXiv preprint arXiv:2112.05324},
 year={2021}
}

License

This project Code is released under the MIT License (refer to the LICENSE file for details).

An implementation of Deep Forest 2021.2.1.

Deep Forest (DF) 21 DF21 is an implementation of Deep Forest 2021.2.1. It is designed to have the following advantages: Powerful: Better accuracy than

LAMDA Group, Nanjing University 795 Jan 03, 2023
Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"

CSRA This is the official code of ICCV 2021 paper: Residual Attention: A Simple But Effective Method for Multi-Label Recoginition Demo, Train and Vali

163 Dec 22, 2022
Import Python modules from dicts and JSON formatted documents.

Paker Paker is module for importing Python packages/modules from dictionaries and JSON formatted documents. It was inspired by httpimporter. Important

Wojciech Wentland 1 Sep 07, 2022
利用python脚本实现微信、支付宝账单的合并,并保存到excel文件实现自动记账,可查看可视化图表。

KeepAccounts_v2.0 KeepAccounts.exe和其配套表格能够实现微信、支付宝官方导出账单的读取合并,为每笔帐标记类型,并按月份和类型生成可视化图表。再也不用消费一笔记一笔,每月仅需10分钟,记好所有的帐。 作者: MickLife Bilibili: https://spac

159 Jan 01, 2023
GuideDog is an AI/ML-based mobile app designed to assist the lives of the visually impaired, 100% voice-controlled

Guidedog Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma GuideDog is an AI/ML-based mobile app designed to assist the lives of the visua

Kyuhee Jo 5 Nov 24, 2021
A collection of models for image<->text generation in ACM MM 2021.

Bi-directional Image and Text Generation UMT-BITG (image & text generator) Unifying Multimodal Transformer for Bi-directional Image and Text Generatio

Multimedia Research 63 Oct 30, 2022
A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

Korbinian Pöppel 47 Nov 28, 2022
Tackling the Class Imbalance Problem of Deep Learning Based Head and Neck Organ Segmentation

Info This is the code repository of the work Tackling the Class Imbalance Problem of Deep Learning Based Head and Neck Organ Segmentation from Elias T

2 Apr 20, 2022
Table-Extractor 表格抽取

(t)able-(ex)tractor 本项目旨在实现pdf表格抽取。 Models 版面分析模块(Yolo) 表格结构抽取(ResNet + Transformer) 文字识别模块(CRNN + CTC Loss) Acknowledgements TableMaster attention-i

2 Jan 15, 2022
This repository contains the DendroMap implementation for scalable and interactive exploration of image datasets in machine learning.

DendroMap DendroMap is an interactive tool to explore large-scale image datasets used for machine learning. A deep understanding of your data can be v

DIV Lab 33 Dec 30, 2022
Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions"

Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions" Environment requirement This code is based on Python

Rohan Kumar Gupta 1 Dec 19, 2021
Python codes for Lite Audio-Visual Speech Enhancement.

Lite Audio-Visual Speech Enhancement (Interspeech 2020) Introduction This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE

Shang-Yi Chuang 85 Dec 01, 2022
A TensorFlow implementation of the Mnemonic Descent Method.

MDM A Tensorflow implementation of the Mnemonic Descent Method. Mnemonic Descent Method: A recurrent process applied for end-to-end face alignment G.

123 Oct 07, 2022
Code for ICCV2021 paper PARE: Part Attention Regressor for 3D Human Body Estimation

PARE: Part Attention Regressor for 3D Human Body Estimation [ICCV 2021] PARE: Part Attention Regressor for 3D Human Body Estimation, Muhammed Kocabas,

Muhammed Kocabas 277 Jan 03, 2023
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. It is a large, multi-sourced, diverse dataset for product attr

Google Research Datasets 89 Jan 08, 2023
Direct application of DALLE-2 to video synthesis, using factored space-time Unet and Transformers

DALLE2 Video (wip) ** only to be built after DALLE2 image is done and replicated, and the importance of the prior network is validated ** Direct appli

Phil Wang 105 May 15, 2022
PECOS - Prediction for Enormous and Correlated Spaces

PECOS - Predictions for Enormous and Correlated Output Spaces PECOS is a versatile and modular machine learning (ML) framework for fast learning and i

Amazon 387 Jan 04, 2023
A hyperparameter optimization framework

Optuna: A hyperparameter optimization framework Website | Docs | Install Guide | Tutorial Optuna is an automatic hyperparameter optimization software

7.4k Jan 04, 2023
This repo is about implementing different approaches of pose estimation and also is a sub-task of the smart hospital bed project :smile:

Pose-Estimation This repo is a sub-task of the smart hospital bed project which is about implementing the task of pose estimation 😄 Many thanks to th

Max 11 Oct 17, 2022