PyTorch implementation of SQN based on CloserLook3D's encoder

Overview

SQN_pytorch

This repo is an implementation of Semantic Query Network (SQN) using CloserLook3D's encoder in Pytorch. For TensorFlow implementation, check our SQN_tensorflow repo.

Caution: currently, this repo does not achieve a satisfactory result as the SQN paper reports. For performance details, check performance section.

The repo is still under development, with the aim of reaching the level of performance reported in the SQN paper.(Note: our SQN_tensorflow repo has slightly higher performance than this pytorch repo.)

Please open an issue, if you have any comments and suggestions for improving the model performance.

TODOs

  • implement the training strategy mentioned in the Appendix of the paper.
  • ablation study
  • benchmark weak supervision

Install python packages

The latest codes are tested on two Ubuntu settings:

  • Ubuntu 18.04, Nvidia 1080, CUDA 10.1, PyTorch 1.4 and Python 3.6
  • Ubuntu 18.04, Nvidia 3090, CUDA 11.3, PyTorch 1.4 and Python 3.6

For details setting up the development environment, check CloserLook3D Pytorch version. To facilitate settings, below I also provide my own bash script( install.sh ) to create a conda environment from scratch for this repo. (You may need tailor this script according to your own system)

#!/bin/bash
ENV_NAME='closerlook'
conda create –n $ENV_NAME python=3.6.10 -y
source activate $ENV_NAME
conda install -c anaconda pillow=6.2 -y
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch -y
conda install -c conda-forge opencv -y
pip3 install termcolor tensorboard h5py easydict

Datasets

Take S3DIS as an example.

Scene Segmentation on S3DIS

You can download the S3DIS dataset from here (4.8 GB). You only need to download the file named Stanford3dDataset_v1.2.zip, unzip and move (or link) it to data/S3DIS/Stanford3dDataset_v1.2. (same as the CloserLook3D repo setting.)

The file structure should look like:

<root>
├── cfgs
│   └── s3dis
├── data
│   └── S3DIS
│       └── Stanford3dDataset_v1.2
│           ├── Area_1
│           ├── Area_2
│           ├── Area_3
│           ├── Area_4
│           ├── Area_5
│           └── Area_6
├── init.sh
├── datasets
├── function
├── models
├── ops
└── utils

run prepare-s3dis-sqn.sh to preprocess the S3DIS dataset. This script will generate a processed folder with the below structure with five types of data, including: raw, sub-sampled point clouds for each area, KDtrees for each sub-sampled area, projection indices for each raw point over the sub-sampled area and weak labels for raw and sub-sampled point clouds (involving different weak proportion of the dataset, e.g., 0.1, 0.01, 0.001, etc.. Details check datasets/S3DIS_sqn.py and my summary notes in this file.

The processed folder is organized as follows:

<root>
├── data
│   └── S3DIS
│       └── Stanford3dDataset_v1.2
│           ├── Area_1
│           ├── Area_2
│           ├── Area_3
│           ├── Area_4
│           ├── Area_5
│           ├── Area_6
│           └── processed
│             ├── weak_label_0.01
│             ├── weak_label_1.0
│             ├── Area_1_0.040_sub.pkl
│             ├── Area_1.pkl
│             ├── ...(many other pkl files)

Compile custom CUDA operators

sh init.sh

Run

use the run-sqn.sh script for training or evaluation.

The core training script is as follows:

python -m torch.distributed.launch \
--master_port 1234567 \
--nproc_per_node ${num_gpu} \
function/train_s3dis_dist_sqn.py \
--dataset_name ${dataset_name} \
--cfg cfgs/${dataset_name}/pospool_xyz_avg_sqn.yaml \
--num_points ${num_points} \
--batch_size ${batch_size} \
--val_freq 20 \
--weak_ratio ${weak_ratio}

The core evaluation script is as follows:

python -m torch.distributed.launch \
--master_port 12346 \
--nproc_per_node 1 \
function/evaluate_s3dis_dist_sqn.py \
--cfg cfgs/s3dis/pospool_xyz_avg_sqn.yaml \
--load_path <checkpoint>
[--log_dir <log directory>]

Performance on S3DIS

The experiments are still in progress due to my slow GPU.

Model Weak ratio Performance (mIoU, %) Description
Official RandLA-Net 100% 63.0 Fully supervised method trained with full labels.
Official SQN 1/1000 61.4 This official SQN uses additional techniques to improve the performance, our replicaed SQN currently does not investigate this yet. Official SQN does not provide results of S3DIS under the weak ratio of 1/10 and 1/100
Our replicated SQN 1/10 51.4 Use PosPool (s) as the encoder whose width=36, due to limited GPU usage and active learning is currently not used.
Our replicated SQN 1/100 25.22 Use PosPool (s) as the encoder whose width=36, due to limited GPU usage and active learning is currently not used.
Our replicated SQN 1/1000 21.10 Use PosPool (s) as the encoder whose width=36, due to limited GPU usage and active learning is currently not used.

Acknowledgements

Our pytorch codes borrowed a lot from CloserLook3D and the custom trilinear interoplation CUDA ops are modified from erikwijmans's Pointnet2_PyTorch.

Citation

If you find our work useful in your research, please consider citing:

@article{pytorchpointnet++,
    Author = {YIN, Chao},
    Title = {SQN Pytorch implementation based on CloserLook3D's encoder},
    Journal = {https://github.com/PointCloudYC/SQN_pytorch},
    Year = {2021}
   }

@article{hu2021sqn,
    title={SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000x Fewer Labels},
    author={Hu, Qingyong and Yang, Bo and Fang, Guangchi and Guo, Yulan and Leonardis, Ales and Trigoni, Niki and Markham, Andrew},
    journal={arXiv preprint arXiv:2104.04891},
    year={2021}
  }
Owner
PointCloudYC
Ph.D candidate at HKUST, focus on point cloud processing and deep learning.
PointCloudYC
Interactive Visualization to empower domain experts to align ML model behaviors with their knowledge.

An interactive visualization system designed to helps domain experts responsibly edit Generalized Additive Models (GAMs). For more information, check

InterpretML 83 Jan 04, 2023
😇A pyTorch implementation of the DeepMoji model: state-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc

------ Update September 2018 ------ It's been a year since TorchMoji and DeepMoji were released. We're trying to understand how it's being used such t

Hugging Face 865 Dec 24, 2022
AgeGuesser: deep learning based age estimation system. Powered by EfficientNet and Yolov5

AgeGuesser AgeGuesser is an end-to-end, deep-learning based Age Estimation system, presented at the CAIP 2021 conference. You can find the related pap

5 Nov 10, 2022
A collection of scripts I developed for personal and working projects.

A collection of scripts I developed for personal and working projects Table of contents Introduction Repository diagram structure List of scripts pyth

Gianluca Bianco 109 Dec 26, 2022
A standard framework for modelling Deep Learning Models for tabular data

PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike.

801 Jan 08, 2023
Cross View SLAM

Cross View SLAM This is the associated code and dataset repository for our paper I. D. Miller et al., "Any Way You Look at It: Semantic Crossview Loca

Ian D. Miller 99 Dec 09, 2022
Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds

Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds Xinxin Zuo, Sen Wang, Minglun Gong, Li Cheng Prerequisites We have tested the code on Ubun

41 Dec 12, 2022
Generative Handwriting using LSTM Mixture Density Network with TensorFlow

Generative Handwriting Demo using TensorFlow An attempt to implement the random handwriting generation portion of Alex Graves' paper. See my blog post

hardmaru 686 Nov 24, 2022
Implementation of Bottleneck Transformer in Pytorch

Bottleneck Transformer - Pytorch Implementation of Bottleneck Transformer, SotA visual recognition model with convolution + attention that outperforms

Phil Wang 621 Jan 06, 2023
Vision Transformer and MLP-Mixer Architectures

Vision Transformer and MLP-Mixer Architectures Update (2.7.2021): Added the "When Vision Transformers Outperform ResNets..." paper, and SAM (Sharpness

Google Research 6.4k Jan 04, 2023
FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI

FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI 声明: 本项目仅限于学习交流,不可用于非法用途,包括但不限于:用于游戏外挂等,使用本项目产生的任何后果与本人无关! 简介 本项目基于yolov5,实现了一款FPS类游戏(CF、CSGO等)的自瞄AI,本项目旨在使用现

Fabian 246 Dec 28, 2022
This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation)

This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation) Usage example python dynamic_inverted_softmax.py --sims_train

36 Dec 29, 2022
A command line simple note taking app

Why yet another note taking program? note was designed with a very specific target in mind: me, and my 2354 scraps of paper. It runs from the command

64 Nov 20, 2022
Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy.

Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy. Now with tensorflow 1.0 support. Evaluation usa

Marcel R. 349 Aug 06, 2022
Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Graph-to-Graph Transformers Self-attention models, such as Transformer, have been hugely successful in a wide range of natural language processing (NL

Idiap Research Institute 40 Aug 14, 2022
Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Temporal copying and local hallucination for video inpainting This repository contains the implementation of my master's thesis "Temporal copying and

David Álvarez de la Torre 1 Dec 02, 2022
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation Demo video: CVPR 2021 Oral: Single Channel Manipulation: Localized or attribu

Zongze Wu 267 Dec 30, 2022
Tensorflow 2 Object Detection API kurulumu, GPU desteği, custom model hazırlama

Tensorflow 2 Object Detection API Bu tutorial, TensorFlow 2.x'in kararlı sürümü olan TensorFlow 2.3'ye yöneliktir. Bu, görüntülerde / videoda nesne a

46 Nov 20, 2022
Official implementation of "Watermarking Images in Self-Supervised Latent-Spaces"

🔍 Watermarking Images in Self-Supervised Latent-Spaces PyTorch implementation and pretrained models for the paper. For details, see Watermarking Imag

Meta Research 32 Dec 13, 2022
The official code of Anisotropic Stroke Control for Multiple Artists Style Transfer

ASMA-GAN Anisotropic Stroke Control for Multiple Artists Style Transfer Proceedings of the 28th ACM International Conference on Multimedia The officia

Six_God 146 Nov 21, 2022