Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Overview

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) are notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary; remarkably, on ScanNet, even using 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance obtained with full annotations.

[CVPR 2021 Paper] [Video] [Project Page] [ScanNet Data-Efficient Benchmark]
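
For intuition, the core idea can be sketched as a partitioned InfoNCE-style loss: for each anchor point, negatives are drawn only from points falling in the same spatial partition of the scene (binned by relative angle and distance), so that scene contexts shape the contrastive signal. The snippet below is a conceptual sketch only, not the repository's implementation; the partition scheme (4 angular x 2 distance bins), the temperature, and the tensor shapes are illustrative assumptions.

# Conceptual sketch of a scene-context partitioned contrastive loss.
# NOT the repository's implementation; partitioning and hyperparameters are illustrative.
import math
import torch
import torch.nn.functional as F

def scene_context_loss(feats0, feats1, xyz0, tau=0.4, n_angle=4, dist_split=1.0):
    # feats0, feats1: (N, C) features of N matched points in two partial views.
    # xyz0: (N, 3) coordinates in view 0, used only to bin negatives by context.
    feats0 = F.normalize(feats0, dim=1)
    feats1 = F.normalize(feats1, dim=1)
    logits = feats0 @ feats1.t() / tau                    # (N, N) pairwise similarities
    rel = xyz0.unsqueeze(0) - xyz0.unsqueeze(1)           # offsets from each anchor point
    angle = torch.atan2(rel[..., 1], rel[..., 0])         # in [-pi, pi]
    a_bin = ((angle + math.pi) / (2 * math.pi) * n_angle).long().clamp_(0, n_angle - 1)
    d_bin = (rel.norm(dim=-1) > dist_split).long()        # near / far split
    part = a_bin * 2 + d_bin                              # partition id per (anchor, point)
    target = torch.arange(len(feats0), device=feats0.device)
    eye = torch.eye(len(feats0), dtype=torch.bool, device=feats0.device)
    loss = 0.0
    for p in range(2 * n_angle):
        mask = (part == p) | eye                          # positive pair + partition-p negatives
        loss = loss + F.cross_entropy(logits.masked_fill(~mask, float('-inf')), target)
    return loss / (2 * n_angle)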

Environment

This codebase was tested with the following environment configurations.

  • Ubuntu 20.04
  • CUDA 10.2
  • GCC 7.3.0
  • Python 3.7.7
  • PyTorch 1.5.1
  • MinkowskiEngine v0.4.3

Installation

We use conda for the installation process:

# Install virtual env and PyTorch
conda create -n sparseconv043 python=3.7
conda activate sparseconv043
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch

# Compile and install MinkowskiEngine 0.4.3.
conda install mkl mkl-include -c intel
wget https://github.com/NVIDIA/MinkowskiEngine/archive/refs/tags/v0.4.3.zip
unzip v0.4.3.zip
cd MinkowskiEngine-0.4.3
python setup.py install
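
As an optional sanity check before moving on, the following snippet verifies that the compiled MinkowskiEngine build imports cleanly and that CUDA is visible; it only assumes the conda environment created above is active.

# sanity_check.py -- optional: verify the MinkowskiEngine build
import torch
import MinkowskiEngine as ME

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("MinkowskiEngine:", ME.__version__)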

Next, clone the Contrastive Scene Contexts repository and install the requirements from the root directory.

git clone https://github.com/facebookresearch/ContrastiveSceneContexts.git
cd ContrastiveSceneContexts
pip install -r requirements.txt

Our code also depends on PointGroup and PointNet++.

# Install OPs in PointGroup by:
conda install -c bioconda google-sparsehash
cd downstream/semseg/lib/bfs/ops
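# YOUR_ENV_PATH is the path of your conda environment (e.g. .../envs/sparseconv043)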
python setup.py build_ext --include-dirs=YOUR_ENV_PATH/include
python setup.py install

# Install PointNet++
cd downstream/votenet/models/backbone/pointnet2
python setup.py install

Pre-training on ScanNet

Data Pre-processing

For pre-training, generate the ScanNet pair data with the following script (edit TARGET and SCANNET_DIR in the script accordingly).

cd pretrain/scannet_pair
./preprocess.sh

This script first extracts point clouds from the partial frames and then, for each scene, computes a file list of overlapping partial frames. Next, generate a combined text file named overlap30.txt from the per-scene file lists by running:

cd pretrain/scannet_pair
python generate_list.py --target_dir TARGET

The resulting overlap30.txt should be placed in the TARGET/splits folder.
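
For reference, the aggregation performed by generate_list.py amounts to concatenating the per-scene file lists into a single file. The sketch below illustrates that step; the per-scene file name and directory layout used here are assumptions, not the script's exact conventions.

# Illustrative sketch of combining per-scene file lists into overlap30.txt.
# The glob pattern below is an assumed layout; see generate_list.py for the real one.
import glob, os

target = "TARGET"  # same TARGET as used in preprocess.sh
entries = []
for flist in sorted(glob.glob(os.path.join(target, "*", "overlap.txt"))):
    with open(flist) as f:
        entries.extend(line.strip() for line in f if line.strip())

os.makedirs(os.path.join(target, "splits"), exist_ok=True)
with open(os.path.join(target, "splits", "overlap30.txt"), "w") as f:
    f.write("\n".join(entries) + "\n")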

Pre-training

Our codebase supports multi-GPU training via PyTorch's DistributedDataParallel (DDP) module. To pre-train Contrastive Scene Contexts with 8 GPUs (batch_size=32, 4 per GPU) on a single server:

cd pretrain/contrastive_scene_contexts
# Pretrain with SparseConv backbone
OUT_DIR=./output DATASET=ROOT_PATH_OF_DATA scripts/pretrain_sparseconv.sh
# Pretrain with PointNet++ backbone
OUT_DIR=./output DATASET=ROOT_PATH_OF_DATA scripts/pretrain_pointnet2.sh
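
Under the hood, the scripts rely on standard PyTorch DDP: one process per GPU, with the global batch of 32 sharded into 4 samples per process. The skeleton below is a minimal illustration of that setup on recent PyTorch (launched with torchrun; the tiny model and dataset are stand-ins, not the repository's classes).

# ddp_sketch.py -- minimal DDP skeleton; stand-in model/dataset, not the repo's.
# Launch: torchrun --nproc_per_node=8 ddp_sketch.py
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    dist.init_process_group("nccl")              # one process per GPU
    rank = dist.get_rank()
    torch.cuda.set_device(rank)
    model = DDP(nn.Linear(3, 32).cuda(rank), device_ids=[rank])
    data = TensorDataset(torch.randn(256, 3))
    sampler = DistributedSampler(data)           # shards the dataset across ranks
    loader = DataLoader(data, batch_size=4, sampler=sampler)  # 4 per GPU x 8 GPUs = 32
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for (x,) in loader:
        loss = model(x.cuda(rank)).pow(2).mean() # dummy objective
        opt.zero_grad(); loss.backward(); opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()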

ScanNet Downstream Tasks

Data Pre-Processing

We provide the code for pre-processing the data for the ScanNet downstream tasks. Run the following code to generate the training data for semantic segmentation and instance segmentation.

# Edit path variables, SCANNET_OUT_PATH
cd downstream/semseg/lib/datasets/preprocessing
python scannet.py

For ScanNet detection data generation, please refer to VoteNet ScanNet Data. Run the following command to soft-link the generated detection data (located at PATH_DET_DATA) to the expected location:

# soft link detection data
cd downstream/det/
ln -s PATH_DET_DATA datasets/scannet/scannet_train_detection_data

For Data-Efficient Learning, download the scene_list and points_list as well as bbox_list from the ScanNet Data-Efficient Benchmark. To perform Active Selection for points_list, run the following code:

# Get features per point
cd downstream/semseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/inference_features.sh
# run k-means on feature space
cd lib
python sampling_points.py --point_data SCANNET_OUT_PATH --feat_data PATH_CHECKPOINT
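
Conceptually, the active selection step clusters the per-point features with k-means and keeps, for each cluster, the point nearest its centroid as the one to annotate; sampling_points.py is the authoritative implementation. The sketch below illustrates the idea, with the array shapes and scikit-learn usage being assumptions.

# Illustrative sketch of k-means based active point selection (not sampling_points.py).
import numpy as np
from sklearn.cluster import KMeans

def select_points(feats, num_points=20):
    # feats: (N, C) per-point features produced by inference_features.sh
    km = KMeans(n_clusters=num_points, n_init=10).fit(feats)
    selected = []
    for c in range(num_points):
        idx = np.where(km.labels_ == c)[0]
        d = np.linalg.norm(feats[idx] - km.cluster_centers_[c], axis=1)
        selected.append(idx[d.argmin()])   # most representative point of cluster c
    return np.asarray(selected)            # indices of points to annotate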

Semantic Segmentation

We provide code for the semantic segmentation experiments conducted in our paper. Our code supports multi-GPU training. To train with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Points Annotation, run the following code (see the label-masking sketch after the code block):

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_INDS=PATH_SCENE_LIST ./scripts/data_efficient/by_points.sh
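
For intuition, limited-points training amounts to masking the dense labels so that supervision flows only through the sampled indices. A minimal sketch follows; the ignore value of 255 is a common convention and may differ from the ignore_index actually used in the codebase.

# Sketch: keep labels only at sampled indices, ignore everything else.
import numpy as np

def mask_labels(labels, sampled_inds, ignore_index=255):
    # labels: (N,) dense per-point labels; sampled_inds: annotated point indices
    masked = np.full_like(labels, ignore_index)
    masked[sampled_inds] = labels[sampled_inds]
    return masked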

Model Zoo

We also provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running:

# PATH_CHECKPOINT points to downloaded pre-trained model path:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

Training Data   mIoU (val)   Initialization   Pre-trained Model   Logs   Tensorboard
1% scenes       29.3         download         download            link   link
5% scenes       45.4         download         download            link   link
10% scenes      59.5         download         download            link   link
20% scenes      64.1         download         download            link   link
100% scenes     73.8         download         download            link   link
20 points       53.8         download         download            link   link
50 points       62.9         download         download            link   link
100 points      66.9         download         download            link   link
200 points      69.0         download         download            link   link

Instance Segmentation

We provide code for the instance segmentation experiments conducted in our paper. Our code supports multi-GPU training. To train with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Points Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_INDS=PATH_POINTS_LIST ./scripts/data_efficient/by_points.sh

For the ScanNet Benchmark, run the following code (trains on train+val and evaluates on val):

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet_benchmark.sh

Model Zoo

We provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

To submit to the ScanNet Benchmark with our pre-trained model, run the following command (the submission files are written to output/benchmark_instance):

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet_benchmark.sh

Training Data   mAP@0.5 (val)         Initialization   Pre-trained Model   Logs   Curves
1% scenes       12.3                  download         download            link   link
5% scenes       33.9                  download         download            link   link
10% scenes      45.3                  download         download            link   link
20% scenes      49.8                  download         download            link   link
100% scenes     59.4                  download         download            link   link
20 points       27.2                  download         download            link   link
50 points       35.7                  download         download            link   link
100 points      43.6                  download         download            link   link
200 points      50.4                  download         download            link   link
train + val     76.5 (64.8 on test)   download         download            link   link

3D Object Detection

We provide the code for the 3D object detection downstream task. The code is adapted directly from VoteNet. Additionally, we provide two backbones, namely PointNet++ and SparseConv. To fine-tune on the downstream task, run the following commands:

cd downstream/votenet/
# train sparseconv backbone
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh
# train pointnet++ backbone
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet_pointnet.sh

For Limited Scene Reconstruction, run following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Bounding Box Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_BBOX=PATH_BBOX_LIST ./scripts/data_efficient/by_bboxes.sh

For submitting to the ScanNet Data-Efficient Benchmark, set test.write_to_benchmark=True in downstream/votenet/scripts/test_scannet.sh or downstream/votenet/scripts/test_scannet_pointnet.sh.

Model Zoo

We provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running the following code.

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

Training Data              mAP@0.5 (val)   mAP@0.25 (val)   Initialization   Pre-trained Model   Logs   Curves
10% scenes                 9.9             24.7             download         download            link   link
20% scenes                 21.4            41.4             download         download            link   link
40% scenes                 29.5            52.0             download         download            link   link
80% scenes                 36.3            56.3             download         download            link   link
100% scenes                39.3            59.1             download         download            link   link
100% scenes (PointNet++)   39.2            62.5             download         download            link   link
1 bbox                     30.3            54.5             download         download            link   link
2 bboxes                   32.4            55.3             download         download            link   link
4 bboxes                   34.6            58.9             download         download            link   link
7 bboxes                   35.9            59.7             download         download            link   link

Stanford 3D (S3DIS) Fine-tuning

Data Pre-Processing

We provide the code for pre-processing the data for the Stanford 3D (S3DIS) downstream tasks. Run the following code to generate the training data for semantic segmentation and instance segmentation.

# Edit path variables, STANFORD_3D_OUT_PATH
cd downstream/semseg/lib/datasets/preprocessing
python stanford.py

Semantic Segmentation

We provide code for the semantic segmentation experiments conducted in our paper. Our code supports multi-GPU training. To fine-tune with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_stanford3d.sh

Model Zoo

We provide our pre-trained model and log file for reference. You can evaluate our pre-trained model by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/semseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_stanford3d.sh

Training Data   mIoU (val)   Initialization   Pre-trained Model   Logs   Tensorboard
100% scenes     72.2         download         download            link   link

Instance Segmentation

We provide code for the instance segmentation experiments conducted in our paper. Our code supports multi-GPU training. To fine-tune with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_stanford3d.sh

Model Zoo

We provide our pre-trained model and log file for reference. You can evaluate our pre-trained model by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_stanford3d.sh

Training Data   mAP@0.5 (val)   Initialization   Pre-trained Model   Logs   Tensorboard
100% scenes     63.4            download         download            link   link

SUN-RGBD Fine-tuning

Data Pre-Processing

For SUN-RGBD detection data generation, please refer to VoteNet SUN-RGBD Data. To soft-link the generated SUN-RGBD detection data (located at SUN_RGBD_DATA_PATH) to the expected location, run the following commands:

cd downstream/det/datasets/sunrgbd
# soft link
ln -s SUN_RGBD_DATA_PATH/sunrgbd_pc_bbox_votes_50k_v1_train sunrgbd_pc_bbox_votes_50k_v1_train
ln -s SUN_RGBD_DATA_PATH/sunrgbd_pc_bbox_votes_50k_v1_val sunrgbd_pc_bbox_votes_50k_v1_val

3D Object Detection

We provide the code for the 3D object detection downstream task. The code is adapted directly from VoteNet. To fine-tune on the downstream task, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_sunrgbd.sh

Model Zoo

We provide our pre-trained checkpoints (and log file) for reference. You can evaluate our pre-trained model by setting PATH_CHECKPOINT to the downloaded model path:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_sunrgbd.sh

Training Data   mAP@0.5 (val)   mAP@0.25 (val)   Initialization   Pre-trained Model   Logs   Curves
100% scenes     36.4            58.9             download         download            link   link

Citing our paper

@article{hou2020exploring,
  title={Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts},
  author={Hou, Ji and Graham, Benjamin and Nie{\ss}ner, Matthias and Xie, Saining},
  journal={arXiv preprint arXiv:2012.09165},
  year={2020}
}

License

Contrastive Scene Contexts is released under the MIT License. See the LICENSE file for more details.
