Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL)

Overview

Scribble-Supervised LiDAR Semantic Segmentation

Dataset and code release for the paper Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL).
Authors: Ozan Unal, Dengxin Dai, Luc Van Gool

Abstract: Densely annotating LiDAR point clouds remains too expensive and time-consuming to keep up with the ever growing volume of data. While current literature focuses on fully-supervised performance, developing efficient methods that take advantage of realistic weak supervision have yet to be explored. In this paper, we propose using scribbles to annotate LiDAR point clouds and release ScribbleKITTI, the first scribble-annotated dataset for LiDAR semantic segmentation. Furthermore, we present a pipeline to reduce the performance gap that arises when using such weak annotations. Our pipeline comprises of three stand-alone contributions that can be combined with any LiDAR semantic segmentation model to achieve up to 95.7% of the fully-supervised performance while using only 8% labeled points.


News

[2022-04] We release our training code with the Cylinder3D backbone.
[2022-03] Our paper is accepted to CVPR 2022 for an ORAL presentation!
[2022-03] We release ScribbleKITTI, the first scribble-annotated dataset for LiDAR semantic segmentation.


ScribbleKITTI

teaser

We annotate the train-split of SemanticKITTI based on KITTI which consists of 10 sequences, 19130 scans, 2349 million points. ScribbleKITTI contains 189 million labeled points corresponding to only 8.06% of the total point count. We choose SemanticKITTI for its current wide use and established benchmark. We retain the same 19 classes to encourage easy transitioning towards research into scribble-supervised LiDAR semantic segmentation.

Our scribble labels can be downloaded here (118.2MB).

Data organization

The data is organized in the format of SemanticKITTI. The dataset can be used with any existing dataloader by changing the label directory from labels to scribbles.

sequences/
    ├── 00/
    │   ├── scribbles/
    │   │     ├ 000000.label
    │   │     └ 000001.label
    ├── 01/
    ├── 02/
    .
    .
    └── 10/

Scribble-Supervised LiDAR Semantic Segmentation

pipeline

We develop a novel learning method for 3D semantic segmentation that directly exploits scribble annotated LiDAR data. We introduce three stand-alone contributions that can be combined with any 3D LiDAR segmentation model: a teacher-student consistency loss on unlabeled points, a self-training scheme designed for outdoor LiDAR scenes, and a novel descriptor that improves pseudo-label quality.

Specifically, we first introduce a weak form of supervision from unlabeled points via a consistency loss. Secondly, we strengthen this supervision by fixing the confident predictions of our model on the unlabeled points and employing self-training with pseudo-labels. The standard self-training strategy is however very prone to confirmation bias due to the long-tailed distribution of classes inherent in autonomous driving scenes and the large variation of point density across different ranges inherent in LiDAR data. To combat these, we develop a class-range-balanced pseudo-labeling strategy to uniformly sample target labels across all classes and ranges. Finally, to improve the quality of our pseudo-labels, we augment the input point cloud by using a novel descriptor that provides each point with the semantic prior about its local surrounding at multiple resolutions.

Putting these two contributions along with the mean teacher framework, our scribble-based pipeline achieves up to 95.7% relative performance of fully supervised training while using only 8% labeled points.

Installation

For the installation, we recommend setting up a virtual environment:

python -m venv ~/venv/scribblekitti
source ~/venv/scribblekitti/bin/activate
pip install -r requirements.txt

Futhermore install the following dependencies:

Data Preparation

Please follow the instructions from SemanticKITTI to download the dataset including the KITTI Odometry point cloud data. Download our scribble annotations and unzip in the same directory. Each sequence in the train-set (00-07, 09-10) should contain the velodyne, labels and scribbles directories.

Move the sequences folder into a new directoy called data/. Alternatively, edit the dataset: root_dir field of each config file to point to the sequences folder.

Training

The training of our method requires three steps as illustrated in the above figure: (1) training, where we utilize the PLS descriptors and the mean teacher framework to generate high quality pseudo-labels; (2) pseudo-labeling, where we fix the trained teacher models predictions in a class-range-balanced manner; (3) distillation, where we train on the generated psuedo-labels.

Step 1 can be trained as follows. The checkpoint for the trained first stage model can be downloaded here. (The resulting model will show slight improvements over the model presented in the paper with 86.38% mIoU on the fully-labeled train-set.)

python train.py --config_path config/training.yaml --dataset_config_path config/semantickitti.yaml

For Step 2, we first need to first save the intermediate results of our trained teacher model.
Warning: This step will initially create a save file training_results.h5 (27GB). This file can be deleted after generating the psuedo-labels.

python save.py --config_path config/training.yaml --dataset_config_path config/semantickitti.yaml --checkpoint_path STEP1/CKPT/PATH --save_dir SAVE/DIR

Next, we find the optimum threshold for each class-annuli pairing and generate pseudo-labels in a class-range balanced manner. The psuedo-labels will be saved in the same root directory as the scribble lables but under a new folder called crb. The generated pseudo-labels from our model can be downloaded here.

python crb.py --config_path config/crb.yaml --dataset_config_path config/semantickitti.yaml --save_dir SAVE/DIR

Step 3 can be trained as follows. The resulting model state_dict can be downloaded here (61.25% mIoU).

python train.py --config_path config/distillation.yaml --dataset_config_path config/semantickitti.yaml

Evaluation

The final model as well as the provided checkpoints for the distillation steps can be evaluated on the SemanticKITTI validation set as follows. Evaluating the model is not neccessary when doing in-house training as the evaluation takes place within the training script after every epoch. The best teacher mIoU is given by the val_best_miou metric in W&B.

python evaluate.py --config_path config/distillation.yaml --dataset_config_path config/semantickitti.yaml --ckpt_path STEP2/CKPT/PATH

Quick Access for Download Links:


Citation

If you use our dataset or our work in your research, please cite:

@InProceedings{Unal_2022_CVPR,
    author    = {Unal, Ozan and Dai, Dengxin and Van Gool, Luc},
    title     = {Scribble-Supervised LiDAR Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022},
}

Acknowledgements

We would like to additionally thank the authors the open source codebase Cylinder3D.

Woosung Choi 63 Nov 14, 2022
Human head pose estimation using Keras over TensorFlow.

RealHePoNet: a robust single-stage ConvNet for head pose estimation in the wild.

Rafael Berral Soler 71 Jan 05, 2023
Official implementation of Influence-balanced Loss for Imbalanced Visual Classification in PyTorch.

Official implementation of Influence-balanced Loss for Imbalanced Visual Classification in PyTorch.

Seulki Park 70 Jan 03, 2023
Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer Description Convert offline handwritten mathematical expressi

Wenqi Zhao 87 Dec 27, 2022
This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

AdapterHub 18 Dec 09, 2022
Convert BART models to ONNX with quantization. 3X reduction in size, and upto 3X boost in inference speed

fast-Bart Reduction of BART model size by 3X, and boost in inference speed up to 3X BART implementation of the fastT5 library (https://github.com/Ki6a

Siddharth Sharma 19 Dec 09, 2022
Bayesian optimization in PyTorch

BoTorch is a library for Bayesian Optimization built on PyTorch. BoTorch is currently in beta and under active development! Why BoTorch ? BoTorch Prov

2.5k Dec 31, 2022
Event queue (Equeue) dialect is an MLIR Dialect that models concurrent devices in terms of control and structure.

Event Queue Dialect Event queue (Equeue) dialect is an MLIR Dialect that models concurrent devices in terms of control and structure. Motivation The m

Cornell Capra 23 Dec 08, 2022
Joint parameterization and fitting of stroke clusters

StrokeStrip: Joint Parameterization and Fitting of Stroke Clusters Dave Pagurek van Mossel1, Chenxi Liu1, Nicholas Vining1,2, Mikhail Bessmeltsev3, Al

Dave Pagurek 44 Dec 01, 2022
[CVPR2021] The source code for our paper 《Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning》.

TBE The source code for our paper "Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Le

Jinpeng Wang 150 Dec 28, 2022
Mall-Customers-Segmentation - Customer Segmentation Using K-Means Clustering

Overview Customer Segmentation is one the most important applications of unsupervised learning. Using clustering techniques, companies can identify th

NelakurthiSudheer 2 Jan 03, 2022
Simple, efficient and flexible vision toolbox for mxnet framework.

MXbox: Simple, efficient and flexible vision toolbox for mxnet framework. MXbox is a toolbox aiming to provide a general and simple interface for visi

Ligeng Zhu 31 Oct 19, 2019
The repository offers the official implementation of our BMVC 2021 paper in PyTorch.

CrossMLP Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation Bin Ren1, Hao Tang2, Nicu Sebe1. 1University of Trento, Italy, 2ETH, Switzerla

Bingoren 16 Jul 27, 2022
OSLO: Open Source framework for Large-scale transformer Optimization

O S L O Open Source framework for Large-scale transformer Optimization What's New: December 21, 2021 Released OSLO 1.0. What is OSLO about? OSLO is a

TUNiB 280 Nov 24, 2022
CaLiGraph Ontology as a Challenge for Semantic Reasoners ([email protected]'21)

CaLiGraph for Semantic Reasoning Evaluation Challenge This repository contains code and data to use CaLiGraph as a benchmark dataset in the Semantic R

Nico Heist 0 Jun 08, 2022
NEG loss implemented in pytorch

Pytorch Negative Sampling Loss Negative Sampling Loss implemented in PyTorch. Usage neg_loss = NEG_loss(num_classes, embedding_size) optimizer =

Daniil Gavrilov 123 Sep 13, 2022
PyTorch implementation of Barlow Twins.

Barlow Twins: Self-Supervised Learning via Redundancy Reduction PyTorch implementation of Barlow Twins. @article{zbontar2021barlow, title={Barlow Tw

Facebook Research 839 Dec 29, 2022
GANSketchingJittor - Implementation of Sketch Your Own GAN in Jittor

GANSketching in Jittor Implementation of (Sketch Your Own GAN) in Jittor(计图). Or

Bernard Tan 10 Jul 02, 2022
A Jupyter notebook to play with NVIDIA's StyleGAN3 and OpenAI's CLIP for a text-based guided image generation.

A Jupyter notebook to play with NVIDIA's StyleGAN3 and OpenAI's CLIP for a text-based guided image generation.

Eugenio Herrera 175 Dec 29, 2022
OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021) Video demo We here provide a video demo from co

20 Nov 25, 2022