Perception-Aware Multi-Sensor Fusion for 3D LiDAR Semantic Segmentation (ICCV 2021)


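Overview

This work explores an efficient multi-sensor (LiDAR and camera) fusion method for point cloud semantic segmentation. Existing fusion methods mostly project the point cloud onto the image, look up the corresponding pixel positions, and then project the image information at those positions back into point cloud space for feature fusion. This approach makes poor use of the rich visual perceptual features of the image, such as shape and texture. We therefore explore fusing features in the RGB image space and propose a perception-aware multi-sensor fusion method (PMF). See our published paper for details.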
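
The point-to-pixel projection mentioned above can be illustrated with a minimal pinhole-camera sketch. It is not taken from the PMF code: the function name `project_points_to_image`, its arguments, and the calibration matrices are assumptions made for illustration, and real datasets supply their own calibration formats.

```python
import numpy as np


def project_points_to_image(points_xyz, lidar_to_cam, cam_intrinsics, image_hw):
    """Project LiDAR points into an image with a pinhole camera model.

    Hypothetical helper for illustration only (not the PMF implementation).
    points_xyz:     (N, 3) points in the LiDAR frame.
    lidar_to_cam:   (4, 4) rigid transform from the LiDAR to the camera frame.
    cam_intrinsics: (3, 3) intrinsic matrix K with last row [0, 0, 1].
    image_hw:       (height, width) of the image in pixels.
    Returns (uv, depth, mask): pixel coordinates (M, 2), depths (M,),
    and a boolean mask (N,) marking which input points were kept.
    """
    points_xyz = np.asarray(points_xyz, dtype=np.float64)
    n = points_xyz.shape[0]

    # Move the points into the camera frame using homogeneous coordinates.
    homogeneous = np.hstack([points_xyz, np.ones((n, 1))])
    cam = (lidar_to_cam @ homogeneous.T).T[:, :3]

    # Only points in front of the camera can appear in the image.
    depth = cam[:, 2]
    in_front = depth > 1e-6
    safe_depth = np.where(in_front, depth, 1.0)  # avoid division by zero

    # Perspective division gives pixel coordinates (u, v).
    uvw = (cam_intrinsics @ cam.T).T
    uv = uvw[:, :2] / safe_depth[:, None]

    # Discard points that fall outside the image bounds.
    height, width = image_hw
    inside = (
        (uv[:, 0] >= 0) & (uv[:, 0] < width)
        & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    )
    mask = in_front & inside
    return uv[mask], depth[mask], mask
```

The returned mask lets a caller keep per-point features or labels aligned with the surviving pixels.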


Main results


Leaderboard of [email protected]

[Figure: leaderboard screenshot]

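More results

We continue to explore the potential of the PMF framework, including larger models, better ImageNet pre-trained models, and other datasets. Our experiments show that PMF is easy to extend and that its performance improves with a stronger backbone network. Details are given in the accompanying file.

| Method | Dataset | mIoU (%) |
| --- | --- | --- |
| PMF-ResNet34 | SemanticKITTI validation set | 63.9 |
| PMF-ResNet34 | nuScenes validation set | 76.9 |
| PMF-ResNet50 | nuScenes validation set | 79.4 |
| PMF48-ResNet101 | SensatUrban test set (ICCV2021 competition) | 66.2 (rank 5) |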
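
For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how it is commonly computed from a confusion matrix. The function name `mean_iou` and the `ignore_index` handling are assumptions for illustration. Each benchmark defines its own class list and ignore rules, so this will not necessarily reproduce the official numbers.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean intersection-over-union in percent (illustrative sketch).

    pred, target: integer class ids per point; after filtering, both are
    assumed to lie in [0, num_classes).
    ignore_index: optional target value whose points are excluded.
    Classes absent from both pred and target are left out of the mean.
    """
    pred = np.asarray(pred, dtype=np.int64).ravel()
    target = np.asarray(target, dtype=np.int64).ravel()
    if ignore_index is not None:
        keep = target != ignore_index
        pred, target = pred[keep], target[keep]

    # confusion[t, p] counts points of true class t predicted as class p.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    true_positive = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - true_positive
    valid = union > 0
    if not valid.any():
        return 0.0
    return float((true_positive[valid] / union[valid]).mean()) * 100.0
```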

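Usage

Note: the code contains various path settings, including dataset paths. Change them to match your own environment.

Code structure

|--- pc_processor/ Python package for point cloud processing
	|--- checkpoint/ creates the experiment output directory
	|--- dataset/ dataset processing
	|--- layers/ common network layers
	|--- loss/ loss functions
	|--- metrices/ model performance metrics
	|--- models/ network models
	|--- postproc/ post-processing, mainly KNN
	|--- utils/ other utilities
|--- tasks/ experiment tasks
	|--- pmf/ PMF training source code
	|--- pmf_eval_nuscenes/ evaluation code for PMF on nuScenes
		|--- testset_eval/ merges PMF and SalsaNext results and evaluates them on the nuScenes test set
		|--- xxx.py evaluation code for PMF on nuScenes
	|--- pmf_eval_semantickitti/ evaluation code for PMF on the SemanticKITTI validation set
	|--- salsanext/ SalsaNext training code, modified from the official public code
	|--- salsanext_eval_nuscenes/ evaluation code for SalsaNext on nuScenes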
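
The tree above notes that post-processing is mainly KNN. As a rough idea of what KNN-based label refinement does, here is a simplified brute-force majority-vote sketch. It is an assumption-laden illustration, not the code in `postproc/`: the function name `knn_label_vote` is hypothetical, and practical implementations restrict the neighbor search (for example to a window in the projected image) instead of computing all pairwise distances.

```python
import numpy as np


def knn_label_vote(points, labels, num_classes, k=5):
    """Relabel each point by majority vote among its k nearest neighbors.

    Illustrative brute-force version with O(N^2) memory, so it is only
    suitable for small point sets. Each point counts as its own neighbor.
    points: (N, 3) coordinates; labels: (N,) integer class ids.
    """
    points = np.asarray(points, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)
    n = len(points)
    k = min(k, n)

    # Pairwise squared Euclidean distances, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)

    # Indices of the k closest points per row (order within the k is arbitrary).
    nn_idx = np.argpartition(dist2, k - 1, axis=1)[:, :k]
    nn_labels = labels[nn_idx]

    # Accumulate one vote per neighbor label, then pick the most voted class.
    votes = np.zeros((n, num_classes), dtype=np.int64)
    np.add.at(votes, (np.arange(n)[:, None], nn_labels), 1)
    return votes.argmax(axis=1)
```

Voting smooths isolated mislabeled points, which often appear near object boundaries after projecting image-space predictions back to 3D.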

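Training

Directory structure of the training task

|--- pmf/
	|--- config_server_kitti.yaml configuration file for training on SemanticKITTI
	|--- config_server_nus.yaml configuration file for training on nuScenes
	|--- main.py entry point
	|--- trainer.py training code
	|--- option.py configuration parsing code
	|--- run.sh launch script; grant execute permission with chmod +x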

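Steps

  1. Go to the tasks/pmf directory and set the dataset path data_root in config_server_kitti.yaml to the actual location of your dataset. Adjust other parameters such as gpu and batch_size if needed.
  2. Edit run.sh so that the value of nproc_per_node matches the number of GPUs configured in the yaml file.
  3. Run the training script with the following command:
./run.sh
# or: bash run.sh
  4. On success, experiment logs are generated automatically under PMF/experiments/PMF-SemanticKitti, with the following directory structure:
|--- log_dataset_network_xxxx/
	|--- checkpoint/ training checkpoints and the best model parameters
	|--- code/ backup of the code
	|--- log/ console output log and a copy of the configuration file
	|--- events.out.tfevents.xxx TensorBoard file

The console output is captured in the screenshot below; the time printed at the end is the estimated time for the experiment.

[Screenshot: console output during training]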
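
The consistency requirement in step 2 can be checked mechanically before launching. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that run.sh passes a `--nproc_per_node` argument and that the yaml file lists GPU ids on a single `gpu:` line. Neither format is confirmed by this document, so adapt the patterns to the actual files.

```python
import re


def nproc_from_run_script(script_text):
    """Return the integer after --nproc_per_node, or None if it is absent."""
    match = re.search(r"--nproc_per_node[=\s]+(\d+)", script_text)
    return int(match.group(1)) if match else None


def gpu_count_from_yaml(yaml_text, key="gpu"):
    """Count GPU ids on a line such as `gpu: "0,1"` or `gpu: [0, 1]`.

    The key name and value layout are assumptions made for this sketch.
    """
    pattern = rf"^\s*{re.escape(key)}\s*:\s*(.+)$"
    match = re.search(pattern, yaml_text, flags=re.MULTILINE)
    if not match:
        return None
    value = match.group(1).split("#", 1)[0]  # drop any trailing comment
    return len(re.findall(r"\d+", value))


def settings_match(script_text, yaml_text):
    """True when both values were found and agree."""
    nproc = nproc_from_run_script(script_text)
    gpus = gpu_count_from_yaml(yaml_text)
    return nproc is not None and nproc == gpus
```

A typical use is to read both files with `open(...).read()` and stop with a clear message when `settings_match` returns False.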

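Inference

Directory structure of the inference code

|--- pmf_eval_semantickitti/ SemanticKITTI evaluation code
	|--- config_server_kitti.yaml configuration file
	|--- infer.py inference script
	|--- option.py configuration parsing script

Steps

  1. Go to the tasks/pmf_eval_semantickitti directory and set the dataset path data_root in config_server_kitti.yaml to the actual location of your dataset. Set pretrained_path to the log folder generated by training.
  2. Run the script with the following command:
python infer.py config_server_kitti.yaml
  3. On success, evaluation logs are generated in the directory of the trained model, with the following folder structure:
|--- PMF/experiments/PMF-SemanticKitti/log_xxxx/ training output path
	|--- Eval_xxxxx/ evaluation output path
		|--- code/ backup of the code
		|--- log/ console log files
		|--- pred/ files for submission to the evaluation server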
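
To inspect prediction files such as those under pred/, it helps to know the SemanticKITTI label layout: one 32-bit unsigned integer per point, with the semantic class in the lower 16 bits and the instance id in the upper 16 bits. The sketch below reads and writes that layout. Whether the files written by infer.py follow it exactly is an assumption, and the helper names are hypothetical.

```python
import numpy as np


def write_label_file(path, semantic_labels, instance_ids=None):
    """Write per-point labels as packed uint32 values (SemanticKITTI layout)."""
    semantic = np.asarray(semantic_labels, dtype=np.uint32)
    if instance_ids is None:
        instance = np.zeros_like(semantic)
    else:
        instance = np.asarray(instance_ids, dtype=np.uint32)
    # Upper 16 bits: instance id. Lower 16 bits: semantic class.
    packed = (instance << 16) | (semantic & 0xFFFF)
    packed.astype(np.uint32).tofile(path)


def read_label_file(path):
    """Return (semantic_labels, instance_ids) from a packed uint32 file."""
    packed = np.fromfile(path, dtype=np.uint32)
    return packed & 0xFFFF, packed >> 16
```

Reading a file back and comparing its length with the number of points in the matching scan is a quick sanity check before submitting.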

Citation

@InProceedings{Zhuang_2021_ICCV,
    author    = {Zhuang, Zhuangwei and Li, Rong and Jia, Kui and Wang, Qicheng and Li, Yuanqing and Tan, Mingkui},
    title     = {Perception-Aware Multi-Sensor Fusion for 3D LiDAR Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {16280-16290}
}