Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Last update: Dec 21, 2022

Overview

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

This repo is the official implementation of "DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion"

by Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, and Xi Li.

Prerequisites

Ubuntu 18
PyTorch 1.7.0
CUDA 10.1
cuDNN 7.5.1
Python 3.7
NumPy 1.17.3
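
As a quick sanity check before training, the Python-visible part of the list above can be compared against the local environment. The following is a minimal sketch, not part of the repository: the helper names `version_mismatches` and `installed_versions` are hypothetical, and only Python, PyTorch, and NumPy are checked (CUDA and cuDNN versions are left out).

```python
import platform

# Versions copied from the prerequisites list above.
REQUIRED = {
    "python": "3.7",
    "torch": "1.7.0",
    "numpy": "1.17.3",
}


def version_mismatches(installed, required=REQUIRED):
    """Return {name: (installed, required)} for every entry whose installed
    version is missing or does not match the required version prefix."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        # Drop local build tags such as "+cu101" before comparing.
        base = found.split("+")[0] if found else None
        matches = base is not None and (
            base == wanted or base.startswith(wanted + ".")
        )
        if not matches:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions():
    """Collect the versions visible to the current interpreter."""
    versions = {"python": platform.python_version()}
    for name in ("torch", "numpy"):
        try:
            versions[name] = __import__(name).__version__
        except ImportError:
            versions[name] = None  # package is not installed
    return versions


if __name__ == "__main__":
    for name, (found, wanted) in version_mismatches(installed_versions()).items():
        print(f"{name}: found {found}, README lists {wanted}")
```

A mismatch does not necessarily mean the code will fail; the list records the environment the authors report, and nearby versions may also work.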

Training

Please see launch_pretrain.sh and launch_train.sh for ImageNet pretraining and SOD training, respectively.

Testing

Please see launch_test.sh for testing on the SOD benchmarks.

Main Results

Dataset	E_r	S_λ^mean	F_β^mean	M
DUT-RGBD	0.950	0.921	0.926	0.030
NJUD	0.923	0.903	0.901	0.039
NLPR	0.950	0.918	0.897	0.024
SSD	0.904	0.876	0.852	0.045
STEREO	0.933	0.904	0.898	0.036
LFSD	0.923	0.882	0.882	0.054
RGBD135	0.962	0.920	0.896	0.021
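
To make two of the table's columns concrete, here is a minimal sketch of how they are conventionally computed in the salient object detection literature. It assumes that M denotes mean absolute error and that F_β uses β² = 0.3; both are common conventions rather than details stated in this README. The function names are illustrative, the inputs are flat sequences of pixel values in [0, 1], and a single fixed threshold stands in for the averaging that a "mean" F-measure typically involves. E-measure and S-measure are not covered.

```python
def mean_absolute_error(pred, gt):
    """Mean absolute error between a saliency map and its ground truth.

    Both arguments are flat sequences of pixel values in [0, 1].
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """F-measure of the saliency map binarized at `threshold`.

    `threshold` and `beta_sq` are assumed defaults, not values from the paper.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    binary = [p >= threshold for p in pred]
    truth = [g >= 0.5 for g in gt]
    true_positives = sum(1 for b, t in zip(binary, truth) if b and t)
    if true_positives == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = true_positives / sum(binary)
    recall = true_positives / sum(truth)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

In practice these would be applied to image arrays and averaged over every image in a dataset; an established evaluation toolbox should be preferred when reproducing the reported numbers.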

Saliency maps and Evaluation

All of the saliency maps mentioned in the paper are available on Google Drive or BaiduYun (code: juc2).

You can use the toolbox provided by jiwei0921 for evaluation.
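
Before running any evaluation, it can help to confirm that each downloaded saliency map has a matching ground-truth mask. The sketch below pairs files by name; it is not the toolbox's API, and the function name, the directory arguments, and the `.png` suffix are assumptions about how the files are laid out locally.

```python
from pathlib import Path


def pair_maps_with_ground_truth(pred_dir, gt_dir, suffix=".png"):
    """Match predicted saliency maps to ground-truth masks by file stem.

    Returns (pairs, missing): a list of (pred_path, gt_path) tuples sorted by
    ground-truth name, and a list of ground-truth stems without a prediction.
    """
    preds = {p.stem: p for p in Path(pred_dir).glob(f"*{suffix}")}
    pairs, missing = [], []
    for gt_path in sorted(Path(gt_dir).glob(f"*{suffix}")):
        pred_path = preds.get(gt_path.stem)
        if pred_path is None:
            missing.append(gt_path.stem)
        else:
            pairs.append((pred_path, gt_path))
    return pairs, missing
```

A non-empty `missing` list usually points to an incomplete download or a naming mismatch between the two folders.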

Additionally, we provide the saliency maps for the STERE-1000 and SIP datasets on BaiduYun (code: qxfw) for easy comparison.

Dataset	E_r	S_λ^mean	F_β^mean	M
STERE-1000	0.928	0.897	0.895	0.038
SIP	0.908	0.861	0.868	0.057

Citation

@inproceedings{Sun2021DeepRS,
  title={Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion},
  author={Peng Sun and Wenhu Zhang and Huanyu Wang and Songyuan Li and Xi Li},
  booktitle={IEEE Conf. Comput. Vis. Pattern Recog.},
  year={2021}
}

License

The code is released under the MIT License (see the LICENSE file for details).
