Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022

Overview

PGNet

Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022,
CVPR 2022 (arXiv 2204.05041)

Abstract

Recent salient object detection (SOD) methods based on deep neural network have achieved remarkable performance. However, most of existing SOD models designed for low-resolution input perform poorly on high-resolution images due to the contradiction between the sampling depth and the receptive field size. Aiming at resolving this contradiction, we propose a novel one-stage framework called Pyramid Grafting Network (PGNet), using transformer and CNN backbone to extract features from different resolution images independently and then graft the features from transformer branch to CNN branch. An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable CNN branch to combine broken detailed information more holistically, guided by different source feature during decoding process. Moreover, we design an Attention Guided Loss (AGL) to explicitly supervise the attention matrix generated by CMGM to help the network better interact with the attention from different models. We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions. To our knowledge, it is the largest dataset in both quantity and resolution for high-resolution SOD task, which can be used for training and testing in future research. Sufficient experiments on UHRSD and widely-used SOD datasets demonstrate that our method achieves superior performance compared to the state-of-the-art methods.

Ultra High-Resolution Saliency Detection Dataset

Visual display for sample in UHRSD dataset. Best viewd by clikcing and zooming in.

To relief the lack of high-resolution datasets for SOD, we contribute the Ultra High-Resolution for Saliency Detection (UHRSD) dataset with a total of 5,920 images in 4K(3840 × 2160) or higher resolution, including 4,932 images for training and 988 images for testing. A total of 5,920 images were manually selected from websites (e.g. Flickr Pixabay) with free copyright. Our dataset is diverse in terms of image scenes, with a balance of complex and simple salient objects of various size.

To our knowledge, it is the largest dataset in both quantity and resolution for high-resolution SOD task, which can be used for training and testing in future research.

  • Our UHRSD (Ultra High-Resolution Saliency Detection) Dataset:

We provide the original 4K version and the convenient 2K version of our UHRSD (Ultra High-Resolution Saliency Detection) Dataset for download: Google Drive

Usage

Requirements

  • Python 3.8
  • Pytorch 1.7.1
  • OpenCV
  • Numpy
  • Apex
  • Timm

Directory

The directory should be like this:

-- src 
-- model (saved model)
-- pre (pretrained model)
-- result (saliency maps)
-- data (train dataset and test dataset)
   |-- DUTS-TR+HR
   |   |-- image
   |   |-- mask
   |-- UHRSOD+HRSOD
   |   |--image
   |   |--mask
   ...
   

Train

cd src
./train.sh
  • We implement our method by PyTorch and conduct experiments on 2 NVIDIA 2080Ti GPUs.
  • We adopt pre-trained ResNet-18 and Swin-B-224 as backbone networks, which are saved in PRE folder.
  • We train our method on 3 settings : DUTS-TR, DUTS-TR+HRSOD and UHRSD_TR+HRSOD_TR.
  • After training, the trained models will be saved in MODEL folder.

Test

The trained model can be download here: Google Drive

cd src
python test.py
  • After testing, saliency maps will be saved in RESULT folder

Saliency Map

Trained on DUTS-TR:Google Drive

Trained on DUT+HRSOD:Google Drive

Trained on UHRSD+HRSOD:Google Drive

Citation

@inproceedings{xie2022pyramid,
    author    = {Xie, Chenxi and Xia, Changqun and Ma, Mingcan and Zhao, Zhirui and Chen, Xiaowu and Li, Jia},
    title     = {Pyramid Grafting Network for One-Stage High Resolution Saliency Detection},
    booktitle = {CVPR},
    year      = {2022}
}
Owner
CVTEAM
CVTEAM
Data and codes for ACL 2021 paper: Towards Emotional Support Dialog Systems

Emotional-Support-Conversation Copyright © 2021 CoAI Group, Tsinghua University. All rights reserved. Data and codes are for academic research use onl

126 Dec 21, 2022
Structural Constraints on Information Content in Human Brain States

Structural Constraints on Information Content in Human Brain States Code accompanying the paper "The information content of brain states is explained

Leon Weninger 3 Sep 07, 2022
DUE: End-to-End Document Understanding Benchmark

This is the repository that provide tools to download data, reproduce the baseline results and evaluation. What can you achieve with this guide Based

21 Dec 29, 2022
LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations

LIMEcraft LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations The LIMEcraft algorithm is an explanatory method based on

MI^2 DataLab 4 Aug 01, 2022
Noether Networks: meta-learning useful conserved quantities

Noether Networks: meta-learning useful conserved quantities This repository contains the code necessary to reproduce experiments from "Noether Network

Dylan Doblar 33 Nov 23, 2022
Replication Package for "An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Datasets"

Replication Package for "An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Data

2 Oct 06, 2022
Image data augmentation scheduler for albumentations transforms

albu_scheduler Scheduler for albumentations transforms based on PyTorch schedulers interface Usage TransformMultiStepScheduler import albumentations a

19 Aug 04, 2021
Code for reproducible experiments presented in KSD Aggregated Goodness-of-fit Test.

Code for KSDAgg: a KSD aggregated goodness-of-fit test This GitHub repository contains the code for the reproducible experiments presented in our pape

Antonin Schrab 5 Dec 15, 2022
Public implementation of "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression" from CoRL'21

Self-Supervised Reward Regression (SSRR) Codebase for CoRL 2021 paper "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression "

19 Dec 12, 2022
Collaborative forensic timeline analysis

Timesketch Table of Contents About Timesketch Getting started Community Contributing About Timesketch Timesketch is an open-source tool for collaborat

Google 2.1k Dec 28, 2022
Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!

Rubicon Purpose Rubicon is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a r

Capital One 97 Jan 03, 2023
Head2Toe: Utilizing Intermediate Representations for Better OOD Generalization

Head2Toe: Utilizing Intermediate Representations for Better OOD Generalization Code for reproducing our results in the Head2Toe paper. Paper: arxiv.or

Google Research 62 Dec 12, 2022
This a classic fintech problem that introduces real life difficulties such as data imbalance. Check out the notebook to find out more!

Credit Card Fraud Detection Introduction Online transactions have become a crucial part of any business over the years. Many of those transactions use

Jonathan Hasbani 0 Jan 20, 2022
Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

DCSR: Dual Camera Super-Resolution Implementation for our ICCV 2021 oral paper: Dual-Camera Super-Resolution with Aligned Attention Modules paper | pr

Tengfei Wang 110 Dec 20, 2022
Implementation of the state-of-the-art vision transformers with tensorflow

ViT Tensorflow This repository contains the tensorflow implementation of the state-of-the-art vision transformers (a category of computer vision model

Mohammadmahdi NouriBorji 2 Mar 16, 2022
[ICCV2021] Official Pytorch implementation for SDGZSL (Semantics Disentangling for Generalized Zero-Shot Learning)

Semantics Disentangling for Generalized Zero-shot Learning This is the official implementation for paper Zhi Chen, Yadan Luo, Ruihong Qiu, Zi Huang, J

25 Dec 06, 2022
A library that can print Python objects in human readable format

objprint A library that can print Python objects in human readable format Install pip install objprint Usage op Use op() (or objprint()) to print obj

319 Dec 25, 2022
Deep Learning as a Cloud API Service.

Deep API Deep Learning as Cloud APIs. This project provides pre-trained deep learning models as a cloud API service. A web interface is available as w

Wu Han 4 Jan 06, 2023
The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store development.

The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store dev

George Rocha 0 Feb 03, 2022
💊 A 3D Generative Model for Structure-Based Drug Design (NeurIPS 2021)

A 3D Generative Model for Structure-Based Drug Design Coming soon... Citation @inproceedings{luo2021sbdd, title={A 3D Generative Model for Structu

Shitong Luo 118 Jan 05, 2023