The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

Last update: Nov 17, 2022

Related tags

Deep Learning GCoNet

Overview

GCoNet

The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

Trained model

Download final_gconet.pth (Google Drive). And it is the training log.

Put final_gconet.pth at GCoNet/tmp/GCoNet_run1.

Run test.sh for evaluation.

Data Format

Put the DUTS_class (training dataset from GICD), CoCA, CoSOD3k and Cosal2015 datasets to GCoNet/data as the following structure:

GCoNet
   ├── other codes
   ├── ...
   │ 
   └── data
         ├──── images
         |       ├── DUTS_class (DUTS_class's image files)
         |       ├── CoCA (CoCA's image files)
         |       ├── CoSOD3k (CoSOD3k's image files)
         │       └── Cosal2015 (Cosal2015's image files)
         │ 
         └────── gts
                  ├── DUTS_class (DUTS_class's Groundtruth files)
                  ├── CoCA (CoCA's Groundtruth files)
                  ├── CoSOD3k (CoSOD3k's Groundtruth files)
                  └── Cosal2015 (Cosal2015's Groundtruth files)

Usage

Run sh all.sh for training (train_GPU0.sh) and testing (test.sh).

Prediction results

The co-saliency maps of GCoNet can be found at Google Drive.

Note and Discussion

In your training, you can usually obtain slightly worse performance on CoCA dataset and slightly better perofmance on Cosal2015 and CoSOD3k datasets. The performance fluctuation is around 1.0 point for Cosal2015 and CoSOD3k datasets and around 2.0 points for CoCA dataset.

We observed that the results on CoCA dataset are unstable when train the model multiple times, and the performance fluctuation can reach around 1.5 ponits (But our performance are still much better than other methods in the worst case).
Therefore, we provide our used training pairs and sequences with deterministic data augmentation to help you to reproduce our results on CoCA. (In different machines, these inputs and data augmentation are different but deterministic.) However, there is still randomness in the training stage, and you can obtain different performance on CoCA.

There are three possible reasons:

It may be caused by the challenging images of CoCA dataset where the target objects are relative small and there are many non-target objects in a complex environment.
The imperfect training dataset. We use the training dataset in GICD, whose labels are produced by the classification model. There are some noisy labels in the training dataset.
The randomness of training groups. In our training, two groups are randomly picked for training. Different collaborative training groups have different training difficulty.

Possible research directions for performance stability:

Reduce label noise. If you want to use the training dataset in GICD to train your model. It is better to use multiple powerful classification models (ensemble) to obtain better class labels.
Deterministic training groups. For two collaborative image groups, you can explore different ways to pick the suitable groups, e.g., pick two most similar groups for hard example mining.

It is a potential research direction to obtain stable results on such challenging real-world images. We follow other CoSOD methods to report the best performance of our model. You need to train the model multiple times to obtain the best result on CoCA dataset. If you want more discussion about it, you can contact me ([email protected]).

Citation

@inproceedings{fan2021gconet,
title={Group Collaborative Learning for Co-Salient Object Detection},
author={Fan, Qi and Fan, Deng-Ping and Fu, Huazhu and Tang, Chi-Keung and Shao, Ling and Tai, Yu-Wing},
booktitle={CVPR},
year={2021}
}

Acknowledgements

Zhao Zhang gives us lots of helps! Our framework is built on his GICD.

The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

Related tags

Overview

GCoNet

Trained model

Data Format

Usage

Prediction results

Note and Discussion

Citation

Acknowledgements

Owner

Qi Fan

Source code of the paper Meta-learning with an Adaptive Task Scheduler.

Chainer implementation of recent GAN variants

OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation

A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"

Simple streamlit app to demonstrate HERE Tour Planning

Examples of using f2py to get high-speed Fortran integrated with Python easily

The modify PyTorch version of Siam-trackers which are speed-up by TensorRT.

The Balloon Learning Environment - flying stratospheric balloons with deep reinforcement learning.

A Moonraker plug-in for real-time compensation of frame thermal expansion

[WWW 2021] Source code for "Graph Contrastive Learning with Adaptive Augmentation"

IEEE Winter Conference on Applications of Computer Vision 2022 Accepted

Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans

This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (EMNLP 2020)

Replication Package for "An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Datasets"

Data Engineering ZoomCamp

PyTorch implementation of Higher Order Recurrent Space-Time Transformer

Video Instance Segmentation using Inter-Frame Communication Transformers (NeurIPS 2021)

CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data