This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.

Last update: Aug 19, 2022

Overview

Code-and-Dataset-for-CapSal

This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019. Paper link

Our code is built on the Mask R-CNN implementation in TensorFlow and Keras. Before running it, install maskrcnn by following the instruction or INSTALL.md.

COCO-CapSal Dataset

The COCO-CapSal dataset provides a saliency ground truth map and image captions for each image. It contains 5265 images for training and 1459 for validation. The annotations can be downloaded from BaiduYun or GoogleDrive. The folder 'capsal' contains the images, the ground truth maps, and the captions (json file) for both the training and validation sets.
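After downloading, it can help to confirm that each split has the expected number of images. The following is a minimal sketch, not part of the released code. The split sizes come from the description above, but the helper names (`count_images`, `check_split`, `load_captions`), the image extensions, and any directory or file names you pass in are assumptions, since the internal layout of the 'capsal' folder and the schema of the json file are not documented here.

```python
import json
from pathlib import Path

# Split sizes as stated in this README.
EXPECTED_COUNTS = {"train": 5265, "val": 1459}


def load_captions(json_path):
    """Load a captions json file and return the parsed object unchanged.

    The schema is not documented in this README, so no structure is assumed.
    """
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_images(image_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Count image files directly inside image_dir (non-recursive).

    The extension list is an assumption; adjust it to the actual files.
    """
    return sum(
        1
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def check_split(image_dir, split):
    """Return (found, expected, ok) for one split ('train' or 'val')."""
    found = count_images(image_dir)
    expected = EXPECTED_COUNTS[split]
    return found, expected, found == expected
```

For example, `check_split("capsal/train", "train")` would compare the number of images found against 5265, where `capsal/train` is a hypothetical path that should be replaced with the real location of the training images.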

Evaluation

To test the CapSal model, first download the trained model from BaiduYun or Google and put it under ./model. Then run test_capsal.py to generate the saliency maps for the different datasets. The saliency maps are also available at Google or BaiduYun.
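A quick way to catch a misplaced download before launching the test script is to list what is in ./model. This is a small sketch under assumptions: `list_model_files` is a hypothetical helper, and the name of the trained model file is not given in this README, so the function only reports what it finds.

```python
from pathlib import Path


def list_model_files(model_dir="./model"):
    """Return the sorted names of files directly under model_dir.

    Returns an empty list if the directory does not exist.
    """
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())
```

If `list_model_files()` returns an empty list, the trained model has not been placed under ./model yet and test_capsal.py will likely fail to load it.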

Train

Run 'train.py'.

Citation

    @InProceedings{Zhang_2019_CVPR,
            author = {Zhang, Lu and Zhang, Jianming and Lin, Zhe and Lu, Huchuan and He, You},
            title = {CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection},
            booktitle = {CVPR},
            year = {2019}}

Owner

lu zhang
