Official implementation of NeurIPS 2021 paper "Contextual Similarity Aggregation with Self-attention for Visual Re-ranking"

Related tags

Deep LearningCSA
Overview

CSA: Contextual Similarity Aggregation with Self-attention for Visual Re-ranking

PyTorch training code for CSA (Contextual Similarity Aggregation). We propose a visual re-ranking method by contextual similarity aggregation with transformer, obtaining 80.3 mAP on ROxf with Medium evaluation protocols. Inference in 50 lines of PyTorch.

CSA

What it is. Unlike traditional visual reranking techniques, CSA uses the similarity between the image and the anchor image as a representation of the image, which is defined as affinity feature. It consists of a contrastive loss that forces the relevant images to have larger cosine similarity and vice versa, an MSE loss that preserves the information of the original affinity features, and a Transformer encoder architecture. Given ranking list returned by the first-round retrieval, CSA first choose the top-L images in ranking list as the anchor images and calculates the affinity features of the top-K candidates,then dynamically refine the affinity features of different candiates in parallel. Due to this parallel nature, CSA is very fast and efficient.

About the code. CSA is very simple to implement and experiment with, and we provide a Notebook showing how to do inference with CSA in only a few lines of PyTorch code. Training code follows this idea - it is not a library, but simply a train.py importing model and criterion definitions with standard training loops.

mAP performance of the proposed model

We provide results of baseline CSA and CSA trained with data augmentation. mAP is computed with Medium and Hard evaluation protocols. model will come soon. CSA

Requirements

  • Python 3
  • PyTorch tested on 1.7.1+, torchvision 0.8.2+
  • numpy
  • matplotlib

Usage - Visual Re-ranking

There are no extra compiled components in CSA and package dependencies are minimal, so the code is very simple to use. We provide instructions how to install dependencies via conda. Install PyTorch 1.7.1+ and torchvision 0.8.2+:

conda install -c pytorch pytorch torchvision

Data preparation

Before going further, please check out Filip Radenovic's great repository on image retrieval. We use his code and model to extract features for training images. If you use this code in your research, please also cite their work! link to license

Download and extract rSfm120k train and val images with annotations from http://cmp.felk.cvut.cz/cnnimageretrieval/.

Download ROxf and RPar datastes with annotations. Prepare features for testing and training images with Filip Radenovic's model and code. We expect the directory structure to be the following:

path/to/data/
  ├─ annotations # annotation pkl files
  │   ├─ retrieval-SfM-120k.pkl
  │   ├─ roxford5k
  |   |   ├─ gnd_roxford5k.mat
  |   |   └─ gnd_roxford5k.pkl
  |   └─ rparis6k
  |   |   ├─ gnd_rparis6k.mat
  |   |   └─ gnd_rparis6k.pkl
  ├─ test # test features		
  |   ├─ r1m
  |   |   ├─ gl18-tl-resnet101-gem-w.pkl
  |   |   └─ rSfM120k-tl-resnet101-gem-w.pkl
  │   ├─ roxford5k
  |   |   ├─ gl18-tl-resnet101-gem-w.pkl
  |   |   └─ rSfM120k-tl-resnet101-gem-w.pkl
  |   └─ rparis6k
  |   |   ├─ gl18-tl-resnet101-gem-w.pkl
  |   |   └─ rSfM120k-tl-resnet101-gem-w.pkl
  └─ train # train features
      ├─ gl18-tl-resnet50-gem-w.pkl
      ├─ gl18-tl-resnet101-gem-w.pkl
      └─ gl18-tl-resnet152-gem-w.pkl

Training

To train baseline CSA on a single node with 4 gpus for 100 epochs run:

sh experiment_rSfm120k.sh

A single epoch takes 10 minutes, so 100 epoch training takes around 17 hours on a single machine with 4 2080Ti cards. To ease reproduction of our results we provide results and training logs for 200 epoch schedule (34 hours on a single machine).

We train CSA with SGD setting learning rate in the transformer to 0.1. The transformer is trained with dropout of 0.1, and the whole model is trained with grad clip of 1.0. To train CSA with data augmentation a single node with 4 gpus for 100 epochs run:

sh experiment_augrSfm120k.sh

Evaluation

To evaluate CSA on Roxf and Rparis with a single GPU run:

sh test.sh

and get results as below

>> Test Dataset: roxford5k *** fist-stage >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 67.3, Hard: 44.24
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [95.71 90.29 84.57], Hard: [87.14 69.71 59.86]

>> Test Dataset: roxford5k *** rerank-topk1024 >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 77.92, Hard: 58.43
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [94.29 93.14 89.71], Hard: [87.14 83.43 73.14]

>> Test Dataset: rparis6k *** fist-stage >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 80.57, Hard: 61.46
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [100.    98.    96.86], Hard: [97.14 93.14 90.57]

>> Test Dataset: rparis6k *** query-rerank-1024 >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 87.2, Hard: 74.41
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [100.    97.14  96.57], Hard: [95.71 92.86 90.14]

Qualitative examples

Selected qualitative examples of our re-ranking method. Top-10 results are shown in the figure. The figure is divided into four groups which consist of a result of initial retrieval and a result of our re-ranking method. The first two groups are the successful cases and the other two groups arethe failed cases. The images on the left with orange bounding boxes are the queries. The image with green denotes the true positives and the red bounding boxes are false positives. CSA

License

CSA is released under the MIT license. Please see the LICENSE file for more information.

Owner
Hui Wu
Department of Electronic Engineering and Information Science University of Science and Technology of China
Hui Wu
[ECCV'20] Convolutional Occupancy Networks

Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page | Blog Post This repository contains the implementation o

622 Dec 30, 2022
Airborne Optical Sectioning (AOS) is a wide synthetic-aperture imaging technique

AOS: Airborne Optical Sectioning Airborne Optical Sectioning (AOS) is a wide synthetic-aperture imaging technique that employs manned or unmanned airc

JKU Linz, Institute of Computer Graphics 39 Dec 09, 2022
Mercury: easily convert Python notebook to web app and share with others

Mercury Share your Python notebooks with others Easily convert your Python notebooks into interactive web apps by adding parameters in YAML. Simply ad

MLJAR 2.2k Dec 27, 2022
Official implementation of Long-Short Transformer in PyTorch.

Long-Short Transformer (Transformer-LS) This repository hosts the code and models for the paper: Long-Short Transformer: Efficient Transformers for La

NVIDIA Corporation 198 Dec 29, 2022
Simulating Sycamore quantum circuits classically using tensor network algorithm.

Simulating the Sycamore quantum supremacy circuit This repo contains data we have obtained in simulating the Sycamore quantum supremacy circuits with

Feng Pan 46 Nov 17, 2022
PyTorch implementation of Pay Attention to MLPs

gMLP PyTorch implementation of Pay Attention to MLPs. Quickstart Clone this repository. git clone https://github.com/jaketae/g-mlp.git Navigate to th

Jake Tae 34 Dec 13, 2022
Pytorch implementation of various High Dynamic Range (HDR) Imaging algorithms

Deep High Dynamic Range Imaging Benchmark This repository is the pytorch impleme

Tianhong Dai 5 Nov 16, 2022
Some experiments with tennis player aging curves using Hilbert space GPs in PyMC. Only experimental for now.

NOTE: This is still being developed! Setup notes This document uses Jeff Sackmann's tennis data. You can obtain it as follows: git clone https://githu

Martin Ingram 1 Jan 20, 2022
S-attack library. Official implementation of two papers "Are socially-aware trajectory prediction models really socially-aware?" and "Vehicle trajectory prediction works, but not everywhere".

S-attack library: A library for evaluating trajectory prediction models This library contains two research projects to assess the trajectory predictio

VITA lab at EPFL 71 Jan 04, 2023
Can we do Customers Segmentation using PHP and Unsupervized Machine Learning ? Yes we can ! 🤡

Customers Segmentation using PHP and Rubix ML PHP Library Can we do Customers Segmentation using PHP and Unsupervized Machine Learning ? Yes we can !

Mickaël Andrieu 11 Oct 08, 2022
Generating Radiology Reports via Memory-driven Transformer

R2Gen This is the implementation of Generating Radiology Reports via Memory-driven Transformer at EMNLP-2020. Citations If you use or extend our work,

CUHK-SZ NLP Group 101 Dec 13, 2022
Codebase for arXiv preprint "NeRF++: Analyzing and Improving Neural Radiance Fields"

NeRF++ Codebase for arXiv preprint "NeRF++: Analyzing and Improving Neural Radiance Fields" Work with 360 capture of large-scale unbounded scenes. Sup

Kai Zhang 722 Dec 28, 2022
X-VLM: Multi-Grained Vision Language Pre-Training

X-VLM: learning multi-grained vision language alignments Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts. Yan Zeng, Xi

Yan Zeng 286 Dec 23, 2022
StyleGAN2 with adaptive discriminator augmentation (ADA) - Official TensorFlow implementation

StyleGAN2 with adaptive discriminator augmentation (ADA) — Official TensorFlow implementation Training Generative Adversarial Networks with Limited Da

NVIDIA Research Projects 1.7k Dec 29, 2022
PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution' (CVPRW 2017)

About PyTorch 1.2.0 Now the master branch supports PyTorch 1.2.0 by default. Due to the serious version problem (especially torch.utils.data.dataloade

Sanghyun Son 2.1k Dec 27, 2022
PerfFuzz: Automatically Generate Pathological Inputs for C/C++ programs

PerfFuzz Performance problems in software can arise unexpectedly when programs are provided with inputs that exhibit pathological behavior. But how ca

Caroline Lemieux 125 Nov 18, 2022
Pytorch tutorials for Neural Style transfert

PyTorch Tutorials This tutorial is no longer maintained. Please use the official version: https://pytorch.org/tutorials/advanced/neural_style_tutorial

Alexis David Jacq 135 Jun 26, 2022
Code for our paper "MG-GAN: A Multi-Generator Model Preventing Out-of-Distribution Samples in Pedestrian Trajectory Prediction" published at ICCV 2021.

MG-GAN: A Multi-Generator Model Preventing Out-of-Distribution Samples in Pedestrian Trajectory Prediction This repository contains the code for the p

Sven 30 Jan 05, 2023
Speed-Test - You can check your intenet speed using this tool

Speed-Test Tool By Hez_X AVAILABLE ON : Termux & Kali linux & Ubuntu (Linux E

Hez-X 3 Feb 17, 2022
Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019)

Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019) Introduction Official implementation of Adaptive Pyramid Context Network

21 Nov 09, 2022