Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

Overview

A Self-Supervised Descriptor for Image Copy Detection (SSCD)

This is the open-source codebase for "A Self-Supervised Descriptor for Image Copy Detection", recently accepted to CVPR 2022.

This work uses self-supervised contrastive learning with strong differential entropy regularization to create a fingerprint for image copy detection.

SSCD diagram

About this codebase

This implementation is built on Pytorch Lightning, with some components from Classy Vision.

Our original experiments were conducted in a proprietary codebase using data files (fonts and emoji) that are not licensed for redistribution. This version uses Noto fonts and Twemoji emoji, via the AugLy project. As a result, models trained in this codebase perform slightly differently than our pretrained models.

Pretrained models

We provide trained models from our original experiments to allow others to reproduce our evaluation results.

For convenience, we provide equivalent model files in a few formats:

  • Files ending in .classy.pt are weight files using Classy Vision ResNe(X)t backbones, which is how these models were trained.
  • Files ending in .torchvision.pt are weight files using Torchvision ResNet backbones. These files may be easier to integrate in Torchvision-based codebases. See model.py for how we integrate GeM pooling and L2 normalization into these models.
  • Files ending in .torchscript.pt are standalone TorchScript models that can be used in any pytorch project without any SSCD code.

We provide the following models:

name dataset trunk augmentations dimensions classy vision torchvision torchscript
sscd_disc_blur DISC ResNet50 strong blur 512 link link link
sscd_disc_advanced DISC ResNet50 advanced 512 link link link
sscd_disc_mixup DISC ResNet50 advanced + mixup 512 link link link
sscd_disc_large DISC ResNeXt101 32x4 advanced + mixup 1024 link link
sscd_imagenet_blur ImageNet ResNet50 strong blur 512 link link link
sscd_imagenet_advanced ImageNet ResNet50 advanced 512 link link link
sscd_imagenet_mixup ImageNet ResNet50 advanced + mixup 512 link link link

We recommend sscd_disc_mixup (ResNet50) as a default SSCD model, especially when comparing to other standard ResNet50 models, and sscd_disc_large (ResNeXt101) as a higher accuracy alternative using a bit more compute.

Classy Vision and Torchvision use different default cardinality settings for ResNeXt101. We do not provide a Torchvision version of the sscd_disc_large model for this reason.

Installation

If you only plan to use torchscript models for inference, no installation steps are necessary, and any environment with a recent version of pytorch installed can run our torchscript models.

For all other uses, see installation steps below.

The code is written for pytorch-lightning 1.5 (the latest version at time of writing), and may need changes for future Lightning versions.

Option 1: Install dependencies using Conda

Install and activate conda, then create a conda environment for SSCD as follows:

# Create conda environment
conda create --name sscd -c pytorch -c conda-forge \
  pytorch torchvision cudatoolkit=11.3 \
  "pytorch-lightning>=1.5,<1.6" lightning-bolts \
  faiss python-magic pandas numpy

# Activate environment
conda activate sscd

# Install Classy Vision and AugLy from PIP:
python -m pip install classy_vision augly

You may need to select a cudatoolkit version that corresponds to the system CUDA library version you have installed. See PyTorch documentation for supported combinations of pytorch, torchvision and cudatoolkit versions.

For a non-CUDA (CPU only) installation, replace cudatoolkit=... with cpuonly.

Option 2: Install dependencies using PIP

# Create environment
python3 -m virtualenv ./venv

# Activate environment
source ./venv/bin/activate

# Install dependencies in this environment
python -m pip install -r ./requirements.txt --extra-index-url https://download.pytorch.org/whl/cu113

The --extra-index-url option selects a newer version of CUDA libraries, required for NVidia A100 GPUs. This can be omitted if A100 support is not needed.

Inference using SSCD models

This section describes how to use pretrained SSCD models for inference. To perform inference for DISC and Copydays evaluations, see Evaluation.

Preprocessing

We recommend preprocessing images for inference either resizing the small edge to 288 or resizing the image to a square tensor.

Using fixed-sized square tensors is more efficient on GPUs, to make better use of batching. Copy detection using square tensors benefits from directly resizing to the target tensor size. This skews the image, and does not preserve aspect ratio. This differs from the common practice for classification inference.

from torchvision import transforms

normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225],
)
small_288 = transforms.Compose([
    transforms.Resize(288),
    transforms.ToTensor(),
    normalize,
])
skew_320 = transforms.Compose([
    transforms.Resize([320, 320]),
    transforms.ToTensor(),
    normalize,
])

Inference using Torchscript

Torchscript files can be loaded directly in other projects without any SSCD code or dependencies.

import torch
from PIL import Image

model = torch.jit.load("/path/to/sscd_disc_mixup.torchscript.pt")
img = Image.open("/path/to/image.png").convert('RGB')
batch = small_288(img).unsqueeze(0)
embedding = model(batch)[0, :]

These Torchscript models are prepared for inference. For other uses (eg. fine-tuning), use model weight files, as described below.

Load model weight files

To load model weight files, first construct the Model object, then load the weights using the standard torch.load and load_state_dict methods.

import torch
from sscd.models.model import Model

model = Model("CV_RESNET50", 512, 3.0)
weights = torch.load("/path/to/sscd_disc_mixup.classy.pt")
model.load_state_dict(weights)
model.eval()

Once loaded, these models can be used interchangeably with Torchscript models for inference.

Model backbone strings can be found in the Backbone enum in model.py. Classy Vision models start with the prefix CV_ and Torchvision models start with TV_.

Using SSCD descriptors

SSCD models produce 512 dimension (except the "large" model, which uses 1024 dimensions) L2 normalized descriptors for each input image. The similarity of two images with descriptors a and b can be measured by descriptor cosine similarity (a.dot(b); higher is more similar), or equivalently using euclidean distance ((a-b).norm(); lower is more similar).

For the sscd_disc_mixup model, DISC image pairs with embedding cosine similarity greater than 0.75 are copies with 90% precision, for example. This corresponds to a euclidean distance less than 0.7, or squared euclidean distance less than 0.5.

Descriptor post-processing

For best results, we recommend additional descriptor processing when sample images from the target distribution are available. Centering (subtracting the mean) followed by L2 normalization, or whitening followed by L2 normalization, can improve accuracy.

Score normalization can make similarity more consistent and improve global accuracy metrics (but has no effect on ranking metrics).

Other model formats

If pretrained models in another format (eg. ONYX) would be useful for you, let us know by filing a feature request.

Reproducing evaluation results

To reproduce evaluation results, see Evaluation.

Training SSCD models

For information on how to train SSCD models, see Training.

License

The SSCD codebase uses the CC-NC 4.0 International license.

Citation

If you find our codebase useful, please consider giving a star and cite as:

@article{pizzi2022self,
  title={A Self-Supervised Descriptor for Image Copy Detection},
  author={Pizzi, Ed and Roy, Sreya Dutta and Ravindra, Sugosh Nagavara and Goyal, Priya and Douze, Matthijs},
  journal={Proc. CVPR},
  year={2022}
}
Owner
Meta Research
Meta Research
Pytorch code for semantic segmentation using ERFNet

ERFNet (PyTorch version) This code is a toolbox that uses PyTorch for training and evaluating the ERFNet architecture for semantic segmentation. For t

Edu 394 Jan 01, 2023
Ansible Automation Example: JSNAPY PRE/POST Upgrade Validation

Ansible Automation Example: JSNAPY PRE/POST Upgrade Validation Overview This example will show how to validate the status of our firewall before and a

Calvin Remsburg 1 Jan 07, 2022
This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search"

InvariantAncestrySearch This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search

Phillip Bredahl Mogensen 0 Feb 02, 2022
Adversarial Self-Defense for Cycle-Consistent GANs

Adversarial Self-Defense for Cycle-Consistent GANs This is the official implementation of the CycleGAN robust to self-adversarial attacks used in pape

Dina Bashkirova 10 Oct 10, 2022
Mask-invariant Face Recognition through Template-level Knowledge Distillation

Mask-invariant Face Recognition through Template-level Knowledge Distillation This is the official repository of "Mask-invariant Face Recognition thro

Fadi Boutros 35 Dec 06, 2022
A PyTorch Implementation of SphereFace.

SphereFace A PyTorch Implementation of SphereFace. The code can be trained on CASIA-Webface and the best accuracy on LFW is 99.22%. SphereFace: Deep H

carwin 685 Dec 09, 2022
Implementation of our paper 'RESA: Recurrent Feature-Shift Aggregator for Lane Detection' in AAAI2021.

RESA PyTorch implementation of the paper "RESA: Recurrent Feature-Shift Aggregator for Lane Detection". Our paper has been accepted by AAAI2021. Intro

137 Jan 02, 2023
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

English | 简体中文 Easy Parallel Library Overview Easy Parallel Library (EPL) is a general and efficient library for distributed model training. Usability

Alibaba 185 Dec 21, 2022
Cereal box identification in store shelves using computer vision and a single train image per model.

Product Recognition on Store Shelves Description You can read the task description here. Report You can read and download our report here. Step A - Mu

Nicholas Baraghini 1 Jan 21, 2022
Self-driving car env with PPO algorithm from stable baseline3

Self-driving car with RL stable baseline3 Most of the project develop from https://github.com/GerardMaggiolino/Gym-Medium-Post Please check it out! Th

Sornsiri.P 7 Dec 22, 2022
This repo generates the training data and the model for Morpheus-Deblend

Morpheus-Deblend This repo generates the training data and the model for Morpheus-Deblend. This is the active development repo for the project and as

Ryan Hausen 2 Apr 18, 2022
A rough implementation of the paper "A Steering Algorithm for Redirected Walking Using Reinforcement Learning"

A rough implementation of the paper "A Steering Algorithm for Redirected Walking Using Reinforcement Learning"

Somnus `Chen 2 Jun 09, 2022
A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️

hf-hub-lightning A callback for pushing lightning models to the Hugging Face Hub. Note: I made this package for myself, mostly...if folks seem to be i

Nathan Raw 27 Dec 14, 2022
Code for CVPR2021 paper 'Where and What? Examining Interpretable Disentangled Representations'.

PS-SC GAN This repository contains the main code for training a PS-SC GAN (a GAN implemented with the Perceptual Simplicity and Spatial Constriction c

Xinqi/Steven Zhu 40 Dec 16, 2022
An open-source outlier detection package by Getcontact Data Team

pyfbad The pyfbad library supports anomaly detection projects. An end-to-end anomaly detection application can be written using the source codes of th

Teknasyon Tech 41 Dec 27, 2022
Code accompanying "Dynamic Neural Relational Inference" from CVPR 2020

Code accompanying "Dynamic Neural Relational Inference" This codebase accompanies the paper "Dynamic Neural Relational Inference" from CVPR 2020. This

Colin Graber 48 Dec 23, 2022
Angular & Electron desktop UI framework. Angular components for native looking and behaving macOS desktop UI (Electron/Web)

Angular Desktop UI This is a collection for native desktop like user interface components in Angular, especially useful for Electron apps. It starts w

Marc J. Schmidt 49 Dec 22, 2022
Find-Lane-Line - Use openCV library and Python to detect the road-lane-line

Find-Lane-Line This project is to use openCV library and Python to detect the road-lane-line. Data Pipeline Step one : Color Selection Step two : Cann

Kenny Cheng 3 Aug 17, 2022
Official pytorch implementation of Active Learning for deep object detection via probabilistic modeling (ICCV 2021)

Active Learning for Deep Object Detection via Probabilistic Modeling This repository is the official PyTorch implementation of Active Learning for Dee

NVIDIA Research Projects 130 Jan 06, 2023
Dynamica causal Bayesian optimisation

Dynamic Causal Bayesian Optimization This is a Python implementation of Dynamic Causal Bayesian Optimization as presented at NeurIPS 2021. Abstract Th

nd308 18 Nov 22, 2022