Large-Scale Unsupervised Object Discovery

Last update: Sep 19, 2022

Huy V. Vo, Elena Sizikova, Cordelia Schmid, Patrick Pérez, Jean Ponce [PDF]

We propose a novel ranking-based large-scale unsupervised object discovery algorithm that scales up to 1.7M images.
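To give a feel for what "ranking-based" can mean in this setting, the sketch below scores the nodes of a small similarity graph with a PageRank-style power iteration and sorts them by score. It is illustrative only and is not the code in this repository: the function name `pagerank_scores`, the damping value, and the toy similarity matrix are all assumptions made for the example. The graph construction and scoring actually used by LOD are described in the paper.

```python
def pagerank_scores(weights, damping=0.85, iterations=100):
    """Score graph nodes by power iteration on a weighted adjacency matrix.

    weights[i][j] is a non-negative similarity from node i to node j.
    (Hypothetical helper for illustration; not part of the LOD code base.)
    """
    n = len(weights)
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Every node receives a uniform "teleport" share of the total score.
        new_scores = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_sums[i] == 0:
                # Node without outgoing weight: spread its score uniformly.
                for j in range(n):
                    new_scores[j] += damping * scores[i] / n
            else:
                # Distribute node i's score in proportion to its edge weights.
                for j in range(n):
                    share = weights[i][j] / out_sums[i]
                    new_scores[j] += damping * scores[i] * share
        scores = new_scores
    return scores


# Made-up similarities between four candidate regions: nodes 0, 1 and 2
# resemble each other strongly, node 3 is only weakly linked to node 0.
similarity = [
    [0.0, 0.9, 0.6, 0.1],
    [0.9, 0.0, 0.5, 0.0],
    [0.6, 0.5, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
]
scores = pagerank_scores(similarity)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy graph the well-connected nodes end up at the top of the ranking and the weakly connected one at the bottom, which is the intuition behind treating highly ranked candidates as likely objects.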

This repository contains code used in the paper.

Quantitative Results

Installation

Follow INSTALL.md and DATA.md to install LOD and prepare data for running it.

Run LOD on a small toy dataset

Follow GETTING_STARTED_small_dataset.md to run LOD with VGG16 features on a 60-image subset of the Pascal VOC2007 dataset.

Getting Started

To run LOD on the C20K dataset, follow GETTING_STARTED.md for VGG16 features or GETTING_STARTED_OBOW.md for VGG16-based OBoW features.

Citations

@inproceedings{Vo21LOD,
  title     = {Large-Scale Unsupervised Object Discovery},
  author    = {Vo, Huy V. and Sizikova, Elena and Schmid, Cordelia and
               P{\'e}rez, Patrick and Ponce, Jean},
  booktitle = {Advances in Neural Information Processing Systems 34 (NeurIPS 2021)},
  year      = {2021},
}

Acknowledgments

This work was supported in part by the Inria/NYU collaboration, the Louis Vuitton/ENS chair on artificial intelligence, and the French government under the management of the Agence Nationale de la Recherche as part of the “Investissements d’avenir” program, reference ANR19-P3IA-0001 (PRAIRIE 3IA Institute). Elena Sizikova was supported by the Moore-Sloan Data Science Environment initiative (funded by the Alfred P. Sloan Foundation and the Gordon and Betty Moore Foundation) through the NYU Center for Data Science. Huy V. Vo was supported in part by a Valeo/Prairie CIFRE PhD Fellowship.

License

This project is licensed under the MIT License - see the LICENSE file for details.
