Pytorch implementation of the unsupervised object discovery method LOST.

Related tags

Deep LearningLOST
Overview

LOST

Pytorch implementation of the unsupervised object discovery method LOST. More details can be found in the paper:

Localizing Objects with Self-Supervised Transformers and no Labels [arXiv]
by Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet and Jean Ponce

LOST visualizations LOST visualizations


If you use the LOST code or framework in your research, please consider citing:

@article{LOST,
   title = {Localizing Objects with Self-Supervised Transformers and no Labels},
   author = {Oriane Sim\'eoni and Gilles Puy and Huy V. Vo and Simon Roburin and Spyros Gidaris and Andrei Bursuc and Patrick P\'erez and Renaud Marlet and Jean Ponce},
   journal = {arXiv preprint arXiv:2109.14279},
   month = {09},
   year = {2021}
}

Installation

Dependencies

This code was implemented with python 3.7, PyTorch 1.7.1 and CUDA 10.2. Please install PyTorch. In order to install the additionnal dependencies, please launch the following command:

pip install -r requirements.txt

Install DINO

This method is based on DINO paper. The framework can be installed using the following commands:

> __init__.py; cd ../; ">
git clone https://github.com/facebookresearch/dino.git
cd dino; 
touch __init__.py
echo -e "import sys\nfrom os.path import dirname, join\nsys.path.insert(0, join(dirname(__file__), '.'))" >> __init__.py; cd ../;

The code was made using the commit ba9edd1 of DINO repo (please rebase if breakage).

Apply LOST to one image

Following are scripts to apply LOST to an image defined via the image_path parameter and visualize the predictions (pred), the maps of the Figure 2 in the paper (fms) and the visulization of the seed expansion (seed_expansion). Box predictions are also stored in the output directory given by parameter output_dir.

python main_lost.py --image_path examples/VOC07_000236.jpg --visualize pred
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize fms
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize seed_expansion

Launching on datasets

Following are the different steps to reproduce the results of LOST presented in the paper.

PASCAL-VOC

Please download the PASCAL VOC07 and PASCAL VOC12 datasets (link) and put the data in the folder datasets. There should be the two subfolders: datasets/VOC2007 and datasets/VOC2012. In order to apply lost and compute corloc results (VOC07 61.9, VOC12 64.0), please launch:

python main_lost.py --dataset VOC07 --set trainval
python main_lost.py --dataset VOC12 --set trainval

COCO

Please download the COCO dataset and put the data in datasets/COCO. Results are provided given the 2014 annotations following previous works. The following command line allows you to get results on the subset of 20k images of the COCO dataset (corloc 50.7), following previous litterature. To be noted that the 20k images are a subset of the train set.

python main_lost.py --dataset COCO20k --set train

Different models

We have tested the method on different setups of the VIT model, corloc results are presented in the following table (more can be found in the paper).

arch pre-training dataset
VOC07 VOC12 COCO20k
ViT-S/16 DINO 61.9 64.0 50.7
ViT-S/8 DINO 55.5 57.0 49.5
ViT-B/16 DINO 60.1 63.3 50.0
ResNet50 DINO 36.8 42.7 26.5
ResNet50 Imagenet 33.5 39.1 25.5


Previous results on the dataset VOC07 can be obtained by launching:

python main_lost.py --dataset VOC07 --set trainval #VIT-S/16
python main_lost.py --dataset VOC07 --set trainval --patch_size 8 #VIT-S/8
python main_lost.py --dataset VOC07 --set trainval --arch vit_base #VIT-B/16
python main_lost.py --dataset VOC07 --set trainval --arch resnet50 #Resnet50/DINO
python main_lost.py --dataset VOC07 --set trainval --arch resnet50_imagenet #Resnet50/imagenet
Comments
  • Is LOST designed to perform well with DINO features specifically?

    Is LOST designed to perform well with DINO features specifically?

    I've replaced LOST's backbone (basically the dino weights) with the ones in CLIP, and it did not work well. But when switching back to dino weights, both ViT and ResNet50 backbone could generate good feature maps. Why would this happen?

    question 
    opened by zengyuy 3
  • Error in evaluation with Detectron2

    Error in evaluation with Detectron2

    Hi @osimeoni,

    Thank you for making the code available!

    When evaluating Detectron2 on VOC12 with the obtained pseudolables. I obtain the following error: AttributeError: "int object has no attribute 'value'. It seems that the coco_style_file is not registered by 'register_coco_instances' (see image underneath). Any idea how this can be fixed? Thanks.

    image

    opened by MarcVisions 2
  • Class-aware detection

    Class-aware detection

    Do you plan on releasing code for class-aware detection (i.e., to produce the results in Table 3 of https://arxiv.org/pdf/2109.14279.pdf)? I don't believe I see any of the necessary code for assigning object categories to boxes, but please correct me if I'm wrong.

    opened by gholste 2
  • Multi-object discovery

    Multi-object discovery

    HI, I have a confusion about the interesting work. How to perform multi-target discovery in the figure 1 (middle) of your paper? Any advice is greatly appreciated.

    question 
    opened by rgbd-zml 1
  • Lost not performing well using DINO with fine-tuning

    Lost not performing well using DINO with fine-tuning

    I’ve trained DINO’s model with my own Dataset, doing a finetuning on the ViT’s pre trained models of DINO. After a feel experiments I noticed that, every time that a epoch of the DINO’s finetune ran, the loss of the training reduce, however the IoU (the validation metric that we are using) of the bounding boxes generated by the LOST algorithm gets worse. Can anyone explain me why this is happening and how can I fix it?

    opened by ericyoshida 1
  • corLoc evaluation

    corLoc evaluation

    Hi @osimeoni . I am suspicious about the corLoc evaluation part in the code. The corLoc for each image is true whenever one of the ground truth objects is hit! https://github.com/valeoai/LOST/blob/2b678aca89c18aa79c56ec3f6d4a0b979a91608d/main_lost.py#L311 What about other objects? Is it right?

    opened by Mirsadeghi 1
Owner
Valeo.ai
The GitHub account of Valeo.ai
Valeo.ai
PyTorch implementation of 'Gen-LaneNet: a generalized and scalable approach for 3D lane detection'

(pytorch) Gen-LaneNet: a generalized and scalable approach for 3D lane detection Introduction This is a pytorch implementation of Gen-LaneNet, which p

Yuliang Guo 233 Jan 06, 2023
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
Baselines for TrajNet++

TrajNet++ : The Trajectory Forecasting Framework PyTorch implementation of Human Trajectory Forecasting in Crowds: A Deep Learning Perspective TrajNet

VITA lab at EPFL 183 Jan 05, 2023
Basics of 2D and 3D Human Pose Estimation.

Human Pose Estimation 101 If you want a slightly more rigorous tutorial and understand the basics of Human Pose Estimation and how the field has evolv

Sudharshan Chandra Babu 293 Dec 14, 2022
Open & Efficient for Framework for Aspect-based Sentiment Analysis

PyABSA - Open & Efficient for Framework for Aspect-based Sentiment Analysis Fast & Low Memory requirement & Enhanced implementation of Local Context F

YangHeng 567 Jan 07, 2023
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Introduction This is an official implementation of CvT: Introducing Convolutions to Vision Transformers. We present a new architecture, named Convolut

Microsoft 408 Dec 30, 2022
Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation (CVPR 2020)

Super-BPD for Fast Image Segmentation (CVPR 2020) Introduction We propose direction-based super-BPD, an alternative to superpixel, for fast generic im

189 Dec 07, 2022
To prepare an image processing model to classify the type of disaster based on the image dataset

Disaster Classificiation using CNNs bunnysaini/Disaster-Classificiation Goal To prepare an image processing model to classify the type of disaster bas

Bunny Saini 1 Jan 24, 2022
It's like Shape Editor in Maya but works with skeletons (transforms).

Skeleposer What is Skeleposer? Briefly, it's like Shape Editor in Maya, but works with transforms and joints. It can be used to make complex facial ri

Alexander Zagoruyko 1 Nov 11, 2022
Spline is a tool that is capable of running locally as well as part of well known pipelines like Jenkins (Jenkinsfile), Travis CI (.travis.yml) or similar ones.

Welcome to spline - the pipeline tool Important note: Since change in my job I didn't had the chance to continue on this project. My main new project

Thomas Lehmann 29 Aug 22, 2022
Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.

SAFA: Structure Aware Face Animation (3DV2021) Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation. Getting Started

QiulinW 122 Dec 23, 2022
FLSim a flexible, standalone library written in PyTorch that simulates FL settings with a minimal, easy-to-use API

Federated Learning Simulator (FLSim) is a flexible, standalone core library that simulates FL settings with a minimal, easy-to-use API. FLSim is domain-agnostic and accommodates many use cases such a

Meta Research 162 Jan 02, 2023
This repository is an unoffical PyTorch implementation of Medical segmentation in 3D and 2D.

Pytorch Medical Segmentation Read Chinese Introduction:Here! Recent Updates 2021.1.8 The train and test codes are released. 2021.2.6 A bug in dice was

EasyCV-Ellis 618 Dec 27, 2022
The official PyTorch code implementation of "Personalized Trajectory Prediction via Distribution Discrimination" in ICCV 2021.

Personalized Trajectory Prediction via Distribution Discrimination (DisDis) The official PyTorch code implementation of "Personalized Trajectory Predi

25 Dec 20, 2022
unet for image segmentation

Implementation of deep learning framework -- Unet, using Keras The architecture was inspired by U-Net: Convolutional Networks for Biomedical Image Seg

zhixuhao 4.1k Dec 31, 2022
A highly efficient and modular implementation of Gaussian Processes in PyTorch

GPyTorch GPyTorch is a Gaussian process library implemented using PyTorch. GPyTorch is designed for creating scalable, flexible, and modular Gaussian

3k Jan 02, 2023
Code and datasets for the paper "Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction" (RA-L, 2021)

Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction This is the code for the paper Combining E

Robotics and Perception Group 69 Dec 26, 2022
Autonomous racing with the Anki Overdrive

Anki Autonomous Racing Autonomous racing with the Anki Overdrive. Using the Overdrive-Python API (https://github.com/xerodotc/overdrive-python) develo

3 Dec 11, 2022
Texture mapping with variational auto-encoders

vae-textures This is an experiment with using variational autoencoders (VAEs) to perform mesh parameterization. This was also my first project using J

Alex Nichol 41 May 24, 2022
Real-Time Semantic Segmentation in Mobile device

Real-Time Semantic Segmentation in Mobile device This project is an example project of semantic segmentation for mobile real-time app. The architectur

708 Jan 01, 2023