General Vision Benchmark, a project from OpenGVLab

Overview

Introduction

  • We build GV-B(General Vision Benchmark) on Classification, Detection, Segmentation and Depth Estimation including 26 datasets for model evaluation.
  • It is recommended to evaluate with low-data regime, using only 10% training data.
  • The parameters of model backbone will be frozen during training, as known as 'linear probe'.
  • Face Detection and Depth Estimation is not provided for now, you may evaluate via official repo if needed.
  • Specifically, we use central_model.py in our repo to represent the implementation of Up-G models.

Task Supported

  • Object Classification
  • Object Detection (VOC Detection)
  • Pedestrian Detection (CityPersons Detection)
  • Semantic Segmentation (VOC Segmentation)
  • Face Detection (WiderFace Detection)
  • Depth Estimation (Kitti/NYU-v2 Depth Estimation)

Installation

Requirements

Install Dependencies

a. Create a conda virtual environment and activate it.

conda create -n open-mmlab python=3.8 -y
conda activate open-mmlab

b. Install PyTorch and torchvision following the official instructions, e.g.:

conda install pytorch torchvision -c pytorch
Make sure that your compilation CUDA version and runtime CUDA version match.
You can check the supported CUDA version for precompiled packages on the
[PyTorch website](https://pytorch.org/).

c. Install openmm package via pip (mmcls, mmdet, mmseg):

pip install mmcls
pip install mmdet
pip install mmsegmetation

Usage

This section provide basic tutorials about the usage of GV-B.

Prepare datasets

For each evaluation task, you can follow the official repo tutorial for data preparation.

mmclassification

mmdetection

mmsegmentation

Model evaluation

We use MIM to submit evaluation in GV-B.

a.If you run MMClassification on a cluster managed with slurm, you can use the script mim_slurm_train.sh. (This script also supports single machine training.)

sh tools/mim_slurm_train.sh $PARTITION $TASK $CONFIG $WORK_DIR

b.If you run on w/o slurm. (More details can be found in docs of openmim)

PYTHONPATH='.':$PYTHONPATH mim train $TASK $CONFIG $WORK_DIR
  • PARTITION: The partition you are using
  • WORK_DIR: The directory to save logs and checkpoints
  • CONFIG: Config files corresponding to tasks.

Detailed Tutorials

Currently, we provide tutorials for users.

Benchmark(with Hyperparameter searching)

CLS DET SEG DEP
10% data Cifar10 Cifar100 Food Pets Flowers Sun Cars Dtd Caltech Aircraft Svhn Eurosat Resisc45 Retinopathy Fer2013 Ucf101 Gtsrb Pcam Imagenet Kinetics700 VOC07+12 WIDER FACE CityPersons VOC2012 KITTI NYUv2
Up-A R50 92.4 73.5 75.8 85.7 94.6 57.9 52.7 65.0 88.5 28.7 61.4 93.8 82.9 73.8 55.0 71.1 75.1 82.9 71.9 35.2 76.3 90.3/88.3/70.7 24.6/59.0 62.54 3.181 0.456
MN-B4 96.1 82.9 84.3 89.8 98.3 66.0 61.4 66.8 92.8 32.5 60.4 92.7 85.8 75.6 56.5 76.9 74.4 84.3 77.2 39.4 74.9 89.3/87.6/71.4 26.5/61.8 65.71 3.565 0.482
MN-B15 98.2 87.8 93.9 92.8 99.6 72.3 59.4 70.0 93.8 64.8 58.6 95.3 91.9 77.9 62.8 85.4 76.2 87.8 86.0 52.9 78.4 93.6/91.8/77.2 17.7/49.5 60.68 2.423 0.383
Up-E C-R50 91.9 71.2 80.7 88.8 94.0 57.4 67.9 62.7 85.5 73.9 57.6 93.7 83.6 75.4 54.1 69.6 73.9 85.7 72.5 34.6 72.2 89.7/87.6/68.1 22.4/58.3 57.66 3.214 0.501
D-R50 86.4 57.3 53.9 31.4 44.0 39.8 8.6 44.6 72.5 15.8 64.2 89.1 72.8 73.6 46.6 57.4 67.5 81.7 45.0 25.2 87.7 93.8/92.0/75.5 15.8/41.5 62.3 3.09 0.45
S-R50 78.3 46.6 45.1 24.2 33.9 38.0 5.0 41.4 50.2 8.5 51.5 89.9 76.4 74.0 44.8 42.0 64.0 80.8 34.9 19.7 75.0 87.4/85.7/66.4 19.6/53.3 71.9 3.12 0.45
C-MN-B4 96.7 83.2 89.2 91.9 98.2 66.7 67.7 66.3 91.9 77.2 57.8 94.4 88.0 77.0 56.6 78.5 77.3 85.6 80.5 44.2 73.7 89.6/88.0/71.1 30.3/65.0 65.8 3.54 0.46
D-MN-B4 91.5 67.0 61.4 44.4 57.2 41.8 12.1 41.2 80.6 25.1 68.0 90.7 74.6 74.3 50.3 61.7 74.2 81.9 57.0 29.3 89.3 94.6/92.6/76.5 14.0/43.8 73.1 3.05 0.40
S-MN-B4 83.5 57.2 68.3 70.8 85.8 52.9 25.9 52.8 81.6 17.7 56.1 91.3 83.6 74.5 49.0 55.2 68.0 84.3 61.0 27.4 78.7 89.5/87.9/71.4 19.4/53.0 79.6 3.06 0.41
C-MN-B-15 98.7 90.1 94.7 95.1 99.7 75.7 74.9 73.6 94.4 91.8 66.7 96.2 92.8 77.6 62.3 87.7 83.3 87.5 87.2 54.7 80.4 93.2/91.4/75.7 29.5/59.9 70.6 2.63 0.37
D-MN-B-15 92.2 67.9 69.0 33.9 59.5 45.4 13.8 46.3 82.0 26.6 65.4 90.1 79.1 76.0 53.2 63.7 74.4 83.3 62.2 33.7 89.4 95.8/94.4/80.1 10.5/42.4 77.2 2.72 0.37
Up-G R50 92.9 73.7 81.1 88.9 94.0 58.6 68.6 63.0 86.1 74.0 57.9 94.4 84.0 75.7 54.3 70.8 74.3 85.9 72.6 34.8 87.7 93.9/92.2/77.0 14.7/46.0 66.19 2.835 0.39
MN-B4 96.7 83.9 89.2 92.1 98.2 66.7 67.7 66.5 91.9 77.2 57.8 94.4 88.0 77.0 57.1 79 77.7 86 80.5 44.2 89.1 94.9/92.8/76.5 12.0/50.5 71.4 2.94 0.40
MN-B15 98.7 90.4 94.5 95.4 99.7 74.4 75.4 74.2 94.5 91.8 66.7 96.3 92.7 77.9 63.1 88 83.6 88 87.1 54.7 89.8 95.9/94.2/79.6 10.5/41.3 77.3 2.71 0.37
Code for the paper: Adversarial Machine Learning: Bayesian Perspectives

Code for the paper: Adversarial Machine Learning: Bayesian Perspectives This repository contains code for reproducing the experiments in the ** Advers

Roi Naveiro 2 Nov 11, 2022
Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting (HMNet)

Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting (HMNet) Our paper: https://arxiv.org/abs/2111.13324 We will release the complet

15 Oct 17, 2022
The first dataset of composite images with rationality score indicating whether the object placement in a composite image is reasonable.

Object-Placement-Assessment-Dataset-OPA Object-Placement-Assessment (OPA) is to verify whether a composite image is plausible in terms of the object p

BCMI 53 Nov 15, 2022
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.

chitra What is chitra? chitra (चित्र) is a multi-functional library for full-stack Deep Learning. It simplifies Model Building, API development, and M

Aniket Maurya 210 Dec 21, 2022
Spontaneous Facial Micro Expression Recognition using 3D Spatio-Temporal Convolutional Neural Networks

Spontaneous Facial Micro Expression Recognition using 3D Spatio-Temporal Convolutional Neural Networks Abstract Facial expression recognition in video

Bogireddy Sai Prasanna Teja Reddy 103 Dec 29, 2022
Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch.

SE3 Transformer - Pytorch Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch. May be needed for replicating Alphafold2 resu

Phil Wang 207 Dec 23, 2022
Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks

PyTorch code to reproduce LyDROO algorithm [1], which is an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability an

Liang HUANG 87 Dec 28, 2022
BBB streaming without Xorg and Pulseaudio and Chromium and other nonsense (heavily WIP)

BBB Streamer NG? Makes a conference like this... ...streamable like this! I also recorded a small video showing the basic features: https://www.youtub

Lukas Schauer 60 Oct 21, 2022
CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
This is an implementation of Googles Yogi-Optimizer in Keras (tf.keras)

Yogi-Optimizer_Keras This is an implementation of Googles Yogi-Optimizer in Keras (tf.keras) The NeurIPS-Paper can be found here: http://papers.nips.c

14 Sep 13, 2022
ManipulaTHOR, a framework that facilitates visual manipulation of objects using a robotic arm

ManipulaTHOR: A Framework for Visual Object Manipulation Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha

AI2 65 Dec 30, 2022
NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021)

NeuralWOZ This code is official implementation of "NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation". Sungdong Kim, Mi

NAVER AI 31 Oct 25, 2022
Source code for Transformer-based Multi-task Learning for Disaster Tweet Categorisation (UCD's participation in TREC-IS 2020A, 2020B and 2021A).

Source code for "UCD participation in TREC-IS 2020A, 2020B and 2021A". *** update at: 2021/05/25 This repo so far relates to the following work: Trans

Congcong Wang 4 Oct 19, 2021
Header-only library for using Keras models in C++.

frugally-deep Use Keras models in C++ with ease Table of contents Introduction Usage Performance Requirements and Installation FAQ Introduction Would

Tobias Hermann 927 Jan 05, 2023
Python inverse kinematics for your robot model based on Pinocchio.

Python inverse kinematics for your robot model based on Pinocchio.

Stéphane Caron 50 Dec 22, 2022
This is a file about Unet implemented in Pytorch

Unet this is an implemetion of Unet in Pytorch and it's architecture is as follows which is the same with paper of Unet component of Unet Convolution

Dragon 1 Dec 03, 2021
Lipschitz-constrained Unsupervised Skill Discovery

Lipschitz-constrained Unsupervised Skill Discovery This repository is the official implementation of Seohong Park, Jongwook Choi*, Jaekyeom Kim*, Hong

Seohong Park 17 Dec 18, 2022
Learning from Synthetic Data with Fine-grained Attributes for Person Re-Identification

Less is More: Learning from Synthetic Data with Fine-grained Attributes for Person Re-Identification Suncheng Xiang Shanghai Jiao Tong University Over

SunchengXiang 68 Dec 13, 2022
Breaking the Dilemma of Medical Image-to-image Translation

Breaking the Dilemma of Medical Image-to-image Translation Supervised Pix2Pix and unsupervised Cycle-consistency are two modes that dominate the field

Kid Liet 86 Dec 21, 2022
A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Duplicate Image Detection Getting Started Install dependencies pip install -r requirements.txt Run service python main.py Testing Test with pytest How

Matthew Podolak 21 Nov 11, 2022