GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

Overview

GCNet for Object Detection

PWC PWC PWC PWC

By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.

This repo is a official implementation of "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" on COCO object detection based on open-mmlab's mmdetection. The core operator GC block could be find here. Many thanks to mmdetection for their simple and clean framework.

Update on 2020/12/07

The extension of GCNet got accepted by TPAMI (PDF).

Update on 2019/10/28

GCNet won the Best Paper Award at ICCV 2019 Neural Architects Workshop!

Update on 2019/07/01

The code is refactored. More results are provided and all configs could be found in configs/gcnet.

Notes: Both PyTorch official SyncBN and Apex SyncBN have some stability issues. During training, mAP may drops to zero and back to normal during last few epochs.

Update on 2019/06/03

GCNet is supported by the official mmdetection repo here. Thanks again for open-mmlab's work on open source projects.

Introduction

GCNet is initially described in arxiv. Via absorbing advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach for global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.

Citing GCNet

@article{cao2019GCNet,
  title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
  author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
  journal={arXiv preprint arXiv:1904.11492},
  year={2019}
}

Main Results

Results on R50-FPN with backbone (fixBN)

Back-bone Model Back-bone Norm Heads Context Lr schd Mem (GB) Train time (s/iter) Inf time (fps) box AP mask AP Download
R50-FPN Mask fixBN 2fc (w/o BN) - 1x 3.9 0.453 10.6 37.3 34.2 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r16) 1x 4.5 0.533 10.1 38.5 35.1 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r4) 1x 4.6 0.533 9.9 38.9 35.5 model
R50-FPN Mask fixBN 2fc (w/o BN) - 2x - - - 38.2 34.9 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r16) 2x - - - 39.7 36.1 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r4) 2x - - - 40.0 36.2 model

Results on R50-FPN with backbone (syncBN)

Back-bone Model Back-bone Norm Heads Context Lr schd Mem (GB) Train time (s/iter) Inf time (fps) box AP mask AP Download
R50-FPN Mask SyncBN 2fc (w/o BN) - 1x 3.9 0.543 10.2 37.2 33.8 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x 4.5 0.547 9.9 39.4 35.7 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x 4.6 0.603 9.4 39.9 36.2 model
R50-FPN Mask SyncBN 2fc (w/o BN) - 2x 3.9 0.543 10.2 37.7 34.3 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 2x 4.5 0.547 9.9 39.7 36.0 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 2x 4.6 0.603 9.4 40.2 36.3 model
R50-FPN Mask SyncBN 4conv1fc (SyncBN) - 1x - - - 38.8 34.6 model
R50-FPN Mask SyncBN 4conv1fc (SyncBN) GC(c3-c5, r16) 1x - - - 41.0 36.5 model
R50-FPN Mask SyncBN 4conv1fc (SyncBN) GC(c3-c5, r4) 1x - - - 41.4 37.0 model

Results on stronger backbones

Back-bone Model Back-bone Norm Heads Context Lr schd Mem (GB) Train time (s/iter) Inf time (fps) box AP mask AP Download
R101-FPN Mask fixBN 2fc (w/o BN) - 1x 5.8 0.571 9.5 39.4 35.9 model
R101-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r16) 1x 7.0 0.731 8.6 40.8 37.0 model
R101-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r4) 1x 7.1 0.747 8.6 40.8 36.9 model
R101-FPN Mask SyncBN 2fc (w/o BN) - 1x 5.8 0.665 9.2 39.8 36.0 model
R101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x 7.0 0.778 9.0 41.1 37.4 model
R101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x 7.1 0.786 8.9 41.7 37.6 model
X101-FPN Mask SyncBN 2fc (w/o BN) - 1x 7.1 0.912 8.5 41.2 37.3 model
X101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x 8.2 1.055 7.7 42.4 38.0 model
X101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x 8.3 1.037 7.6 42.9 38.5 model
X101-FPN Cascade Mask SyncBN 2fc (w/o BN) - 1x - - - 44.7 38.3 model
X101-FPN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x - - - 45.9 39.3 model
X101-FPN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x - - - 46.5 39.7 model
X101-FPN DCN Cascade Mask SyncBN 2fc (w/o BN) - 1x - - - 47.1 40.4 model
X101-FPN DCN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x - - - 47.9 40.9 model
X101-FPN DCN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x - - - 47.9 40.8 model

Notes

  • GC denotes Global Context (GC) block is inserted after 1x1 conv of backbone.
  • DCN denotes replace 3x3 conv with 3x3 Deformable Convolution in c3-c5 stages of backbone.
  • r4 and r16 denote ratio 4 and ratio 16 in GC block respectively.
  • Some of models are trained on 4 GPUs with 4 images on each GPU.

Requirements

  • Linux(tested on Ubuntu 16.04)
  • Python 3.6+
  • PyTorch 1.1.0
  • Cython
  • apex (Sync BN)

Install

a. Install PyTorch 1.1 and torchvision following the official instructions.

b. Install latest apex with CUDA and C++ extensions following this instructions. The Sync BN implemented by apex is required.

c. Clone the GCNet repository.

 git clone https://github.com/xvjiarui/GCNet.git 

d. Compile cuda extensions.

cd GCNet
pip install cython  # or "conda install cython" if you prefer conda
./compile.sh  # or "PYTHON=python3 ./compile.sh" if you use system python3 without virtual environments

e. Install GCNet version mmdetection (other dependencies will be installed automatically).

python(3) setup.py install  # add --user if you want to install it locally
# or "pip install ."

Note: You need to run the last step each time you pull updates from github. Or you can run python(3) setup.py develop or pip install -e . to install mmdetection if you want to make modifications to it frequently.

Please refer to mmdetection install instruction for more details.

Environment

Hardware

  • 8 NVIDIA Tesla V100 GPUs
  • Intel Xeon 4114 CPU @ 2.20GHz

Software environment

  • Python 3.6.7
  • PyTorch 1.1.0
  • CUDA 9.0
  • CUDNN 7.0
  • NCCL 2.3.5

Usage

Train

As in original mmdetection, distributed training is recommended for either single machine or multiple machines.

./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]

Supported arguments are:

  • --validate: perform evaluation every k (default=1) epochs during the training.
  • --work_dir <WORK_DIR>: if specified, the path in config file will be replaced.

Evaluation

To evaluate trained models, output file is required.

python tools/test.py <CONFIG_FILE> <MODEL_PATH> [optional arguments]

Supported arguments are:

  • --gpus: number of GPU used for evaluation
  • --out: output file name, usually ends wiht .pkl
  • --eval: type of evaluation need, for mask-rcnn, bbox segm would evaluate both bounding box and mask AP.
Owner
Jerry Jiarui XU
Part of the journey is the end
Jerry Jiarui XU
Motion Reconstruction Code and Data for Skills from Videos (SFV)

Motion Reconstruction Code and Data for Skills from Videos (SFV) This repo contains the data and the code for motion reconstruction component of the S

268 Dec 01, 2022
Official repository for the paper "Instance-Conditioned GAN"

Official repository for the paper "Instance-Conditioned GAN" by Arantxa Casanova, Marlene Careil, Jakob Verbeek, Michał Drożdżal, Adriana Romero-Soriano.

Facebook Research 510 Dec 30, 2022
Learning Versatile Neural Architectures by Propagating Network Codes

Learning Versatile Neural Architectures by Propagating Network Codes Mingyu Ding, Yuqi Huo, Haoyu Lu, Linjie Yang, Zhe Wang, Zhiwu Lu, Jingdong Wang,

Mingyu Ding 36 Dec 06, 2022
MMdet2-based reposity about lightweight detection model: Nanodet, PicoDet.

Lightweight-Detection-and-KD MMdet2-based reposity about lightweight detection model: Nanodet, PicoDet. This repo also includes detection knowledge di

Egqawkq 12 Jan 05, 2023
Noether Networks: meta-learning useful conserved quantities

Noether Networks: meta-learning useful conserved quantities This repository contains the code necessary to reproduce experiments from "Noether Network

Dylan Doblar 33 Nov 23, 2022
Pytorch implementation of "Neural Wireframe Renderer: Learning Wireframe to Image Translations"

Neural Wireframe Renderer: Learning Wireframe to Image Translations Pytorch implementation of ideas from the paper Neural Wireframe Renderer: Learning

Yuan Xue 7 Nov 14, 2022
SegNet-Basic with Keras

SegNet-Basic: What is Segnet? Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-wise Image Segmentation Segnet = (Encoder + Decoder)

Yad Konrad 81 Jun 30, 2022
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"

EgoNet Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation". This repo inclu

Shichao Li 138 Dec 09, 2022
Learning Saliency Propagation for Semi-supervised Instance Segmentation

Learning Saliency Propagation for Semi-supervised Instance Segmentation PyTorch Implementation This repository contains: the PyTorch implementation of

Berkeley DeepDrive 68 Oct 18, 2022
Human Pose estimation with TensorFlow framework

Human Pose Estimation with TensorFlow Here you can find the implementation of the Human Body Pose Estimation algorithm, presented in the DeeperCut and

Eldar Insafutdinov 1.1k Dec 29, 2022
PyTorch implementation of D2C: Diffuison-Decoding Models for Few-shot Conditional Generation.

D2C: Diffuison-Decoding Models for Few-shot Conditional Generation Project | Paper PyTorch implementation of D2C: Diffuison-Decoding Models for Few-sh

Jiaming Song 90 Dec 27, 2022
Apply AnimeGAN-v2 across frames of a video clip

title emoji colorFrom colorTo sdk app_file pinned AnimeGAN-v2 For Videos 🔥 blue red gradio app.py false AnimeGAN-v2 For Videos Apply AnimeGAN-v2 acro

Nathan Raw 36 Oct 18, 2022
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

Auto-Seg-Loss By Hao Li, Chenxin Tao, Xizhou Zhu, Xiaogang Wang, Gao Huang, Jifeng Dai This is the official implementation of the ICLR 2021 paper Auto

61 Dec 21, 2022
Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Gabriel Huang 70 Jan 07, 2023
[ICCV 2021] HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration

HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration Introduction The repository contains the source code and pre-tr

Intelligent Sensing, Perception and Computing Group 55 Dec 14, 2022
Implementation of Convolutional enhanced image Transformer

CeiT : Convolutional enhanced image Transformer This is an unofficial PyTorch implementation of Incorporating Convolution Designs into Visual Transfor

Rishikesh (ऋषिकेश) 82 Dec 13, 2022
Automatic differentiation with weighted finite-state transducers.

GTN: Automatic Differentiation with WFSTs Quickstart | Installation | Documentation What is GTN? GTN is a framework for automatic differentiation with

100 Dec 29, 2022
Unsupervised Image Generation with Infinite Generative Adversarial Networks

Unsupervised Image Generation with Infinite Generative Adversarial Networks Here is the implementation of MICGANs using DCGAN architecture on MNIST da

16 Dec 24, 2021
本步态识别系统主要基于GaitSet模型进行实现

本步态识别系统主要基于GaitSet模型进行实现。在尝试部署本系统之前,建立理解GaitSet模型的网络结构、训练和推理方法。 系统的实现效果如视频所示: 演示视频 由于模型较大,部分模型文件存储在百度云盘。 链接提取码:33mb 具体部署过程 1.下载代码 2.安装requirements.txt

16 Oct 22, 2022
Finite difference solution of 2D Poisson equation. Can handle Dirichlet, Neumann and mixed boundary conditions.

Poisson-solver-2D Finite difference solution of 2D Poisson equation Current version can handle Dirichlet, Neumann, and mixed (combination of Dirichlet

Mohammad Asif Zaman 34 Dec 23, 2022