OrienMask: Real-time Instance Segmentation with Discriminative Orientation Maps

Last update: Dec 13, 2022

Related tags

Deep Learning OrienMask

Overview

OrienMask

This repository implements the framework OrienMask for real-time instance segmentation.

It achieves 34.8 mask AP on COCO test-dev at the speed of 42.7 FPS evaluated with a single RTX 2080Ti. (log)

Paper: Real-time Instance Segmentation with Discriminative Orientation Maps

Installation

Please see INSTALL.md to prepare the environment and dataset.

Usage

Place the pre-trained backbone (link) and trained model (link) as follows for convenience (otherwise update the corresponding path in configurations):

├── checkpoints
│   ├── pretrained
│   │   ├──pretrained_darknet53.pth
│   ├── OrienMaskAnchor4FPNPlus
│   │   ├──orienmask_yolo.pth

train

Three items should be noticed when deploying different number of GPUs: n_gpu, batch_size, accumulate. Keep in mind that the approximate batch size equals to n_gpu * batch_size * accumulate.

# multi-gpu train (n_gpu=2, batch_size=8, accumulate=1)
# if necessary, set MASTER_PORT to avoid port conflict
# if permission error, run `chmod +x dist_train.sh`
CUDA_VISIBLE_DEVICES=0,1 ./dist_train.sh \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus

# single-gpu train (n_gpu=1, batch_size=8, accumulate=2)
CUDA_VISIBLE_DEVICES=0 ./dist_train.sh \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus
# or
CUDA_VISIBLE_DEVICES=0 python train.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus

test

Run the following command to obtain AP and AR metrics on val2017 split:

CUDA_VISIBLE_DEVICES=0 python test.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus_test \
    -w checkpoints/OrienMaskAnchor4FPNPlus/orienmask_yolo.pth

infer

Please run python infer.py -h for more usages.

# infer on an image and save the visualized result
CUDA_VISIBLE_DEVICES=0 python infer.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus_infer \
    -w checkpoints/OrienMaskAnchor4FPNPlus/orienmask_yolo.pth \
    -i assets/000000163126.jpg -v -o outputs

# infer on a list of images and save the visualized results
CUDA_VISIBLE_DEVICES=0 python infer.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus_infer \
    -w checkpoints/OrienMaskAnchor4FPNPlus/orienmask_yolo.pth \
    -d coco/test2017 -l assets/test_dev_selected.txt -v -o outputs

logs

We provide two types of logs for monitoring the training process. The first is updated on the terminal which is also stored in a train.log file in the checkpoint directory. The other is the tensorboard whose statistics are kept in the checkpoint directory.

Citation

@article{du2021realtime,
  title={Real-time Instance Segmentation with Discriminative Orientation Maps}, 
  author={Du, Wentao and Xiang, Zhiyu and Chen, Shuya and Qiao, Chengyu and Chen, Yiman and Bai, Tingming},
  journal={arXiv preprint arXiv:2106.12204},
  year={2021}
}

OrienMask: Real-time Instance Segmentation with Discriminative Orientation Maps

Related tags

Overview

OrienMask

Installation

Usage

train

test

infer

logs

Citation

Owner

PIXIE: Collaborative Regression of Expressive Bodies

Kaggle-titanic - A tutorial for Kaggle's Titanic: Machine Learning from Disaster competition. Demonstrates basic data munging, analysis, and visualization techniques. Shows examples of supervised machine learning techniques.

HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation Official PyTorch Implementation

Code for "Localization with Sampling-Argmax", NeurIPS 2021

Metric learning algorithms in Python

PyTorch code of paper "LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering"

An off-line judger supporting distributed problem repositories

Companion code for the paper "Meta-Learning the Search Distribution of Black-Box Random Search Based Adversarial Attacks" by Yatsura et al.

Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

《DeepViT: Towards Deeper Vision Transformer》(2021)

Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks.

Food recognition model using convolutional neural network & computer vision

A lossless neural compression framework built on top of JAX.

Use AI to generate a optimized stock portfolio

C3DPO - Canonical 3D Pose Networks for Non-rigid Structure From Motion.

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

Interactive Image Generation via Generative Adversarial Networks

Digan - Official PyTorch implementation of Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks