Video Instance Segmentation with a Propose-Reduce Paradigm (ICCV 2021)

Overview

Propose-Reduce VIS

This repo contains the official implementation for the paper:

Video Instance Segmentation with a Propose-Reduce Paradigm

Huaijia Lin*, Ruizheng Wu*, Shu Liu, Jiangbo Lu, Jiaya Jia

ICCV 2021 | Paper

TeaserImage

Installation

Please refer to INSTALL.md.

Demo

You can compute the VIS results for your own videos.

  1. Download pretrained weight.
  2. Put example videos in 'demo/inputs'. We support two types of inputs, frames directories or .mp4 files (see example for details).
  3. Run the following script and obtain the results in demo/outputs.
sh demo.sh

Data Preparation

(1) Download the videos and jsons of val set from YouTube-VIS 2019

(2) Download the videos and jsons of val set from YouTube-VIS 2021

(3) Symlink the corresponding dataset and json files to the data folder

mkdir data
data
├── valset_ytv19 --> /path/to/ytv2019/vos/valid/JPEGImages/ 
├── valid_ytv19.json --> /path/to/ytv2019/vis/valid.json
├── valset_ytv21 --> /path/to/ytv2021/vis/valid/JPEGImages/ 
├── valid_ytv21.json --> /path/to/ytv2021/vis/valid/instances.json

Results

We provide the results of several pretrained models and corresponding scripts on different backbones. The results have slight differences from the paper because we make minor modifications to the inference codes.

Download the pretrained models and put them in pretrained folder.

mkdir pretrained
Dataset Method Backbone CA Reduce AP [email protected] download
YouTube-VIS 2019 Seq Mask R-CNN ResNet-50 40.8 49.9 model | scripts
YouTube-VIS 2019 Seq Mask R-CNN ResNet-50 42.5 56.8 scripts
YouTube-VIS 2019 Seq Mask R-CNN ResNet-101 43.8 52.7 model | scripts
YouTube-VIS 2019 Seq Mask R-CNN ResNet-101 45.2 59.0 scripts
YouTube-VIS 2019 Seq Mask R-CNN ResNeXt-101 47.6 56.7 model | scripts
YouTube-VIS 2019 Seq Mask R-CNN ResNeXt-101 48.8 62.2 scripts
YouTube-VIS 2021 Seq Mask R-CNN ResNet-50 39.6 47.5 model | scripts
YouTube-VIS 2021 Seq Mask R-CNN ResNet-50 41.7 54.9 scripts
YouTube-VIS 2021 Seq Mask R-CNN ResNeXt-101 45.6 52.9 model | scripts
YouTube-VIS 2021 Seq Mask R-CNN ResNeXt-101 47.2 57.6 scripts

Evaluation

YouTube-VIS 2019: A json file will be saved in `../Results_ytv19' folder. Please zip and upload to the codalab server.

YouTube-VIS 2021: A json file will be saved in `../Results_ytv21' folder. Please zip and upload to the codalab server.

TODOs

Citation

If you find this work useful in your research, please cite:

@article{lin2021video,
  title={Video Instance Segmentation with a Propose-Reduce Paradigm},
  author={Lin, Huaijia and Wu, Ruizheng and Liu, Shu and Lu, Jiangbo and Jia, Jiaya},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2021}
}

Contact

If you have any questions regarding the repo, please feel free to contact me ([email protected]) or create an issue.

Acknowledgments

This repo is based on MMDetection, MaskTrackRCNN, STM, MMCV and COCOAPI.

Owner
DV Lab
Deep Vision Lab
DV Lab
A modular active learning framework for Python

Modular Active Learning framework for Python3 Page contents Introduction Active learning from bird's-eye view modAL in action From zero to one in a fe

modAL 1.9k Dec 31, 2022
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark We propose a benchmark to evaluate different quantization algorithms on vari

494 Dec 29, 2022
Implementation of Segnet, FCN, UNet , PSPNet and other models in Keras.

Image Segmentation Keras : Implementation of Segnet, FCN, UNet, PSPNet and other models in Keras. Implementation of various Deep Image Segmentation mo

Divam Gupta 2.6k Jan 05, 2023
[ICCV'2021] Image Inpainting via Conditional Texture and Structure Dual Generation

[ICCV'2021] Image Inpainting via Conditional Texture and Structure Dual Generation

Xiefan Guo 122 Dec 11, 2022
i3DMM: Deep Implicit 3D Morphable Model of Human Heads

i3DMM: Deep Implicit 3D Morphable Model of Human Heads CVPR 2021 (Oral) Arxiv | Poject Page This project is the official implementation our work, i3DM

Tarun Yenamandra 60 Jan 03, 2023
(CVPR 2021) Lifting 2D StyleGAN for 3D-Aware Face Generation

Lifting 2D StyleGAN for 3D-Aware Face Generation Official implementation of paper "Lifting 2D StyleGAN for 3D-Aware Face Generation". Requirements You

Yichun Shi 66 Nov 29, 2022
基于Paddle框架的fcanet复现

fcanet-Paddle 基于Paddle框架的fcanet复现 fcanet 本项目基于paddlepaddle框架复现fcanet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: frazerlin-fcanet 数据准备 本项目已挂

QuanHao Guo 7 Mar 07, 2022
GNN-based Recommendation Benchma

GRecX A Fair Benchmark for GNN-based Recommendation Preliminary Comparison DiffNet-Yelp dataset (featureless) Algo 73 Oct 17, 2022

SeMask: Semantically Masked Transformers for Semantic Segmentation.

SeMask: Semantically Masked Transformers Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi This repo co

Picsart AI Research (PAIR) 186 Dec 30, 2022
Good Classification Measures and How to Find Them

Good Classification Measures and How to Find Them This repository contains supplementary materials for the paper "Good Classification Measures and How

Yandex Research 7 Nov 13, 2022
Thermal Control of Laser Powder Bed Fusion using Deep Reinforcement Learning

This repository is the implementation of the paper "Thermal Control of Laser Powder Bed Fusion Using Deep Reinforcement Learning", linked here. The project makes use of the Deep Reinforcement Library

BaratiLab 11 Dec 27, 2022
Writeups for the challenges from DownUnderCTF 2021

cloud Challenge Author Difficulty Release Round Bad Bucket Blue Alder easy round 1 Not as Bad Bucket Blue Alder easy round 1 Lost n Found Blue Alder m

DownUnderCTF 161 Dec 31, 2022
Face-Recognition-based-Attendance-System - An implementation of Attendance System in python.

Face-Recognition-based-Attendance-System A real time implementation of Attendance System in python. Pre-requisites To understand the implentation of F

Muhammad Zain Ul Haque 1 Dec 31, 2021
Code for the paper "Next Generation Reservoir Computing"

Next Generation Reservoir Computing This is the code for the results and figures in our paper "Next Generation Reservoir Computing". They are written

OSU QuantInfo Lab 105 Dec 20, 2022
A simple Python library for stochastic graphical ecological models

What is Viridicle? Viridicle is a library for simulating stochastic graphical ecological models. It implements the continuous time models described in

Theorem Engine 0 Dec 04, 2021
Github Traffic Insights as Prometheus metrics.

github-traffic Github Traffic collects your repository's traffic data and exposes it as Prometheus metrics. Grafana dashboard that displays the metric

Grafana Labs 34 Oct 27, 2022
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Antoine Caillon 589 Jan 02, 2023
Keras Image Embeddings using Contrastive Loss

Keras-Image-Embeddings-using-Contrastive-Loss Image to Embedding projection in vector space. Implementation in keras and tensorflow for custom data. B

Shravan Anand K 5 Mar 21, 2022
WarpRNNT loss ported in Numba CPU/CUDA for Pytorch

RNNT loss in Pytorch - Numba JIT compiled (warprnnt_numba) Warp RNN Transducer Loss for ASR in Pytorch, ported from HawkAaron/warp-transducer and a re

Somshubra Majumdar 15 Oct 22, 2022
Source Code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching

Description The source code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chin

Zhengxiang Wang 3 Jun 28, 2022