SeMask: Semantically Masked Transformers for Semantic Segmentation.

Overview

SeMask: Semantically Masked Transformers

Framework: PyTorch

Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation.

semask

Contents

  1. Results
  2. Setup Instructions
  3. Citing SeMask

1. Results

Note: † denotes the backbones were pretrained on ImageNet-22k and 384x384 resolution images.

ADE20K

Method Backbone Crop Size mIoU mIoU (ms+flip) #params config Checkpoint
SeMask-T FPN SeMask Swin-T 512x512 42.11 43.16 35M config TBD
SeMask-S FPN SeMask Swin-S 512x512 45.92 47.63 56M config TBD
SeMask-B FPN SeMask Swin-B 512x512 49.35 50.98 96M config TBD
SeMask-L FPN SeMask Swin-L 640x640 51.89 53.52 211M config TBD
SeMask-L MaskFormer SeMask Swin-L 640x640 54.75 56.15 219M config TBD
SeMask-L Mask2Former SeMask Swin-L 640x640 56.41 57.52 222M config TBD
SeMask-L Mask2Former FAPN SeMask Swin-L 640x640 56.68 58.00 227M config TBD
SeMask-L Mask2Former MSFAPN SeMask Swin-L 640x640 56.54 58.22 224M config TBD

Cityscapes

Method Backbone Crop Size mIoU mIoU (ms+flip) #params config Checkpoint
SeMask-T FPN SeMask Swin-T 768x768 74.92 76.56 34M config TBD
SeMask-S FPN SeMask Swin-S 768x768 77.13 79.14 56M config TBD
SeMask-B FPN SeMask Swin-B 768x768 77.70 79.73 96M config TBD
SeMask-L FPN SeMask Swin-L 768x768 78.53 80.39 211M config TBD
SeMask-L Mask2Former SeMask Swin-L 512x1024 83.97 84.98 222M config TBD

COCO-Stuff 10k

Method Backbone Crop Size mIoU mIoU (ms+flip) #params config Checkpoint
SeMask-T FPN SeMask Swin-T 512x512 37.53 38.88 35M config TBD
SeMask-S FPN SeMask Swin-S 512x512 40.72 42.27 56M config TBD
SeMask-B FPN SeMask Swin-B 512x512 44.63 46.30 96M config TBD
SeMask-L FPN SeMask Swin-L 640x640 47.47 48.54 211M config TBD

demo

2. Setup Instructions

We provide the codebase with SeMask incorporated into various models. Please check the setup instructions inside the corresponding folders:

3. Citing SeMask

@article{jain2022semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv preprint arXiv:...},
  year={2022}
}

Acknowledgements

Code is based heavily on the following repositories: Swin-Transformer-Semantic-Segmentation, Mask2Former, MaskFormer and FaPN-full.

Owner
Picsart AI Research (PAIR)
Picsart AI Research (PAIR)
A Python Library for Graph Outlier Detection (Anomaly Detection)

PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detect

PyGOD Team 757 Jan 04, 2023
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

AugMix Introduction We propose AugMix, a data processing technique that mixes augmented images and enforces consistent embeddings of the augmented ima

Google Research 876 Dec 17, 2022
Simulation-based performance analysis of server-less Blockchain-enabled Federated Learning

Blockchain-enabled Server-less Federated Learning Repository containing the files used to reproduce the results of the publication "Blockchain-enabled

Francesc Wilhelmi 9 Sep 27, 2022
Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

Unseen Object Clustering: Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation Introduction In this work, we propose a new method

NVIDIA Research Projects 132 Dec 13, 2022
Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection

Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection (NimPme) The official implementation of Novel Instances Mining with

12 Sep 08, 2022
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

TorchMultimodal (Alpha Release) Introduction TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Meta Research 663 Jan 06, 2023
用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本和PARL(paddle)版本

用强化学习玩合成大西瓜 代码地址:https://github.com/Sharpiless/play-daxigua-using-Reinforcement-Learning 用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本、PARL(paddle)版本和pytorch版本

72 Dec 17, 2022
Spatial color quantization in Rust

rscolorq Rust port of Derrick Coetzee's scolorq, based on the 1998 paper "On spatial quantization of color images" by Jan Puzicha, Markus Held, Jens K

Collyn O'Kane 37 Dec 22, 2022
Intro-to-dl - Resources for "Introduction to Deep Learning" course.

Introduction to Deep Learning course resources https://www.coursera.org/learn/intro-to-deep-learning Running on Google Colab (tested for all weeks) Go

Advanced Machine Learning specialisation by HSE 761 Dec 24, 2022
Optimizers-visualized - Visualization of different optimizers on local minimas and saddle points.

Optimizers Visualized Visualization of how different optimizers handle mathematical functions for optimization. Contents Installation Usage Functions

Gautam J 1 Jan 01, 2022
3.8% and 18.3% on CIFAR-10 and CIFAR-100

Wide Residual Networks This code was used for experiments with Wide Residual Networks (BMVC 2016) http://arxiv.org/abs/1605.07146 by Sergey Zagoruyko

Sergey Zagoruyko 1.2k Dec 29, 2022
[NeurIPS 2021] A weak-shot object detection approach by transferring semantic similarity and mask prior.

TransMaS This repository is the official pytorch implementation of the following paper: NIPS2021 Mixed Supervised Object Detection by TransferringMask

BCMI 49 Jul 27, 2022
A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Convolutional Neural Network (CNN). This repository contains a source code of a deep learning network built with TensorFlow and Keras to classify gend

Pawel Dziemiach 1 Dec 19, 2021
Real-Time Social Distance Monitoring tool using Computer Vision

Social Distance Detector A Real-Time Social Distance Monitoring Tool Table of Contents Motivation YOLO Theory Detection Output Tech Stack Functionalit

Pranav B 13 Oct 14, 2022
TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform

TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform

2.6k Jan 04, 2023
Unofficial implementation of Google "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization" in PyTorch

CutPaste CutPaste: image from paper Unofficial implementation of Google's "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization"

Lilit Yolyan 59 Nov 27, 2022
HAT: Hierarchical Aggregation Transformers for Person Re-identification

HAT: Hierarchical Aggregation Transformers for Person Re-identification

11 Sep 05, 2022
RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021). RTS3D is efficiency and accuracy s

71 Nov 29, 2022
Vehicles Counting using YOLOv4 + DeepSORT + Flask + Ngrok

A project for counting vehicles using YOLOv4 + DeepSORT + Flask + Ngrok

Duong Tran Thanh 37 Dec 16, 2022
Source code for Zalo AI 2021 submission

zalo_ltr_2021 Source code for Zalo AI 2021 submission Solution: Pipeline We use the pipepline in the picture below: Our pipeline is combination of BM2

128 Dec 27, 2022