We will release the code of "ConTNet: Why not use convolution and transformer at the same time?" in this repo

Related tags

Deep LearningConTNet
Overview

ConTNet

Introduction

ConTNet (Convlution-Tranformer Network) is proposed mainly in response to the following two issues: (1) ConvNets lack a large receptive field, limiting the performance of ConvNets on downstream tasks. (2) Transformer-based model is not robust enough and requires special training settings or hundreds of millions of images as the pretrain dataset, thereby limiting their adoption. ConTNet combines convolution and transformer alternately, which is very robust and can be optimized like ResNet unlike the recently-proposed transformer-based models (e.g., ViT, DeiT) that are sensitive to hyper-parameters and need many tricks when trained from scratch on a midsize dataset (e.g., ImageNet).

Main Results on ImageNet

name resolution [email protected] #params(M) FLOPs(G) model
Res-18 224x224 71.5 11.7 1.8
ConT-S 224x224 74.9 10.1 1.5
Res-50 224x224 77.1 25.6 4.0
ConT-M 224x224 77.6 19.2 3.1
Res-101 224x224 78.2 44.5 7.6
ConT-B 224x224 77.9 39.6 6.4
DeiT-Ti* 224x224 72.2 5.7 1.3
ConT-Ti* 224x224 74.9 5.8 0.8
Res-18* 224x224 73.2 11.7 1.8
ConT-S* 224x224 76.5 10.1 1.5
Res-50* 224x224 78.6 25.6 4.0
DeiT-S* 224x224 79.8 22.1 4.6
ConT-M* 224x224 80.2 19.2 3.1
Res-101* 224x224 80.0 44.5 7.6
DeiT-B* 224x224 81.8 86.6 17.6
ConT-B* 224x224 81.8 39.6 6.4

Note: * indicates training with strong augmentations.

Main Results on Downstream Tasks

Object detection results on COCO.

method backbone #params(M) FLOPs(G) AP APs APm APl
RetinaNet Res-50
ConTNet-M
32.0
27.0
235.6
217.2
36.5
37.9
20.4
23.0
40.3
40.6
48.1
50.4
FCOS Res-50
ConTNet-M
32.2
27.2
242.9
228.4
38.7
40.8
22.9
25.1
42.5
44.6
50.1
53.0
faster rcnn Res-50
ConTNet-M
41.5
36.6
241.0
225.6
37.4
40.0
21.2
25.4
41.0
43.0
48.1
52.0

Instance segmentation results on Cityscapes based on Mask-RCNN.

backbone APbb APsbb APmbb APlbb APmk APsmk APmmk APlmk
Res-50
ConT-M
38.2
40.5
21.9
25.1
40.9
44.4
49.5
52.7
34.7
38.1
18.3
20.9
37.4
41.0
47.2
50.3

Semantic segmentation results on cityscapes.

model mIOU
PSP-Res50 77.12
PSP-ConTM 78.28

Bib Citing

@article{yan2021contnet,
    title={ConTNet: Why not use convolution and transformer at the same time?},
    author={Haotian Yan and Zhe Li and Weijian Li and Changhu Wang and Ming Wu and Chuang Zhang},
    year={2021},
    journal={arXiv preprint arXiv:2104.13497}
}
Code for the ICCV2021 paper "Personalized Image Semantic Segmentation"

PSS: Personalized Image Semantic Segmentation Paper PSS: Personalized Image Semantic Segmentation Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming

张宇 15 Jul 09, 2022
Problem-943.-ACMP - Problem 943. ACMP

Problem-943.-ACMP В "main.py" расположен вариант моего решения задачи 943 с серв

Konstantin Dyomshin 2 Aug 19, 2022
Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter

DongGeun-Yoon 19 Jun 07, 2022
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

Chao Ma 3k Jan 03, 2023
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
Deploy pytorch classification model using Flask and Streamlit

Deploy pytorch classification model using Flask and Streamlit

Ben Seo 1 Nov 17, 2021
Cross View SLAM

Cross View SLAM This is the associated code and dataset repository for our paper I. D. Miller et al., "Any Way You Look at It: Semantic Crossview Loca

Ian D. Miller 99 Dec 09, 2022
SOLO and SOLOv2 for instance segmentation, ECCV 2020 & NeurIPS 2020.

SOLO: Segmenting Objects by Locations This project hosts the code for implementing the SOLO algorithms for instance segmentation. SOLO: Segmenting Obj

Xinlong Wang 1.5k Dec 31, 2022
Self-Supervised Image Denoising via Iterative Data Refinement

Self-Supervised Image Denoising via Iterative Data Refinement Yi Zhang1, Dasong Li1, Ka Lung Law2, Xiaogang Wang1, Hongwei Qin2, Hongsheng Li1 1CUHK-S

Zhang Yi 72 Jan 01, 2023
Code for project: "Learning to Minimize Remainder in Supervised Learning".

Learning to Minimize Remainder in Supervised Learning Code for project: "Learning to Minimize Remainder in Supervised Learning". Requirements and Envi

Yan Luo 0 Jul 18, 2021
Colar: Effective and Efficient Online Action Detection by Consulting Exemplars, CVPR 2022.

Colar: Effective and Efficient Online Action Detection by Consulting Exemplars This repository is the official implementation of Colar. In this work,

LeYang 246 Dec 13, 2022
TransCD: Scene Change Detection via Transformer-based Architecture

TransCD: Scene Change Detection via Transformer-based Architecture

wangzhixue 29 Dec 11, 2022
Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks

PWLQ Updates 2020/07/16 - We are working on getting permission from our institution to release our source code. We will release it once we are granted

54 Dec 15, 2022
deep_image_prior_extension

Code for "Is Deep Image Prior in Need of a Good Education?" Project page: https://jleuschn.github.io/docs.educated_deep_image_prior/. Supplementary Ma

riccardo barbano 7 Jan 09, 2022
Object Detection Projekt in GKI WS2021/22

tfObjectDetection Object Detection Projekt with tensorflow in GKI WS2021/22 Docker Container: docker run -it --name --gpus all -v path/to/project:p

Tim Eggers 1 Jul 18, 2022
Meandering In Networks of Entities to Reach Verisimilar Answers

MINERVA Meandering In Networks of Entities to Reach Verisimilar Answers Code and models for the paper Go for a Walk and Arrive at the Answer - Reasoni

Shehzaad Dhuliawala 271 Dec 13, 2022
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 08, 2023
Kaggle: Cell Instance Segmentation

Kaggle: Cell Instance Segmentation The goal of this challenge is to detect cells in microscope images. with simple view on how many cels have been ann

Jirka Borovec 9 Aug 12, 2022
This repository contains the source code of Auto-Lambda and baselines from the paper, Auto-Lambda: Disentangling Dynamic Task Relationships.

Auto-Lambda This repository contains the source code of Auto-Lambda and baselines from the paper, Auto-Lambda: Disentangling Dynamic Task Relationship

Shikun Liu 76 Dec 20, 2022