Implementation of "A MLP-like Architecture for Dense Prediction"

Related tags

Deep LearningCycleMLP
Overview

A MLP-like Architecture for Dense Prediction (arXiv)

License: MIT Python 3.8

    

Updates

  • (22/07/2021) Initial release.

Model Zoo

We provide CycleMLP models pretrained on ImageNet 2012.

Model Parameters FLOPs Top 1 Acc. Download
CycleMLP-B1 15M 2.1G 78.9% model
CycleMLP-B2 27M 3.9G 81.6% model
CycleMLP-B3 38M 6.9G 82.4% model
CycleMLP-B4 52M 10.1G 83.0% model
CycleMLP-B5 76M 12.3G 83.2% model

Usage

Install

  • PyTorch 1.7.0+ and torchvision 0.8.1+
  • timm:
pip install 'git+https://github.com/rwightman/[email protected]'

or

git clone https://github.com/rwightman/pytorch-image-models
cd pytorch-image-models
git checkout c2ba229d995c33aaaf20e00a5686b4dc857044be
pip install -e .
  • fvcore (optional, for FLOPs calculation)
  • mmcv, mmdetection, mmsegmentation (optional)

Data preparation

Download and extract ImageNet train and val images from http://image-net.org/. The directory structure is:

│path/to/imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......

Evaluation

To evaluate a pre-trained CycleMLP-B5 on ImageNet val with a single GPU run:

python main.py --eval --model CycleMLP_B5 --resume path/to/CycleMLP_B5.pth --data-path /path/to/imagenet

Training

To train CycleMLP-B5 on ImageNet on a single node with 8 gpus for 300 epochs run:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model CycleMLP_B5 --batch-size 128 --data-path /path/to/imagenet --output_dir /path/to/save

Acknowledgement

This code is based on DeiT and pytorch-image-models. Thanks for their wonderful works

Citing

@article{chen2021cyclemlp,
  title={CycleMLP: A MLP-like Architecture for Dense Prediction},
  author={Chen, Shoufa and Xie, Enze and Ge, Chongjian and Liang, Ding and Luo, Ping},
  journal={arXiv preprint arXiv:2107.10224},
  year={2021}
}

License

CycleMLP is released under MIT License.

Comments
  • detection result

    detection result

    Applying PVT detection framework, I tried a CycleMLP-B1 based detector with RetinaNet 1x. I got AP=27.1, fairly inferior to the reported 38.6. Could you give some advices to reproduce the reported result?

    The specific configure is as follows

    base = [ 'base/models/retinanet_r50_fpn.py', 'base/datasets/coco_detection.py', 'base/schedules/schedule_1x.py', 'base/default_runtime.py' ] #optimizer model = dict( pretrained='./pretrained/CycleMLP_B1.pth', backbone=dict( type='CycleMLP_B1_feat', style='pytorch'), neck=dict( type='FPN', in_channels=[64, 128, 320, 512], out_channels=256, start_level=1, add_extra_convs='on_input', num_outs=5)) #optimizer optimizer = dict(delete=True, type='AdamW', lr=0.0001, weight_decay=0.0001) optimizer_config = dict(grad_clip=None)

    find_unused_parameters = True

    opened by mountain111 6
  • Compiling CycleMLP

    Compiling CycleMLP

    Thank you for this great repo and interesting paper.

    I tried compiling CycleMLP to onnx and not surpassingly the process failed since CycleMLP include dynamic offset creation in https://github.com/ShoufaChen/CycleMLP/blob/main/cycle_mlp.py#L132 and as such cannot be converted to a frozen graph. Were you able to convert CycleMLP to onnx or any other frozen graph framework?

    Thanks in advance.

    opened by shairoz-deci 6
  • Questions about offset calculation

    Questions about offset calculation

    Hi, thanks for your wonderful work.

    I'm currently studying your work, and come up with some question about the offset calculations.

    I understood the offset calculation mentioned on the paper, but can't understand about how generated offset is being used in the code.

    For ex) if $S_H \times S_W : 3 \times 1$; I understood how the offset is applied in this figure 스크린샷 2022-06-13 오후 9 18 20

    by calculate like this: 스크린샷 2022-06-13 오후 9 19 57

    However, when I run the offset generating code, I can't figure out how this offset is being used in deform_conv2d 스크린샷 2022-06-13 오후 9 21 57

    Can you provide more detailed information about this??

    And also, the paper contains how $S_H \times S_W: 3 \times 3$ works, but in the code, it seems like either one ofkernel_size[0] or kernel_size[1] has to be 1. So, if I want to use $S_H \times S_W : 3 \times 3$, do I have to make $3 \times 1$ and $1 \times 3$ offsets and add those together?

    Thank you again for your work. I really learned a lot.

    opened by tae-mo 5
  • Example of CycleMLP Configuration for Dense Prediction

    Example of CycleMLP Configuration for Dense Prediction

    Hello.

    First of all, thank you for curating this interesting work. I was wondering, are there any working examples of how I can use CycleMLP for dense prediction while maintaining the original input size (e.g., predict a 0 or 1 value for each pixel in an input image)? In addition, I am interested in only a single ("annotated") output image, although I noticed the model definitions given in this repository output multiple downsampled versions of the original input image. Any thoughts on this?

    Thank you in advance for your time.

    opened by amorehead 2
  • Swin-B vs CycleMLP-B on image classification

    Swin-B vs CycleMLP-B on image classification

    For classificaion on ImageNet-1k, the acuracy of Swin-B is 83.5, which is 0.1 higher than the proposed CycleMLP-B. But, in this paper, the authors reprot that the accuracy of Swin-B is 83.3, which is 0.1 lower than the proposed CycleMLP-B. Why are these accuracies different?

    opened by hkzhang91 1
  • question about the offset

    question about the offset

    Thanks for your work!

    The implementation of this code inspired me. But the calculation of offset here is confusing. Although this issue (https://github.com/ShoufaChen/CycleMLP/issues/10) has asked similar questions, I haven't found a reasonable explanation.

    https://github.com/ShoufaChen/CycleMLP/blob/2f76a1f6e3cc6672143fdac46e3db5f9a7341253/cycle_mlp.py#L127-L136

    kernel_size = (1, 3)
    start_idx = (kernel_size[0] * kernel_size[1]) // 2
    for i in range(num_channels):
        offset[0, 2 * i + 0, 0, 0] = 0
        # relative offset
        offset[0, 2 * i + 1, 0, 0] = (i + start_idx) % kernel_size[1] - (kernel_size[1] // 2)
    offset.reshape(num_channels, 2)
    
    tensor([[ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.],
            [ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.]])
    

    the results are different with the figure in paper:

    image

    Some codes for verification:

    import torch
    from torchvision.ops import deform_conv2d
    
    num_channels = 6
    
    data = torch.arange(1, 6).reshape(1, 1, 1, 5).expand(-1, num_channels, -1, -1)
    data
    """
    tensor([[[[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]]]])
    """
    
    weight = torch.eye(num_channels).reshape(num_channels, num_channels, 1, 1)
    weight.reshape(num_channels, num_channels)
    """
    tensor([[1., 0., 0., 0., 0., 0.],
            [0., 1., 0., 0., 0., 0.],
            [0., 0., 1., 0., 0., 0.],
            [0., 0., 0., 1., 0., 0.],
            [0., 0., 0., 0., 1., 0.],
            [0., 0., 0., 0., 0., 1.]])
    """
    
    offset = torch.empty(1, 2 * num_channels * 1 * 1, 1, 1)
    kernel_size = (1, 3)
    start_idx = (kernel_size[0] * kernel_size[1]) // 2
    for i in range(num_channels):
        offset[0, 2 * i + 0, 0, 0] = 0
        # relative offset
        offset[0, 2 * i + 1, 0, 0] = (
            (i + start_idx) % kernel_size[1] - (kernel_size[1] // 2)
        )
    offset.reshape(num_channels, 2)
    """
    tensor([[ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.],
            [ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.]])
    """
    
    deform_conv2d(
        data.float(), 
        offset=offset.expand(-1, -1, -1, 5).float(), 
        weight=weight.float(), 
        bias=None,
    )
    """
    tensor([[[[1., 2., 3., 4., 5.]],
             [[2., 3., 4., 5., 0.]],
             [[0., 1., 2., 3., 4.]],
             [[1., 2., 3., 4., 5.]],
             [[2., 3., 4., 5., 0.]],
             [[0., 1., 2., 3., 4.]]]])
    """
    
    opened by lartpang 1
  • question about the offset

    question about the offset

    Hi, thank you very much for your excellent work. In Fig.4 of your paper, you show the pseudo-kernel when kernel size is 1x3. But I when I find that function "gen_offset" does not generate the same offset as Fig.4. The offset it generates is "0,1,0,-1,0,0,0,1..." instead of "0,1,0,-1,0,1,0,-1', which is shown in Fig.4. So could you please tell me the reason? image image

    opened by linjing7 1
  • About

    About "crop_pct"

    Hi, thanks for your great work and code. I wonder the parameter crop_pct actually works in which part of code. When I go throught the timm, I can't find out how this crop_pct is loaded.

    opened by ggjy 1
  • How to deploy CycleMLP-T for training?

    How to deploy CycleMLP-T for training?

    Thank you very much for such a wonderful work!

    After learning the cycle_mlp source code in the repository, I am very confused to deploy CycleMLP Block based on Swin Transformer. Is it convenient for you to release swin-based CycleMLP? Looking forward to your reply, Thanks!

    opened by Pak287 0
Owner
Shoufa Chen
Shoufa Chen
The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop.

AICITY2021_Track2_DMT The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop. Introduction

Hao Luo 91 Dec 21, 2022
Fast sparse deep learning on CPUs

SPARSEDNN **If you want to use this repo, please send me an email: [email pro

Ziheng Wang 44 Nov 30, 2022
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Pretrained Language Model This repository provides the latest pretrained language models and its related optimization techniques developed by Huawei N

HUAWEI Noah's Ark Lab 2.6k Jan 01, 2023
Official PyTorch implementation of RIO

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection Figure 1: Our proposed Resampling at image-level and obect-

NVIDIA Research Projects 17 May 20, 2022
Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

32 Sep 21, 2022
It is a simple library to speed up CLIP inference up to 3x (K80 GPU)

CLIP-ONNX It is a simple library to speed up CLIP inference up to 3x (K80 GPU) Usage Install clip-onnx module and requirements first. Use this trick !

Gerasimov Maxim 93 Dec 20, 2022
Pytorch implementation of our method for regularizing nerual radiance fields for few-shot neural volume rendering.

InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering Pytorch implementation of our method for regularizing nerual radiance fields f

106 Jan 06, 2023
Official PyTorch implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation

U-GAT-IT — Official PyTorch Implementation : Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Imag

Hyeonwoo Kang 2.4k Jan 04, 2023
GRF: Learning a General Radiance Field for 3D Representation and Rendering

GRF: Learning a General Radiance Field for 3D Representation and Rendering [Paper] [Video] GRF: Learning a General Radiance Field for 3D Representatio

Alex Trevithick 243 Dec 29, 2022
Official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspective with Transformer"

[AAAI2022] UCTransNet This repo is the official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspectiv

Haonan Wang 199 Jan 03, 2023
HAT: Hierarchical Aggregation Transformers for Person Re-identification

HAT: Hierarchical Aggregation Transformers for Person Re-identification

11 Sep 05, 2022
CUAD

Contract Understanding Atticus Dataset This repository contains code for the Contract Understanding Atticus Dataset (CUAD), a dataset for legal contra

The Atticus Project 273 Dec 17, 2022
An open source bike computer based on Raspberry Pi Zero (W, WH) with GPS and ANT+. Including offline map and navigation.

Pi Zero Bikecomputer An open-source bike computer based on Raspberry Pi Zero (W, WH) with GPS and ANT+ https://github.com/hishizuka/pizero_bikecompute

hishizuka 264 Jan 02, 2023
Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training Code for our paper "Predicting lncRNA–protein interactio

zhanglabNKU 1 Nov 29, 2022
(NeurIPS 2021) Realistic Evaluation of Transductive Few-Shot Learning

Realistic evaluation of transductive few-shot learning Introduction This repo contains the code for our NeurIPS 2021 submitted paper "Realistic evalua

Olivier Veilleux 14 Dec 13, 2022
The code of NeurIPS 2021 paper "Scalable Rule-Based Representation Learning for Interpretable Classification".

Rule-based Representation Learner This is a PyTorch implementation of Rule-based Representation Learner (RRL) as described in NeurIPS 2021 paper: Scal

Zhuo Wang 53 Dec 17, 2022
ML-Ensemble – high performance ensemble learning

A Python library for high performance ensemble learning ML-Ensemble combines a Scikit-learn high-level API with a low-level computational graph framew

Sebastian Flennerhag 764 Dec 31, 2022
FSL-Mate: A collection of resources for few-shot learning (FSL).

FSL-Mate is a collection of resources for few-shot learning (FSL). In particular, FSL-Mate currently contains FewShotPapers: a paper list which tracks

Yaqing Wang 1.5k Jan 08, 2023
Simultaneous NMT/MMT framework in PyTorch

This repository includes the codes, the experiment configurations and the scripts to prepare/download data for the Simultaneous Machine Translation wi

<a href=[email protected]"> 37 Sep 29, 2022
Codebase for INVASE: Instance-wise Variable Selection - 2019 ICLR

Codebase for "INVASE: Instance-wise Variable Selection" Authors: Jinsung Yoon, James Jordon, Mihaela van der Schaar Paper: Jinsung Yoon, James Jordon,

Jinsung Yoon 50 Nov 11, 2022