The official implementation of "Rethink Dilated Convolution for Real-time Semantic Segmentation"

RegSeg

Paper: arxiv

[Figure: params]

D block

[Figure: DBlock]

Decoder

[Figure: Decoder]

Setup

Install the dependencies in requirements.txt by using pip and virtualenv.

Download Cityscapes

Go to https://www.cityscapes-dataset.com, create an account, and download gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip. You can delete the test images to save space if you don't plan to submit to the competition. Place the extracted files in a directory named cityscapes_dataset. Make sure the required Python packages are installed, then run

CITYSCAPES_DATASET=cityscapes_dataset csCreateTrainIdLabelImgs
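
If the command above cannot find the annotations, the extracted layout is a likely cause. The following is a minimal sketch for checking it, assuming both archives were unpacked so that cityscapes_dataset contains gtFine and leftImg8bit, each with train and val subfolders. The helper name missing_cityscapes_dirs is hypothetical and is not part of this repository.

```python
from pathlib import Path

# Assumed layout after extracting both archives into cityscapes_dataset/.
# The test split is left out because the README says it may be deleted.
REQUIRED_DIRS = [
    "gtFine/train",
    "gtFine/val",
    "leftImg8bit/train",
    "leftImg8bit/val",
]


def missing_cityscapes_dirs(root):
    """Return the required sub-directories that are absent under root."""
    root = Path(root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_cityscapes_dirs("cityscapes_dataset")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

An empty result suggests the directory is ready for csCreateTrainIdLabelImgs.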

There are 19 classes.
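
For reference, these are the standard Cityscapes train IDs 0 to 18, with the remaining labels conventionally treated as ignore (255). The sketch below lists the class names and shows one common way to compute mean IoU over them from a confusion matrix. The function names are illustrative, and this is not taken from train.py; skipping classes that never appear is an assumption made here for the toy example.

```python
CITYSCAPES_CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
]
IGNORE_INDEX = 255  # conventional "ignore" value in train-ID label images


def confusion_matrix(targets, preds, num_classes=len(CITYSCAPES_CLASSES)):
    """Count (target, prediction) pairs, skipping ignored pixels."""
    mat = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(targets, preds):
        if t == IGNORE_INDEX:
            continue
        mat[t][p] += 1
    return mat


def mean_iou(mat):
    """Average IoU over classes that occur in targets or predictions."""
    n = len(mat)
    ious = []
    for c in range(n):
        tp = mat[c][c]
        fn = sum(mat[c]) - tp                          # missed pixels of class c
        fp = sum(mat[r][c] for r in range(n)) - tp     # wrongly predicted as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened label maps (real inputs would be per-pixel).
targets = [0, 0, 1, 1, 255]
preds = [0, 1, 1, 1, 0]
print(round(mean_iou(confusion_matrix(targets, preds)), 4))  # 0.5833
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.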

Results from paper

To see the ablation study results from the paper, go here.

Usage

To visualize your model, go to show.py. To train, validate, benchmark, and save the results of your model, go to train.py.

Results on Cityscapes server

RegSeg (exp48_decoder26, 30 FPS): 78.3

Larger RegSeg (exp53_decoder29, 20 FPS): 79.5

Citation

If you find our work helpful, please consider citing our paper.

@article{gao2021rethink,
  title={Rethink Dilated Convolution for Real-time Semantic Segmentation},
  author={Gao, Roland},
  journal={arXiv preprint arXiv:2111.09957},
  year={2021}
}
Comments
  • question about STDC2-Seg75

    Hi, I noticed that you benchmark the computation of STDC2-Seg75, which is not reported in the CVPR 2021 paper. Did you test the speed of STDC-Seg on your own platform? What were the results?

    opened by ydhongHIT 2
  • Can not show.py

    I tried to run show.py, but it fails:

    $ python3 show.py
    name= cityscapes
    train size: 2975
    val size: 500
    Traceback (most recent call last):
      File "show.py", line 358, in <module>
        show_cityscapes_model()
      File "show.py", line 337, in show_cityscapes_model
        show(model,val_loader,device,show_cityscapes_mask,num_images=num_images,skip=skip,images_per_line=images_per_line)
      File "show.py", line 134, in show
        outputs = model(images)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/sounansu/RegSeg/model.py", line 76, in forward
        x=self.stem(x)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/sounansu/RegSeg/blocks.py", line 22, in forward
        x = self.conv(x)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
    
    opened by sounansu 2
  • The pretrained model link

    Hi, thank you for sharing the code. Can you provide a download link for the pretrained models (exp48_decoder26 and exp53_decoder29) trained on the Cityscapes dataset? Thank you very much!

    opened by gaowq2017 1
  • About train bug

    When using seg_transforms.py through your script 'camvid_efficientnet_b1_hyperseg-s', I always get "TypeError: resize() got an unexpected keyword argument 'interpolation'" on line 174. Does this bug only appear with this script, and should I modify the code when using it?

    opened by 870572761 0
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
  • About train code

    When training, how are mIoU and accuracy calculated: on the training set or the validation set? I think they are calculated on the validation set, based on https://github.com/RolandGao/RegSeg/blob/main/train.py#L238. I trained the base RegSeg model with the config cityscapes_trainval_1000epochs.yam on Cityscapes and got unbelievable results.

    opened by Asthestarsfalll 6
  • confusion on field of view  and model inference time

    Hi RolandGao, nice work! I see you've done a lot of experiments on the backbone settings, but I still have some questions after reading your paper.

    • First, you calculate that a field of view of 4095 is needed to see the bottom-right pixel when training on Cityscapes (1024x2048), and you verified that the backbone should be exp48 [ (1,1) + (1,2) + 4 * (1, 4) + 7 * (1, 14) ] with a field of view of 3807. But I see the same backbone used when training on CamVid (720x960). Why not use a shallower backbone? I am training on my own dataset with an image resolution of 512 x 512; do I need to modify the backbone architecture? Can you give some advice?
    • Second, I tested the inference time of RegSeg. I noticed that its speed is not better than other real-time architectures, due to the split and dilated convolutions, even though the model has low GFLOPs. In applications, what we care about is speed, so is there any strategy to improve it?
    opened by LinaShanghaitech 5
  • Why not pretrain on ImageNet?

    Hi, thanks for your excellent work! I notice that RegSeg achieves high accuracy on Cityscapes without pretraining. I also did a lot of ablation studies, and I think DDRNet would drop around 3% mIoU without ImageNet pretraining. How about training your encoder on ImageNet to see what happens? I really look forward to your results. Thanks!

    opened by RobinhoodKi 1
Owner
Roland
University of Toronto CS 2023