Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

Overview

Adaptive Task-Relational Context (ATRC)

This repository provides source code for the ICCV 2021 paper Exploring Relational Context for Multi-Task Dense Prediction. The code is organized using PyTorch Lightning.

Overview

ATRC is an attention-driven module to refine task-specific dense predictions by capturing cross-task contexts. Through Neural Architecture Search (NAS), ATRC selects contexts for multi-modal distillation based on the source-target tasks' relation. We investigate four context types: global, local, t-label and s-label (as well as the option to sever the cross-task connection). In the figure above, each CP block handles one source-target task connection.

We provide code for searching ATRC configurations and training various multi-modal distillation networks on the NYUD-v2 and PASCAL-Context benchmarks, based on HRNet backbones.

Usage

Requirements

The code is run in a conda environment with Python 3.8.11:

conda install pytorch==1.7.0 torchvision==0.8.1 cudatoolkit=10.1 -c pytorch
conda install pytorch-lightning==1.1.8 -c conda-forge
conda install opencv==4.4.0 -c conda-forge
conda install scikit-image==0.17.2
pip install jsonargparse[signatures]==3.17.0

NOTE: PyTorch Lightning is still going through heavy development, so make sure version 1.1.8 is used with this code to avoid issues.

Download the Data

Before running the code, download and extract the datasets to any directory $DATA_DIR:

wget https://data.vision.ee.ethz.ch/brdavid/atrc/NYUDv2.tar.gz -P $DATA_DIR
wget https://data.vision.ee.ethz.ch/brdavid/atrc/PASCALContext.tar.gz -P $DATA_DIR
tar xfvz $DATA_DIR/NYUDv2.tar.gz -C $DATA_DIR && rm $DATA_DIR/NYUDv2.tar.gz
tar xfvz $DATA_DIR/PASCALContext.tar.gz -C $DATA_DIR && rm $DATA_DIR/PASCALContext.tar.gz

ATRC Search

To start an ATRC search on NYUD-v2 with a HRNetV2-W18-small backbone, use for example:

python ./src/main_search.py --cfg ./config/nyud/hrnet18/atrc_search.yaml --datamodule.data_dir $DATA_DIR --trainer.gpus 2 --trainer.accelerator ddp

The path to the data directory $DATA_DIR needs to be provided. With every validation epoch, the current ATRC configuration is saved as a atrc_genotype.json file in the log directory.

Multi-Modal Distillation Network Training

To train ATRC distillation networks supply the path to the corresponding atrc_genotype.json, e.g., $GENOTYPE_DIR:

python ./src/main.py --cfg ./config/nyud/hrnet18/atrc.yaml --model.atrc_genotype_path $GENOTYPE_DIR/atrc_genotype.json --datamodule.data_dir $DATA_DIR --trainer.gpus 1

Some genotype files can be found under genotypes/.

Baselines can be run by selecting the config file, e.g., multi-task learning baseline:

python ./src/main.py --cfg ./config/nyud/hrnet18/baselinemt.yaml --datamodule.data_dir $DATA_DIR --trainer.gpus 1

The evaluation of boundary detection is disabled, since the MATLAB-based SEISM repository was used for obtaining the optimal dataset F-measure scores. Instead, the boundary predictions are simply saved on the disk in this code.

Citation

If you find this code useful in your research, please consider citing the paper:

@InProceedings{bruggemann2020exploring,
  Title     = {Exploring Relational Context for Multi-Task Dense Prediction},
  Author    = {Bruggemann, David and Kanakis, Menelaos and Obukhov, Anton and Georgoulis, Stamatios and Van Gool, Luc},
  Booktitle = {ICCV},
  Year      = {2021}
}

Credit

The pretrained backbone weights and code are from MMSegmentation. The distilled surface normal and saliency labels for PASCAL-Context are from ASTMT. Local attention CUDA kernels are from this repo.

Contact

For questions about the code or paper, feel free to contact me (send email).

Owner
David Brüggemann
PhD student at Computer Vision Lab, ETH Zurich
David Brüggemann
Style transfer between images was performed using the VGG19 model

Style transfer between images was performed using the VGG19 model. The necessary codes, libraries and all other information of this project are available below

Onur yılmaz 2 May 09, 2022
1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking

Instead, two models for appearance modeling are included, together with the open-source BAGS model and the full set of code for inference. With this code, you can achieve around 79 Oct 08, 2022

A simple Python configuration file operator.

A simple Python configuration file operator This project provides a common way to read configurations using config42. Installation It is possible to i

Scott Lau 2 Nov 08, 2021
LaBERT - A length-controllable and non-autoregressive image captioning model.

Length-Controllable Image Captioning (ECCV2020) This repo provides the implemetation of the paper Length-Controllable Image Captioning. Install conda

bearcatt 53 Nov 13, 2022
PyTorch inference for "Progressive Growing of GANs" with CelebA snapshot

Progressive Growing of GANs inference in PyTorch with CelebA training snapshot Description This is an inference sample written in PyTorch of the origi

320 Nov 21, 2022
VGG16 model-based classification project about brain tumor detection.

Brain-Tumor-Classification-with-MRI VGG16 model-based classification project about brain tumor detection. First, you can check what people are doing o

Atakan Erdoğan 2 Mar 21, 2022
DziriBERT: a Pre-trained Language Model for the Algerian Dialect

DziriBERT DziriBERT is the first Transformer-based Language Model that has been pre-trained specifically for the Algerian Dialect. It handles Algerian

117 Jan 07, 2023
Kaggle: Cell Instance Segmentation

Kaggle: Cell Instance Segmentation The goal of this challenge is to detect cells in microscope images. with simple view on how many cels have been ann

Jirka Borovec 9 Aug 12, 2022
Official repository of the paper "A Variational Approximation for Analyzing the Dynamics of Panel Data". Mixed Effect Neural ODE. UAI 2021.

Official repository of the paper (UAI 2021) "A Variational Approximation for Analyzing the Dynamics of Panel Data", Mixed Effect Neural ODE. Panel dat

Jurijs Nazarovs 7 Nov 26, 2022
学习 python3 以来写的一些垃圾玩具……

和东哥做兄弟 Author: chiupam 版权 未经本人同意,仓库内所有资源文件,禁止任何公众号、自媒体、开发者进行任何形式的转载、发布、搬运。 声明 这不是一个开源项目,只是把 GitHub 当作一个代码的存储空间,本项目不接受任何开源要求。 仅用于学习研究,禁止用于商业用途,不能保证其合法性

Chiupam 67 Mar 26, 2022
Permeability Prediction Via Multi Scale 3D CNN

Permeability-Prediction-Via-Multi-Scale-3D-CNN Data: The raw CT rock cores are obtained from the Imperial Colloge portal. The CT rock cores are sub-sa

Mohamed Elmorsy 2 Jul 06, 2022
Interpretable-contrastive-word-mover-s-embedding

Interpretable-contrastive-word-mover-s-embedding Paper Datasets Here is a Dropbox link to the datasets used in the paper: https://www.dropbox.com/sh/n

0 Nov 02, 2021
This project intends to use SVM supervised learning to determine whether or not an individual is diabetic given certain attributes.

Diabetes Prediction Using SVM I explore a diabetes prediction algorithm using a Diabetes dataset. Using a Support Vector Machine for my prediction alg

Jeff Shen 1 Jan 14, 2022
Novel and high-performance medical image classification pipelines are heavily utilizing ensemble learning strategies

An Analysis on Ensemble Learning optimized Medical Image Classification with Deep Convolutional Neural Networks Novel and high-performance medical ima

14 Dec 18, 2022
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"

Introduction Code and data for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning". We cons

Pan Lu 81 Dec 27, 2022
A task-agnostic vision-language architecture as a step towards General Purpose Vision

Towards General Purpose Vision Systems By Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, and Derek Hoiem Overview Welcome to the official code base f

AI2 79 Dec 23, 2022
ESP32 python application to read data from a Tilt™ Hydrometer for homebrewing

TitlESP32 ESP32 MicroPython application to read and log data from a Tilt™ Hydrometer. Requirements A board with an ESP32 chip USB cable - USB A / micr

IoBeer 5 Dec 01, 2022
HybVIO visual-inertial odometry and SLAM system

HybVIO A visual-inertial odometry system with an optional SLAM module. This is a research-oriented codebase, which has been published for the purposes

Spectacular AI 320 Jan 03, 2023
A platform to display the carbon neutralization information for researchers, decision-makers, and other participants in the community.

Welcome to Carbon Insight Carbon Insight is a platform aiming to display the carbon neutralization roadmap for researchers, decision-makers, and other

Microsoft 14 Oct 24, 2022
This program creates a formatted excel file which highlights the undervalued stock according to Graham's number.

Over-and-Undervalued-Stocks Of Nepse Using Graham's Number Scrap the latest data using different websites and creates a formatted excel file that high

6 May 03, 2022