Segmentation-Aware Convolutional Networks Using Local Attention Masks

Related tags

Deep Learningsegaware
Overview

Segmentation-Aware Convolutional Networks Using Local Attention Masks

[Project Page] [Paper]

Segmentation-aware convolution filters are invariant to backgrounds. We achieve this in three steps: (i) compute segmentation cues for each pixel (i.e., “embeddings”), (ii) create a foreground mask for each patch, and (iii) combine the masks with convolution, so that the filters only process the local foreground in each image patch.

Installation

For prerequisites, refer to DeepLabV2. Our setup follows theirs almost exactly.

Once you have the prequisites, simply run make all -j4 from within caffe/ to compile the code with 4 cores.

Learning embeddings with dedicated loss

  • Use Convolution layers to create dense embeddings.
  • Use Im2dist to compute dense distance comparisons in an embedding map.
  • Use Im2parity to compute dense label comparisons in a label map.
  • Use DistLoss (with parameters alpha and beta) to set up a contrastive side loss on the distances.

See scripts/segaware/config/embs for a full example.

Setting up a segmentation-aware convolution layer

  • Use Im2col on the input, to arrange pixel/feature patches into columns.
  • Use Im2dist on the embeddings, to get their distances into columns.
  • Use Exp on the distances, with scale: -1, to get them into [0,1].
  • Tile the exponentiated distances, with a factor equal to the depth (i.e., channels) of the original convolution features.
  • Use Eltwise to multiply the Tile result with the Im2col result.
  • Use Convolution with bottom_is_im2col: true to matrix-multiply the convolution weights with the Eltwise output.

See scripts/segaware/config/vgg for an example in which every convolution layer in the VGG16 architecture is made segmentation-aware.

Using a segmentation-aware CRF

  • Use the NormConvMeanfield layer. As input, give it two copies of the unary potentials (produced by a Split layer), some embeddings, and a meshgrid-like input (produced by a DummyData layer with data_filler { type: "xy" }).

See scripts/segaware/config/res for an example in which a segmentation-aware CRF is added to a resnet architecture.

Replicating the segmentation results presented in our paper

  • Download pretrained model weights here, and put that file into scripts/segaware/model/res/.
  • From scripts, run ./test_res.sh. This will produce .mat files in scripts/segaware/features/res/voc_test/mycrf/.
  • From scripts, run ./gen_preds.sh. This will produce colorized .png results in scripts/segaware/results/res/voc_test/mycrf/none/results/VOC2012/Segmentation/comp6_test_cls. An example input-ouput pair is shown below:

- If you zip these results, and submit them to the official PASCAL VOC test server, you will get 79.83900% IOU.

If you run this set of steps for the validation set, you can run ./eval.sh to evaluate your results on the PASCAL VOC validation set. If you change the model, you may want to run ./edit_env.sh to update the evaluation instructions.

Citation

@inproceedings{harley_segaware,
  title = {Segmentation-Aware Convolutional Networks Using Local Attention Masks},
  author = {Adam W Harley, Konstantinos G. Derpanis, Iasonas Kokkinos},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year = {2017},
}

Help

Feel free to open issues on here! Also, I'm pretty good with email: [email protected]

学习 python3 以来写的一些垃圾玩具……

和东哥做兄弟 Author: chiupam 版权 未经本人同意,仓库内所有资源文件,禁止任何公众号、自媒体、开发者进行任何形式的转载、发布、搬运。 声明 这不是一个开源项目,只是把 GitHub 当作一个代码的存储空间,本项目不接受任何开源要求。 仅用于学习研究,禁止用于商业用途,不能保证其合法性

Chiupam 67 Mar 26, 2022
Self-supervised learning on Graph Representation Learning (node-level task)

graph_SSL Self-supervised learning on Graph Representation Learning (node-level task) How to run the code To run GRACE, sh run_GRACE.sh To run GCA, sh

Namkyeong Lee 3 Dec 31, 2021
This repository is maintained for the scientific paper tittled " Study of keyword extraction techniques for Electric Double Layer Capacitor domain using text similarity indexes: An experimental analysis "

kwd-extraction-study This repository is maintained for the scientific paper tittled " Study of keyword extraction techniques for Electric Double Layer

ping 543f 1 Dec 05, 2022
Code for a real-time distributed cooperative slam(RDC-SLAM) system for ROS compatible platforms.

RDC-SLAM This repository contains code for a real-time distributed cooperative slam(RDC-SLAM) system for ROS compatible platforms. The system takes in

40 Nov 19, 2022
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li

DamoCV 25 Dec 16, 2022
Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration

Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration Project Page | Paper Yifan Peng*, Suyeon Choi*, Jongh

Stanford Computational Imaging Lab 19 Dec 11, 2022
Towards Implicit Text-Guided 3D Shape Generation (CVPR2022)

Towards Implicit Text-Guided 3D Shape Generation Towards Implicit Text-Guided 3D Shape Generation (CVPR2022) Code for the paper [Towards Implicit Text

55 Dec 16, 2022
Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

human-pose-estimation-3d-python-cpp RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used) 1. Run 1-1. RealSenseD435 (RGB) 480x640 + CPU

Katsuya Hyodo 8 Oct 03, 2022
The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

LEAR The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction". See below for an overview of

杨攀 93 Jan 07, 2023
DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with deep learning by introducing the neural predicate.

DeepProbLog DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with deep learning by introducing the neural predic

KU Leuven Machine Learning Research Group 94 Dec 18, 2022
Segmentation-Aware Convolutional Networks Using Local Attention Masks

Segmentation-Aware Convolutional Networks Using Local Attention Masks [Project Page] [Paper] Segmentation-aware convolution filters are invariant to b

144 Jun 29, 2022
Open source code for the paper of Neural Sparse Voxel Fields.

Neural Sparse Voxel Fields (NSVF) Project Page | Video | Paper | Data Photo-realistic free-viewpoint rendering of real-world scenes using classical co

Meta Research 647 Dec 27, 2022
Smart edu-autobooking - Johnson @ DMI-UNICT study room self-booking system

smart_edu-autobooking Sistema di autoprenotazione per l'aula studio [email protected]

Davide Carnemolla 17 Jun 20, 2022
This repository contains code released by Google Research.

This repository contains code released by Google Research.

Google Research 26.6k Dec 31, 2022
MediaPipe is a an open-source framework from Google for building multimodal

MediaPipe is a an open-source framework from Google for building multimodal (eg. video, audio, any time series data), cross platform (i.e Android, iOS, web, edge devices) applied ML pipelines. It is

Bhavishya Pandit 3 Sep 30, 2022
[CVPR 2021] Unsupervised 3D Shape Completion through GAN Inversion

ShapeInversion Paper Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy "Unsupervised 3D

100 Dec 22, 2022
Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

Maths from examples - Learning advanced mathematical computations from examples This is the source code and data sets relevant to the paper Learning a

Facebook Research 171 Nov 23, 2022
A facial recognition doorbell system using a Raspberry Pi

Facial Recognition Doorbell This project expands on the person-detecting doorbell system to allow it to identify faces, and announce names accordingly

rydercalmdown 22 Apr 15, 2022
A Python library for differentiable optimal control on accelerators.

A Python library for differentiable optimal control on accelerators.

Google 80 Dec 21, 2022
Spherical Confidence Learning for Face Recognition, accepted to CVPR2021.

Sphere Confidence Face (SCF) This repository contains the PyTorch implementation of Sphere Confidence Face (SCF) proposed in the CVPR2021 paper: Shen

Maths 70 Dec 09, 2022