Puzzle-CAM: Improved localization via matching partial and full features.

Overview

PWC PWC

Puzzle-CAM

The official implementation of "Puzzle-CAM: Improved localization via matching partial and full features".

Citation

Please cite our paper if the code is helpful to your research. arxiv

@article{jo2021puzzle,
  title={Puzzle-CAM: Improved localization via matching partial and full features},
  author={Jo, Sanhyun and Yu, In-Jae},
  journal={arXiv preprint arXiv:2101.11253},
  year={2021}
}

Abstract

Weakly-supervised semantic segmentation (WSSS) is introduced to narrow the gap for semantic segmentation performance from pixel-level supervision to image-level supervision. Most advanced approaches are based on class activation maps (CAMs) to generate pseudo-labels to train the segmentation network. The main limitation of WSSS is that the process of generating pseudo-labels from CAMs which use an image classifier is mainly focused on the most discriminative parts of the objects. To address this issue, we propose Puzzle-CAM, a process minimizes the differences between the features from separate patches and the whole image. Our method consists of a puzzle module (PM) and two regularization terms to discover the most integrated region of in an object. Without requiring extra parameters, Puzzle-CAM can activate the overall region of an object using image-level supervision. In experiments, Puzzle-CAM outperformed previous state-of-the-art methods using the same labels for supervision on the PASCAL VOC 2012 test dataset.

Overview

Overall architecture


Prerequisite

  • Python 3.8, PyTorch 1.7.0, and more in requirements.txt
  • CUDA 10.1, cuDNN 7.6.5
  • 4 x Titan RTX GPUs

Usage

Install python dependencies

python3 -m pip install -r requirements.txt

Download PASCAL VOC 2012 devkit

Follow instructions in http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit

1. Train an image classifier for generating CAMs

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 train_classification_with_puzzle.py --architecture resnest101 --re_loss_option masking --re_loss L1_Loss --alpha_schedule 0.50 --alpha 4.00 --tag [email protected]@optimal --data_dir $your_dir

2. Apply Random Walk (RW) to refine the generated CAMs

2.1. Make affinity labels to train AffinityNet.

CUDA_VISIBLE_DEVICES=0 python3 inference_classification.py --architecture resnest101 --tag [email protected]@optimal --domain train_aug --data_dir $your_dir
python3 make_affinity_labels.py --experiment_name [email protected]@[email protected]@scale=0.5,1.0,1.5,2.0 --domain train_aug --fg_threshold 0.40 --bg_threshold 0.10 --data_dir $your_dir

2.2. Train AffinityNet.

CUDA_VISIBLE_DEVICES=0 python3 train_affinitynet.py --architecture resnest101 --tag [email protected]@Puzzle --label_name [email protected]@opt[email protected]@scale=0.5,1.0,1.5,[email protected]_fg=0.40_bg=0.10 --data_dir $your_dir

3. Train the segmentation model using the pseudo-labels

3.1. Make segmentation labels to train segmentation model.

CUDA_VISIBLE_DEVICES=0 python3 inference_rw.py --architecture resnest101 --model_name [email protected]@Puzzle --cam_dir [email protected]@op[email protected]@scale=0.5,1.0,1.5,2.0 --domain train_aug --data_dir $your_dir
python3 make_pseudo_labels.py --experiment_name [email protected]@[email protected]@[email protected][email protected] --domain train_aug --threshold 0.35 --crf_iteration 1 --data_dir $your_dir

3.2. Train segmentation model.

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 train_segmentation.py --backbone resnest101 --mode fix --use_gn True --tag [email protected]@[email protected] --label_name [email protected]@[email protected]@[email protected][email protected]@crf=1 --data_dir $your_dir

4. Evaluate the models

CUDA_VISIBLE_DEVICES=0 python3 inference_segmentation.py --backbone resnest101 --mode fix --use_gn True --tag [email protected]@[email protected] --scale 0.5,1.0,1.5,2.0 --iteration 10

python3 evaluate.py --experiment_name [email protected]@[email protected]@[email protected]=0.5,1.0,1.5,[email protected]=10 --domain val --data_dir $your_dir/SegmentationClass

5. Results

Qualitative segmentation results on the PASCAL VOC 2012 validation set. Top: original images. Middle: ground truth. Bottom: prediction of the segmentation model trained using the pseudo-labels from Puzzle-CAM. Overall architecture

Methods background aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tvmonitor mIoU
Puzzle-CAM with ResNeSt-101 88.9 87.1 38.7 89.2 55.8 72.8 89.8 78.9 91.3 26.8 84.4 40.3 88.9 81.9 83.1 34.0 60.1 83.6 47.3 59.6 38.8 67.7
Puzzle-CAM with ResNeSt-269 91.1 87.2 37.3 86.8 61.4 71.2 92.2 86.2 91.8 28.6 85.0 64.1 91.8 82.0 82.5 70.7 69.4 87.7 45.4 67.0 37.7 72.2

For any issues, please contact Sanghyun Jo, [email protected]

Comments
  • ModuleNotFoundError: No module named 'core.sync_batchnorm'

    ModuleNotFoundError: No module named 'core.sync_batchnorm'

    `

    ModuleNotFoundError Traceback (most recent call last) in 1 from core.puzzle_utils import * ----> 2 from core.networks import * 3 from core.datasets import * 4 5 from tools.general.io_utils import *

    /working/PuzzleCAM/core/networks.py in 24 # Normalization 25 ####################################################################### ---> 26 from .sync_batchnorm.batchnorm import SynchronizedBatchNorm2d 27 28 class FixedBatchNorm(nn.BatchNorm2d):

    ModuleNotFoundError: No module named 'core.sync_batchnorm' `

    opened by Ashneo07 2
  • performance issue

    performance issue

    When I used the released weights for inference phase and evaluation, I found that the mIoU I got was different from the mIoU reported in the paper. I would like to ask whether this weight is corresponding to the paper, if it is, how to reproduce the result in your paper. Looking forward to your reply.

    PuzzleCAM PuzzleCAM2

    opened by linjiatai 0
  • Evaluation in classifier training is using supervised segmentation maps?

    Evaluation in classifier training is using supervised segmentation maps?

    Hello, thank you for the great repository! It's pretty impressive how organized it is.

    I have a critic (or maybe a question, in case I got it wrong) regarding the training of the classifier, though: I understand the importance of measuring and logging the mIoU during training (specially when creating the ablation section in your paper), however it doesn't strike me as correct to save the model with best mIoU. This procedural decision is based on fully supervised segmentation information, which should not be available for a truly weakly supervised problem; while resulting in a model better suited for segmentation. The paper doesn't address this. Am I right to assume all models were trained like this? Were there any trainings where other metrics were considered when saving the model (e.g. classification loss or Eq (7) in the paper)?

    opened by lucasdavid 0
  • error occured when image-size isn't 512 * n

    error occured when image-size isn't 512 * n

    dear author: I notice that if the image size isn't 512 x 512, it will have some error. I use image size 1280 x 496 and i got tensor size error at calculate puzzle module:the original feature is 31 dims and re_feature is 32 dims. So i have to change image size to 1280 x 512 and i work. So i think this maybe a little bug. It will better that you fixed it or add a notes in code~ Thanks for your job!

    opened by hazy-wu 0
  • the backbone of Affinitynet is resnet38. Why did you write resnet50?

    the backbone of Affinitynet is resnet38. Why did you write resnet50?

    In Table 2 of your paper, the backbone of Affinitynet is resnet38. Why did you write resnet50? After my experiment, I found that RW result reached 65.42% for Affinitynet which is based on resnet50 and higher than yours.

    opened by songyukino1 0
  • Ask for details of the training process!

    Ask for details of the training process!

    I am trying to train with ResNest101, and I also added affinity and RW. When I try to train, it runs according to the specified code. It is found that the obtained affinity labels are not effective, and the effect of pseudo_labels is almost invisible, which is close to the effect of all black. I don't know where the problem is, who can explain the details. help!

    opened by YuYue26 1
Releases(v1.0)
Owner
Sanghyun Jo
e-mail : [email protected] # DeepLearning #Computer Vision #AutoML #Se
Sanghyun Jo
Code for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling Using BERT Adapter"

Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter Code and checkpoints for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling

274 Dec 06, 2022
Object-Centric Learning with Slot Attention

Slot Attention This is a re-implementation of "Object-Centric Learning with Slot Attention" in PyTorch (https://arxiv.org/abs/2006.15055). Requirement

Untitled AI 72 Jan 02, 2023
Out-of-Distribution Generalization of Chest X-ray Using Risk Extrapolation

OoD_Gen-Chest_Xray Out-of-Distribution Generalization of Chest X-ray Using Risk Extrapolation Requirements (Installations) Install the following libra

Enoch Tetteh 2 Oct 01, 2022
MODNet: Trimap-Free Portrait Matting in Real Time

MODNet is a model for real-time portrait matting with only RGB image input.

Zhanghan Ke 2.8k Dec 30, 2022
PyTorch implementation of GLOM

GLOM PyTorch implementation of GLOM, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attent

Yeonwoo Sung 20 Aug 17, 2022
Supplementary code for SIGGRAPH 2021 paper: Discovering Diverse Athletic Jumping Strategies

SIGGRAPH 2021: Discovering Diverse Athletic Jumping Strategies project page paper demo video Prerequisites Important Notes We suspect there are bugs i

54 Dec 06, 2022
Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation

Knowledge Distillation for BERT Unsupervised Domain Adaptation Official PyTorch implementation | Paper Abstract A pre-trained language model, BERT, ha

Minho Ryu 29 Nov 30, 2022
PyTorch implementation of Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

Simple PyTorch Implementation of "Grokking" Implementation of Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets Usage Running

Teddy Koker 15 Sep 29, 2022
Deep Learning for Time Series Forecasting.

nixtlats:Deep Learning for Time Series Forecasting [nikstla] (noun, nahuatl) Period of time. State-of-the-art time series forecasting for pytorch. Nix

Nixtla 5 Dec 06, 2022
My freqtrade strategies

My freqtrade-strategies Hi there! This is repo for my freqtrade-strategies. My name is Ilya Zelenchuk, I'm a lecturer at the SPbU university (https://

171 Dec 05, 2022
Trash Sorter Extraordinaire is a software which efficiently detects the different types of waste in a pile of random trash through feeding it pictures or videos.

Trash-Sorter-Extraordinaire Trash Sorter Extraordinaire is a software which efficiently detects the different types of waste in a pile of random trash

Rameen Mahmood 1 Nov 07, 2021
PyTorch implementation of Algorithm 1 of "On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models"

Code for On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models This repository will reproduce the main results from our pape

Mitch Hill 32 Nov 25, 2022
E-Ink Magic Calendar that automatically syncs to Google Calendar and runs off a battery powered Raspberry Pi Zero

MagInkCal This repo contains the code needed to drive an E-Ink Magic Calendar that uses a battery powered (PiSugar2) Raspberry Pi Zero WH to retrieve

2.8k Dec 28, 2022
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.

HAWQ: Hessian AWare Quantization HAWQ is an advanced quantization library written for PyTorch. HAWQ enables low-precision and mixed-precision uniform

Zhen Dong 293 Dec 30, 2022
RADIal is available now! Check the download section

Latest news: RADIal is available now! Check the download section. However, because we are currently working on the data anonymization, we provide for

valeo.ai 55 Jan 03, 2023
A 2D Visual Localization Framework based on Essential Matrices [ICRA2020]

A 2D Visual Localization Framework based on Essential Matrices This repository provides implementation of our paper accepted at ICRA: To Learn or Not

Qunjie Zhou 27 Nov 07, 2022
A simple interface for editing natural photos with generative neural networks.

Neural Photo Editor A simple interface for editing natural photos with generative neural networks. This repository contains code for the paper "Neural

Andy Brock 2.1k Dec 29, 2022
Simple machine learning library / 簡單易用的機器學習套件

FukuML Simple machine learning library / 簡單易用的機器學習套件 Installation $ pip install FukuML Tutorial Lesson 1: Perceptron Binary Classification Learning Al

Fukuball Lin 279 Sep 15, 2022
Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021.

Dense Contrastive Learning for Self-Supervised Visual Pre-Training This project hosts the code for implementing the DenseCL algorithm for se

Xinlong Wang 491 Jan 03, 2023
This program writes christmas wish programmatically. It is using turtle as a pen pointer draw christmas trees and stars.

Introduction This is a simple program is written in python and turtle library. The objective of this program is to wish merry Christmas programmatical

Gunarakulan Gunaretnam 1 Dec 25, 2021