A Transformer-Based Siamese Network for Change Detection

Overview

ChangeFormer: A Transformer-Based Siamese Network for Change Detection (Under review at IGARSS-2022)

Wele Gedara Chaminda Bandara, Vishal M. Patel

Here, we provide the pytorch implementation of the paper: A Transformer-Based Siamese Network for Change Detection.

For more information, please see our paper at arxiv.

image-20210228153142126

Requirements

Python 3.8.0
pytorch 1.10.1
torchvision 0.11.2
einops  0.3.2

Please see requirements.txt for all the other requirements.

Installation

Clone this repo:

git clone https://github.com/wgcban/ChangeFormer.git
cd ChangeFormer

Quick Start on LEVIR dataset

We have some samples from the LEVIR-CD dataset in the folder samples_LEVIR for a quick start.

Firstly, you can download our ChangeFormerV6 pretrained model——by DropBox. After downloaded the pretrained model, you can put it in checkpoints/ChangeFormer_LEVIR/.

Then, run a demo to get started as follows:

python demo_LEVIR.py

After that, you can find the prediction results in samples/predict_LEVIR.

Quick Start on DSFIN dataset

We have some samples from the DSFIN-CD dataset in the folder samples_DSFIN for a quick start.

Firstly, you can download our ChangeFormerV6 pretrained model——by DropBox. After downloaded the pretrained model, you can put it in checkpoints/ChangeFormer_LEVIR/.

Then, run a demo to get started as follows:

python demo_DSFIN.py

After that, you can find the prediction results in samples/predict_DSFIN.

Train on LEVIR-CD

You can find the training script run_ChangeFormer_LEVIR.sh in the folder scripts. You can run the script file by sh scripts/run_ChangeFormer_LEVIR.sh in the command environment.

The detailed script file run_ChangeFormer_LEVIR.sh is as follows:

#!/usr/bin/env bash

#GPUs
gpus=0

#Set paths
checkpoint_root=/media/lidan/ssd2/ChangeFormer/checkpoints
vis_root=/media/lidan/ssd2/ChangeFormer/vis
data_name=LEVIR


img_size=256    
batch_size=16   
lr=0.0001         
max_epochs=200
embed_dim=256

net_G=ChangeFormerV6        #ChangeFormerV6 is the finalized verion

lr_policy=linear
optimizer=adamw                 #Choices: sgd (set lr to 0.01), adam, adamw
loss=ce                         #Choices: ce, fl (Focal Loss), miou
multi_scale_train=True
multi_scale_infer=False
shuffle_AB=False

#Initializing from pretrained weights
pretrain=/media/lidan/ssd2/ChangeFormer/pretrained_segformer/segformer.b2.512x512.ade.160k.pth

#Train and Validation splits
split=train         #trainval
split_val=test      #test
project_name=CD_${net_G}_${data_name}_b${batch_size}_lr${lr}_${optimizer}_${split}_${split_val}_${max_epochs}_${lr_policy}_${loss}_multi_train_${multi_scale_train}_multi_infer_${multi_scale_infer}_shuffle_AB_${shuffle_AB}_embed_dim_${embed_dim}

CUDA_VISIBLE_DEVICES=1 python main_cd.py --img_size ${img_size} --loss ${loss} --checkpoint_root ${checkpoint_root} --vis_root ${vis_root} --lr_policy ${lr_policy} --optimizer ${optimizer} --pretrain ${pretrain} --split ${split} --split_val ${split_val} --net_G ${net_G} --multi_scale_train ${multi_scale_train} --multi_scale_infer ${multi_scale_infer} --gpu_ids ${gpus} --max_epochs ${max_epochs} --project_name ${project_name} --batch_size ${batch_size} --shuffle_AB ${shuffle_AB} --data_name ${data_name}  --lr ${lr} --embed_dim ${embed_dim}

Train on DSFIN-CD

Follow the similar procedure mentioned for LEVIR-CD. Use run_ChangeFormer_DSFIN.sh in scripts folder to train on DSFIN-CD.

Evaluate on LEVIR

You can find the evaluation script eval_ChangeFormer_LEVIR.sh in the folder scripts. You can run the script file by sh scripts/eval_ChangeFormer_LEVIR.sh in the command environment.

The detailed script file eval_ChangeFormer_LEVIR.sh is as follows:

#!/usr/bin/env bash

gpus=0

data_name=LEVIR
net_G=ChangeFormerV6 #This is the best version
split=test
vis_root=/media/lidan/ssd2/ChangeFormer/vis
project_name=CD_ChangeFormerV6_LEVIR_b16_lr0.0001_adamw_train_test_200_linear_ce_multi_train_True_multi_infer_False_shuffle_AB_False_embed_dim_256
checkpoints_root=/media/lidan/ssd2/ChangeFormer/checkpoints
checkpoint_name=best_ckpt.pt
img_size=256
embed_dim=256 #Make sure to change the embedding dim (best and default = 256)

CUDA_VISIBLE_DEVICES=0 python eval_cd.py --split ${split} --net_G ${net_G} --embed_dim ${embed_dim} --img_size ${img_size} --vis_root ${vis_root} --checkpoints_root ${checkpoints_root} --checkpoint_name ${checkpoint_name} --gpu_ids ${gpus} --project_name ${project_name} --data_name ${data_name}

Evaluate on LEVIR

Follow the same evaluation procedure mentioned for LEVIR-CD. You can find the evaluation script eval_ChangeFormer_DSFIN.sh in the folder scripts. You can run the script file by sh scripts/eval_ChangeFormer_DSFIN.sh in the command environment.

Dataset Preparation

Data structure

"""
Change detection data set with pixel-level binary labels;
├─A
├─B
├─label
└─list
"""

A: images of t1 phase;

B:images of t2 phase;

label: label maps;

list: contains train.txt, val.txt and test.txt, each file records the image names (XXX.png) in the change detection dataset.

Data Download

LEVIR-CD: https://justchenhao.github.io/LEVIR/

WHU-CD: https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html

DSIFN-CD: https://github.com/GeoZcx/A-deeply-supervised-image-fusion-network-for-change-detection-in-remote-sensing-images/tree/master/dataset

License

Code is released for non-commercial and research purposes only. For commercial purposes, please contact the authors.

Citation

If you use this code for your research, please cite our paper:

@Article{
}

References

Appreciate the work from the following repositories:

Comments
  • How could use?

    How could use?

    Hi, I'm very interested in using your code in my project. I was able to run demo scripts. Now I want to use your code for my own data, but unfortunately I do not know how I can do this on my data. I put them in files A, B But I think I need guidance to test Please tell me how I can find a difference for my images (I am a newcomer, thank you)

    help wanted 
    opened by p00uya 23
  • How to train more classes label instead of two?

    How to train more classes label instead of two?

    Hi I really appreciate your work.Now I want to train on changesim,a dataset with 4 classes label ,such as "missing","new","ratation","replaced object". I tried change n_class , but it didnt work. what should I do to train on a dataset which is more than 2 classes? thx~

    help wanted 
    opened by xgyyao 14
  • Question about training on LEVIR-CD

    Question about training on LEVIR-CD

    Hi , I found when i load the pretrained model(trained on ade160k dataset), the keys of checkpoint are not matched. The pretrained model: BFA7F864-1D89-4f31-9F0F-5C87B1584CF8 The self.net_G: image

    So the keys of pretrained model are all missing keys.

    question 
    opened by Youskrpig 10
  • About using a new dataset

    About using a new dataset

    opened by SnycradJuice 8
  • DSIFN accuracy

    DSIFN accuracy

    Hi wgcban, I notice that DSIFN-CD dataset has much higher accuracy than BIT[4] (IoU from BIT 52.97% to ChangeFormer 76.48%). On another dataset, LEVIR-CD, the difference is not as large as DSIFN-CD. Could you please explain the main source of the large improvement on DSIFN-CD? e.g. training strategy, data augmentation, model structure... Thanks Wesley

    question 
    opened by WesleyZhang1991 8
  • Some Questions about Code and Paper Details

    Some Questions about Code and Paper Details

    Hi~ Thx for your great work, :clap: it's really inspired a lot in siamese Transformer network realizing.

    However, I still have some questions about the code implementation and the details of the paper.

    1. In the code, the implementation of Sequence Reduction was completed through the Conv2d non-overlapping cutting feature map before MHSA. https://github.com/wgcban/ChangeFormer/blob/9025e26417cf8f10f29a48f34a05758498216465/models/ChangeFormer.py#L316 The effect of the implementation is similar to that of the first shape and then linear projection in the paper, but this code implementation will result in a reduction of the sequence length R^2 times (similar to the idea of cutting the image into 16 * 16 patches at the beginning of the ViT). However, the formula 2 in the paper shows that the reduction is times. Is there an error here?

    2. In this code:https://github.com/wgcban/ChangeFormer/blob/9025e26417cf8f10f29a48f34a05758498216465/models/ChangeFormer.py#L507 the actual code implements two skips connected. In the pink block diagram of Transformer Block explained in the upper right corner of Fig1 in the paper, do you need to draw two skips connected?(Add skip bypass connecting Sequence Reduction input an MHSA output)

    3. About Depth-wise Conv in Transformer Block as PE.Why you do this? How to realize position coding(Can you explain it)? Why is this effective?

    4. patch_block1 is not used in the code. What is this module used for? Why is it inconsistent with the previous block1 dimension? (dim=embed_dims[1] and dim=embed_dims[0] respectively)https://github.com/wgcban/ChangeFormer/blob/9025e26417cf8f10f29a48f34a05758498216465/models/ChangeFormer.py#L52

    Looking forward to your early reply!:smiley:

    opened by zafirshi 6
  • about multi classes

    about multi classes

    hi, I tried to train my personal data with 4 classes = {0,1,2,3}

    pixels are like

    0000000002220000 0000000002200000 0000000002000000 0010000000000000 0111110000000000 1111000000000000

    grayscale.

    when I train this data, the accuracy converges to 0.5 and never changes. Is there any problem that I miss?

    what I changed is only n_classes = 4

    thanks.

    opened by g7199 5
  • The code runs too long

    The code runs too long

    Hello, it's a nice code. It takes me a lot of time to run the program using the LEVIR-CD-256 dataset you have processed. Is this normal? How long will it take you to train the model? Looking forward to your early reply!

    opened by Mengtao-ship 5
  • How to

    How to

    Hi I really appreciate your work. I have a few questions about the model. First of all is it possible to modify the size of the input images? Then how can we retrain the model with our data? I noticed that the model detects the changes appeared in the image B. How can we generate a map for the disappearance of elements in A? Thanks. I remain open to your answers and suggestions.

    question 
    opened by choumie 5
  • A question about the difference module

    A question about the difference module

    HI~,after reading your paper, I still can't understand your design of Difference Module which consists of Conv2D, ReLU and BatchNorm2d,What is the reason for this design? in the eqn: Fidiff = BN(ReLU(Conv2D3×3(Cat(Fipre, Fipost)))) in the code: nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1), nn.ReLU(), nn.BatchNorm2d(out_channels), nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1), nn.ReLU() Looking forward to your early reply!

    opened by Herxinsasa 4
  • F1 values of WHU-CD and DSIFN-CD

    F1 values of WHU-CD and DSIFN-CD

    Hello, I am trying to reproduce the BIT-CD model code, and it looks incorrect when using WHU-CD dataset as well as DSIFN-CD dataset, and the result appears too high than your result. But it looks normal when using LEVID-CD dataset. Do I need to change the pre-training?

    opened by yuwanting828 4
  • A question about visualization.

    A question about visualization.

    Hello again! I'm training the net on the former dataset,a problem happened when the trainer saves visualization, the vis_pred, vis_gt of saved images do not look normal. Though I found your visualization method right here https://github.com/wgcban/ChangeFormer/blob/9025e26417cf8f10f29a48f34a05758498216465/models/trainer.py#L220-L232 still don't know how to make it fit to save multiple classes gt and pred images.

    opened by SnycradJuice 0
  • multi class

    multi class

    You want to perform multiclass classification. Even if I change the code to args.n_class = 9, the class is predicted to be 2. What should I do? sry im korean 캡처 It shouldn't be possible to modify only n_class, but should there be multiple classes of labels?

    opened by taemin6697 20
Releases(v0.1.0)
Owner
Wele Gedara Chaminda Bandara
I am a second-year Ph.D. student in the ECE at Johns Hopkins University.
Wele Gedara Chaminda Bandara
Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis [Paper] [Online Demo] The following results are obtained by our SCUNet with purely syn

Kai Zhang 312 Jan 07, 2023
Reading list for research topics in Masked Image Modeling

awesome-MIM Reading list for research topics in Masked Image Modeling(MIM). We list the most popular methods for MIM, if I missed something, please su

ligang 231 Dec 07, 2022
Code for Max-Margin Contrastive Learning - AAAI 2022

Max-Margin Contrastive Learning This is a pytorch implementation for the paper Max-Margin Contrastive Learning accepted to AAAI 2022. This repository

Anshul Shah 12 Oct 22, 2022
A PyTorch implementation: "LASAFT-Net-v2: Listen, Attend and Separate by Attentively aggregating Frequency Transformation"

LASAFT-Net-v2 Listen, Attend and Separate by Attentively aggregating Frequency Transformation Woosung Choi, Yeong-Seok Jeong, Jinsung Kim, Jaehwa Chun

Woosung Choi 29 Jun 04, 2022
Training BERT with Compute/Time (Academic) Budget

Training BERT with Compute/Time (Academic) Budget This repository contains scripts for pre-training and finetuning BERT-like models with limited time

Intel Labs 263 Jan 07, 2023
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us

SpeechBrain 5.1k Jan 02, 2023
Backend code to use MCPI's python API to make infinite worlds with custom generation

inf-mcpi Backend code to use MCPI's python API to make infinite worlds with custom generation Does not save player-placed blocks! Generation is still

5 Oct 04, 2022
PyTorch implementation of Off-policy Learning in Two-stage Recommender Systems

Off-Policy-2-Stage This repo provides a PyTorch implementation of the MovieLens experiments for the following paper: Off-policy Learning in Two-stage

Jiaqi Ma 25 Dec 12, 2022
Read number plates with https://platerecognizer.com/

HASS-plate-recognizer Read vehicle license plates with https://platerecognizer.com/ which offers free processing of 2500 images per month. You will ne

Robin 69 Dec 30, 2022
Gym for multi-agent reinforcement learning

PettingZoo is a Python library for conducting research in multi-agent reinforcement learning, akin to a multi-agent version of Gym. Our website, with

Farama Foundation 1.6k Jan 09, 2023
Meta Representation Transformation for Low-resource Cross-lingual Learning

MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning This repo hosts the code for MetaXL, published at NAACL 2021. [Meta

Microsoft 36 Aug 17, 2022
Bu repo SAHI uygulamasını mantığını öğreniyoruz.

SAHI-Learn: SAHI'den Beraber Kodlamak İster Misiniz Herkese merhabalar ben Kadir Nar. SAHI kütüphanesine gönüllü geliştiriciyim. Bu repo SAHI kütüphan

Kadir Nar 11 Aug 22, 2022
Knowledge Management for Humans using Machine Learning & Tags

HyperTag HyperTag helps humans intuitively express how they think about their files using tags and machine learning.

Ravn Tech, Inc. 165 Nov 04, 2022
Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Kaggle-Happywhale Happywhale - Whale and Dolphin Identification Silver 🥈 Solution (26/1588) 竞赛方案思路 图像数据预处理-标志性特征图片裁剪:首先根据开源的标注数据训练YOLOv5x6目标检测模型,将训练集

Franxx 20 Nov 14, 2022
A program that can analyze videos according to the weights you select

MaskMonitor A program that can analyze videos according to the weights you select 下載 訓練完的 weight檔案 執行 MaskDetection.py 內部可更改 輸入來源(鏡頭, 影片, 圖片) 以及輸出條件(人

Patrick_star 1 Nov 07, 2021
PyTorch Code of "Memory In Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity from Spatiotemporal Dynamics"

Memory In Memory Networks It is based on the paper Memory In Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity from Spati

Yang Li 12 May 30, 2022
A Python parser that takes the content of a text file and then reads it into variables.

Text-File-Parser A Python parser that takes the content of a text file and then reads into variables. Input.text File 1. What is your ***? 1. 18 -

Kelvin 0 Jul 26, 2021
An all-in-one application to visualize multiple different local path planning algorithms

Table of Contents Table of Contents Local Planner Visualization Project (LPVP) Features Installation/Usage Local Planners Probabilistic Roadmap (PRM)

Abdur Javaid 47 Dec 30, 2022
Official Keras Implementation for UNet++ in IEEE Transactions on Medical Imaging and DLMIA 2018

UNet++: A Nested U-Net Architecture for Medical Image Segmentation UNet++ is a new general purpose image segmentation architecture for more accurate i

Zongwei Zhou 1.8k Dec 27, 2022
Hand-distance-measurement-game - Hand Distance Measurement Game

Hand Distance Measurement Game This is program is made to calculate the distance

Priyansh 2 Jan 12, 2022