Codebase for the paper "Continual Learning via Local Module Composition"

Overview

This repository contains the codebase for the paper Continual Learning via Local Module Composition.


Setting up the environment

Create a new conda environment and install the requirements.

conda create --name ENV python=3.7
conda activate ENV
pip install -r requirements.txt
pip install -e Utils/ctrl/
pip install Utils/nngeometry/
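
Once installation finishes, a quick check can confirm that the environment matches what the commands above set up. The following is a minimal sketch and not part of the repository: the helper name `check_environment` is hypothetical, and the import names `ctrl` and `nngeometry` are assumed from the `Utils/` directory names.

```python
import importlib.util
import sys


def check_environment(packages=("ctrl", "nngeometry"), expected_python=(3, 7)):
    """Report whether the interpreter version and the local packages look right.

    The package names are assumptions based on the Utils/ directory names.
    """
    report = {"python_ok": sys.version_info[:2] == expected_python}
    for name in packages:
        # find_spec returns None for a missing top-level package instead of raising.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

Run it inside the activated conda environment; any `MISSING` entry suggests that the corresponding `pip install` step did not complete.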

CTrL Benchmark

All experiments were run on NVIDIA Quadro RTX 8000 GPUs. To run the CTrL experiments, use the following commands for each stream:

Stream S-

LMC (task agnostic)

python main_transfer.py --activate_after_str_oh=0 --momentum_bn 0.1 --track_running_stats_bn 1 --pr_name lmc_cr --shuffle_test 0 --init_oh=none --task_sequence s_minus --momentum_bn_decoder=0.1 --activation_structural=sigmoid --deviation_threshold=4 --depth=4 --epochs=100 --fix_layers_below_on_addition=0 --hidden_size=64 --lr=0.001 --mask_str_loss=1 --module_init=mean --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=20 --reg_factor=10  --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --temp=0.1 --wdecay=0.001

(test acc. 0.6863, 15 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_minus --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-3

(test acc. 0.667, 12 modules)

Stream S+

LMC

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=1.5 --early_stop_complete=0 --pr_name lmc_cr --epochs=100 --epochs_str_only_after_addition=1 --hidden_size=64 --init_oh=none --init_runingstats_on_addition=1 --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn=0.1 --momentum_bn_decoder=0.1 --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=10 --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --task_sequence=s_plus --temp=1 --wdecay=0.001

(test acc. 0.6244, 22 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_plus --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-3 --regenerate_seed 0

(test acc. 0.609, 18 modules)

Stream Sin

LMC

python main_transfer.py --activate_after_str_oh=0 --momentum_bn 0.1 --track_running_stats_bn 1 --pr_name lmc_cr --shuffle_test 0 --init_oh=none --task_sequence s_in --momentum_bn_decoder=0.1 --activation_structural=sigmoid --deviation_threshold=4 --depth=4 --epochs=100 --fix_layers_below_on_addition=0 --hidden_size=64 --lr=0.001 --mask_str_loss=1 --module_init=most_likely --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=20 --reg_factor=10  --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --temp=0.1 --wdecay=0.001

(test acc. 0.7081, 21 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_in --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-3 --regenerate_seed 0

(test acc. 0.6646, 15 modules)

Stream Sout

LMC

python main_transfer.py --activate_after_str_oh=0 --momentum_bn 0.1 --track_running_stats_bn 1 --pr_name lmc_cr --shuffle_test 0 --init_oh=none --task_sequence s_out --momentum_bn_decoder=0.1 --activation_structural=sigmoid --deviation_threshold=4 --depth=4 --epochs=100 --fix_layers_below_on_addition=0 --hidden_size=64 --lr=0.001 --mask_str_loss=1 --module_init=mean --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=20 --reg_factor=10  --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --temp=0.1 --wdecay=0.001

(test acc. 0.5849, 15 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_out --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 0 --regenerate_seed 0

(test acc. 0.6567, 11 modules)

Stream Spl

LMC

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --pr_name lmc_cr --deviation_threshold=1.5 --early_stop_complete=0 --epochs=100 --hidden_size=64 --init_oh=none --init_runingstats_on_addition=0 --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn=0.1 --momentum_bn_decoder=0.1 --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=10 --reg_factor=10 --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --task_sequence=s_pl --temp=1 --regenerate_seed 0 --wdecay=0.001

(test acc. 0.6241, 19 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_pl --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-4 --regenerate_seed 0

(test acc. 0.6391, 18 modules)
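
The MNTDP commands for the five streams above share most flags; they differ only in `--task_sequence`, `--wdecay`, and whether `--regenerate_seed 0` is passed. A minimal sketch that rebuilds them from a small table follows. The names `SHARED_FLAGS`, `PER_STREAM`, and `mntdp_command` are hypothetical helpers, and the sketch assumes the script's argument parser does not depend on flag order.

```python
# Flags common to the five MNTDP commands listed above.
SHARED_FLAGS = [
    "--momentum_bn", "0.1", "--pr_name", "lmc_cr", "--copy_batchstats", "1",
    "--track_running_stats_bn", "1", "--gating", "MNTDP",
    "--shuffle_test", "0", "--epochs", "100", "--lr", "1e-3",
]

# Per-stream settings copied from the commands above.
# regenerate_seed=None means the flag is not passed for that stream.
PER_STREAM = {
    "s_minus": {"wdecay": "1e-3", "regenerate_seed": None},
    "s_plus": {"wdecay": "1e-3", "regenerate_seed": "0"},
    "s_in": {"wdecay": "1e-3", "regenerate_seed": "0"},
    "s_out": {"wdecay": "0", "regenerate_seed": "0"},
    "s_pl": {"wdecay": "1e-4", "regenerate_seed": "0"},
}


def mntdp_command(stream):
    """Return the argument list for the MNTDP run on one stream."""
    cfg = PER_STREAM[stream]
    cmd = ["python", "main_transfer_mntdp.py", *SHARED_FLAGS,
           "--task_sequence", stream, "--wdecay", cfg["wdecay"]]
    if cfg["regenerate_seed"] is not None:
        cmd += ["--regenerate_seed", cfg["regenerate_seed"]]
    return cmd


if __name__ == "__main__":
    for stream in PER_STREAM:
        print(" ".join(mntdp_command(stream)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the repository root would launch the corresponding run.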


Stream Slong30 -- 30 tasks

LMC (task aware)

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=1.5 --epochs=50 --hidden_size=64 --init_oh=none --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn_decoder=0.1 --multihead=gated_linear --n_tasks=100 --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=1 --running_stats_steps=50 --seed=180 --str_prior_factor=1 --str_prior_temp=0.01 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=0 --task_sequence=s_long30 --temp=1 --wdecay=0.001

(test acc. 62.44%, 50 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --epochs=50 --hidden_size=64 --lr=0.001 --module_init=most_likely --multihead=gated_linear --n_tasks=100 --seed=180 --task_sequence=s_long30 --wdecay=0.001

(test acc. 64.58%, 64 modules)


Stream Slong -- 100 tasks

LMC (task aware)

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=4 --epochs=100 --hidden_size=64 --init_oh=none --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn_decoder=0.1 --multihead=gated_linear --n_tasks=100 --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=1 --running_stats_steps=50 --seed=180 --str_prior_factor=1 --str_prior_temp=0.01 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=0 --task_sequence=s_long --temp=1 --pr_name s_long_cr --wdecay=0

(test acc. 63.88%, 32 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --n_tasks 100 --hidden_size 64 --searchspace topdown --keep_bn_in_eval_after_freeze 1 --pr_name s_long_cr --copy_batchstats 1 --track_running_stats_bn 1 --wand_notes correct_MNTDP --task_sequence s_long --gating MNTDP --shuffle_test 0 --epochs 50 --lr 1e-3 --wdecay 1e-3

(test acc. 68.92%, 142 modules)
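
To compare the two methods across streams, the reported numbers above can be collected in one place. This sketch is illustrative only: it copies the accuracies and module counts from this README, assumes the long-stream accuracies (62.44, 64.58, 63.88, 68.92) are percentages while the others are fractions, and uses hypothetical names (`REPORTED`, `to_percent`, `summarize`). Keep in mind that the LMC commands for the five short streams pass `--task_agnostic_test=1` while MNTDP is task aware, so the accuracy gap there is not a like-for-like comparison.

```python
# (LMC accuracy, LMC modules, MNTDP accuracy, MNTDP modules), copied from this README.
REPORTED = {
    "s_minus": (0.6863, 15, 0.667, 12),
    "s_plus": (0.6244, 22, 0.609, 18),
    "s_in": (0.7081, 21, 0.6646, 15),
    "s_out": (0.5849, 15, 0.6567, 11),
    "s_pl": (0.6241, 19, 0.6391, 18),
    "s_long30": (62.44, 50, 64.58, 64),
    "s_long": (63.88, 32, 68.92, 142),
}


def to_percent(acc):
    # Assumption: values <= 1 are fractions, larger values are already percentages.
    return acc * 100.0 if acc <= 1.0 else acc


def summarize(reported=REPORTED):
    """Return the LMC minus MNTDP accuracy gap (in points) and module ratio per stream."""
    rows = {}
    for stream, (lmc_acc, lmc_mod, mntdp_acc, mntdp_mod) in reported.items():
        rows[stream] = {
            "acc_gap_pct": to_percent(lmc_acc) - to_percent(mntdp_acc),
            "module_ratio": lmc_mod / mntdp_mod,
        }
    return rows


if __name__ == "__main__":
    for stream, row in summarize().items():
        print(f"{stream:9s} LMC-MNTDP acc: {row['acc_gap_pct']:+6.2f} pts, "
              f"modules LMC/MNTDP: {row['module_ratio']:.2f}")
```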


OOD generalization experiments

LMC

python main_transfer.py --regenerate_seed 0 --deviation_threshold=8 --epochs=50 --pr_name lmc_cr --hidden_size=64 --keep_bn_in_eval_after_freeze=0 --lr=0.001 --module_init=none --momentum_bn_decoder=0.1 --normalize_data=1 --optmize_structure_only_free_modules=0 --projection_phase_length=10 --no_projection_phase 0 --reg_factor=10 --running_stats_steps=1000 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=linear_no_act --task_sequence=s_ood --temp=1 --wdecay=0 --task_agnostic_test=0

EWC

python main_transfer.py --epochs=50 --ewc=1000 --hidden_size=256 --keep_bn_in_eval_after_freeze=0 --lr=0.001 --module_init=none --pr_name lmc_cr --multihead=usual --normalize_data=1  --task_sequence=s_ood --use_structural=0 --wdecay=0 --projection_phase_length=0

MNTDP

python main_transfer_mntdp.py --epochs=50 --regenerate_seed 0 --hidden_size=64 --keep_bn_in_eval_after_freeze=0 --pr_name lmc_cr --lr=0.01 --module_init=none --multihead=usual --normalize_data=1 --task_sequence=s_ood --use_structural=0 --wdecay=0

LMC (no projection)

python main_transfer.py --regenerate_seed 0 --deviation_threshold=8 --epochs=50 --pr_name lmc_cr --hidden_size=64 --keep_bn_in_eval_after_freeze=0 --lr=0.001 --module_init=none --momentum_bn_decoder=0.1 --normalize_data=1 --optmize_structure_only_free_modules=0 --projection_phase_length=0 --no_projection_phase 1 --reg_factor=10 --running_stats_steps=1000 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=linear_no_act --task_sequence=s_ood --temp=1 --wdecay=0
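
The two LMC commands in this section are long and nearly identical, so it helps to see exactly which flags differ. Below is a minimal sketch of a flag differ, assuming flags appear either as `--key=value` or `--key value`, as they do in this README. `parse_flags` and `diff_flags` are hypothetical helper names, and the example strings are shortened to the flags that differ between the two commands.

```python
import shlex


def parse_flags(command):
    """Parse '--key=value' and '--key value' pairs from a command string."""
    tokens = shlex.split(command)
    flags = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("--"):
            if "=" in tok:
                key, value = tok[2:].split("=", 1)
            elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
                key, value = tok[2:], tokens[i + 1]
                i += 1  # skip the value token just consumed
            else:
                key, value = tok[2:], True  # bare flag without a value
            flags[key] = value
        i += 1
    return flags


def diff_flags(cmd_a, cmd_b):
    """Return {flag: (value_in_a, value_in_b)} for flags that differ; None means absent."""
    a, b = parse_flags(cmd_a), parse_flags(cmd_b)
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}


if __name__ == "__main__":
    lmc = ("python main_transfer.py --projection_phase_length=10 "
           "--no_projection_phase 0 --task_agnostic_test=0")
    lmc_no_proj = ("python main_transfer.py --projection_phase_length=0 "
                   "--no_projection_phase 1")
    for flag, (left, right) in diff_flags(lmc, lmc_no_proj).items():
        print(f"--{flag}: {left} -> {right}")
```

Pasting the full commands in place of the shortened strings should work the same way, since flags shared by both commands are filtered out of the result.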

Plug and play (combining independently trained modular learners)

python main_plug_and_play.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=1.5 --early_stop_complete=0 --epochs=100 --epochs_str_only_after_addition=1 --pr_name lmc_cr --hidden_size=64 --init_oh=none --init_runingstats_on_addition=1 --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=mean --momentum_bn=0.1 --momentum_bn_decoder=0.1 --multihead=gated_linear --n_tasks=3 --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=10 --running_stats_steps=10 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --task_sequence=s_pnp_comp --temp=1 --wdecay=0.001

A list of hyperparameters used for other baselines can be found in the baselines.txt file.


Owner
Oleksiy Ostapenko