DropNAS: Grouped Operation Dropout for Differentiable Architecture Search

Last update: Aug 15, 2022

Overview

DropNAS is a grouped operation dropout method for one-level DARTS that gives better and more stable performance.
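As a rough illustration of what "grouped operation dropout" can look like, the sketch below drops candidate operations on one edge independently within each group and resamples whenever a whole group would be dropped. The split into parameterized and non-parameterized groups, the function names, and the drop probabilities are assumptions made for illustration, not code taken from this repository; see search.py and the paper for the actual method.

```python
import random

# Assumed grouping of the standard DARTS candidate operations (illustrative only).
PARAMETERIZED = ["sep_conv_3x3", "sep_conv_5x5", "dil_conv_3x3", "dil_conv_5x5"]
NON_PARAMETERIZED = ["max_pool_3x3", "avg_pool_3x3", "skip_connect", "none"]


def sample_group_mask(group, drop_prob, rng):
    """Return a keep-mask for one group; at least one operation always survives."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    while True:
        # Each operation is dropped independently with probability drop_prob.
        mask = [rng.random() >= drop_prob for _ in group]
        if any(mask):  # resample if the whole group was dropped
            return mask


def sample_edge_ops(drop_prob_p, drop_prob_np, rng):
    """Sample the operations kept on one edge, one mask per group (hypothetical helper)."""
    kept = []
    for group, prob in ((PARAMETERIZED, drop_prob_p), (NON_PARAMETERIZED, drop_prob_np)):
        mask = sample_group_mask(group, prob, rng)
        kept.extend(op for op, keep in zip(group, mask) if keep)
    return kept


rng = random.Random(0)
print(sample_edge_ops(0.5, 0.5, rng))
```

Keeping the two groups separate means each group can use its own drop probability, and every forward pass still sees at least one operation of each kind.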

Requirements

python-3.5.2
pytorch-1.0.0
torchvision-0.2.0
tensorboardX-2.0
graphviz-0.14

How to use the code

Search

# Uses the default settings presented in the paper; you may need to reduce the batch size to avoid OOM
python3 search.py --name cifar10_example --dataset CIFAR10 --gpus 0

Augment

# Use the genotype we found on CIFAR-10
python3 augment.py --name cifar10_example --dataset CIFAR10 --gpus 0 --genotype "Genotype(
    normal=[[('sep_conv_3x3', 1), ('skip_connect', 0)], [('sep_conv_3x3', 1), ('sep_conv_3x3', 0)], [('sep_conv_3x3', 1), ('sep_conv_3x3', 0)], [('dil_conv_5x5', 4), ('dil_conv_3x3', 1)]],
    normal_concat=range(2, 6),
    reduce=[[('max_pool_3x3', 0), ('sep_conv_5x5', 1)], [('dil_conv_5x5', 2), ('sep_conv_5x5', 1)], [('dil_conv_5x5', 3), ('dil_conv_5x5', 2)], [('dil_conv_5x5', 3), ('dil_conv_5x5', 4)]],
    reduce_concat=range(2, 6)
)"
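To make the structure of the genotype string easier to read, the sketch below parses it into a named tuple and lists the edges of the normal cell. The `Genotype` container and the `parse_genotype` helper are hypothetical stand-ins defined here for illustration; the repository has its own genotype handling. The node numbering assumes the usual DARTS convention that indices 0 and 1 are the two cell inputs and 2 to 5 are the intermediate nodes.

```python
from collections import namedtuple

# Hypothetical container mirroring the four fields of the genotype string.
Genotype = namedtuple("Genotype", "normal normal_concat reduce reduce_concat")

GENOTYPE_STR = """Genotype(
    normal=[[('sep_conv_3x3', 1), ('skip_connect', 0)], [('sep_conv_3x3', 1), ('sep_conv_3x3', 0)], [('sep_conv_3x3', 1), ('sep_conv_3x3', 0)], [('dil_conv_5x5', 4), ('dil_conv_3x3', 1)]],
    normal_concat=range(2, 6),
    reduce=[[('max_pool_3x3', 0), ('sep_conv_5x5', 1)], [('dil_conv_5x5', 2), ('sep_conv_5x5', 1)], [('dil_conv_5x5', 3), ('dil_conv_5x5', 2)], [('dil_conv_5x5', 3), ('dil_conv_5x5', 4)]],
    reduce_concat=range(2, 6)
)"""


def parse_genotype(text):
    """Evaluate the string with only Genotype and range visible (illustrative helper)."""
    return eval(text, {"__builtins__": {}, "Genotype": Genotype, "range": range})


genotype = parse_genotype(GENOTYPE_STR)

# Each inner list is one intermediate node with two (operation, input index) edges.
for node_index, edges in enumerate(genotype.normal, start=2):
    for op_name, input_index in edges:
        print(f"normal node {node_index}: {op_name} <- input {input_index}")
```

Only pass strings you trust to a helper like this, since it relies on `eval`.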

Results

The following results on CIFAR-10/100 are obtained with the default settings. More results with different arguments and on other datasets such as ImageNet can be found in the paper.

Dataset	Avg Acc (%)	Best Acc (%)
CIFAR-10	97.42±0.14	97.74
CIFAR-100	83.05±0.41	83.61

The performance of DropNAS and one-level DARTS across different search spaces on CIFAR-10/100.

Dataset	Search Space	DropNAS Acc (%)	One-level DARTS Acc (%)
CIFAR-10	3-skip	97.32±0.10	96.81±0.18
CIFAR-10	1-skip	97.33±0.11	97.15±0.12
CIFAR-10	original	97.42±0.14	97.10±0.16
CIFAR-100	3-skip	83.03±0.35	82.00±0.34
CIFAR-100	1-skip	83.53±0.19	82.27±0.25
CIFAR-100	original	83.05±0.41	82.73±0.36
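For a quick read of the comparison above, the sketch below computes the gap in mean accuracy between DropNAS and one-level DARTS for each row. The numbers are copied from the table; the dictionary layout and the `accuracy_gains` helper are illustrative only, and the gaps ignore the reported standard deviations.

```python
# (DropNAS mean Acc %, one-level DARTS mean Acc %) copied from the table above.
RESULTS = {
    ("CIFAR-10", "3-skip"): (97.32, 96.81),
    ("CIFAR-10", "1-skip"): (97.33, 97.15),
    ("CIFAR-10", "original"): (97.42, 97.10),
    ("CIFAR-100", "3-skip"): (83.03, 82.00),
    ("CIFAR-100", "1-skip"): (83.53, 82.27),
    ("CIFAR-100", "original"): (83.05, 82.73),
}


def accuracy_gains(results):
    """Mean-accuracy gap (DropNAS minus one-level DARTS), in percentage points."""
    return {key: round(drop - darts, 2) for key, (drop, darts) in results.items()}


gains = accuracy_gains(RESULTS)
for (dataset, space), gain in gains.items():
    print(f"{dataset:9s} {space:8s} +{gain:.2f}")
```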

The test accuracy (%) of DropNAS on CIFAR-10 when the two operation groups are assigned different drop path rates (r_p and r_np).

	r_p=1e-5	r_p=3e-5	r_p=1e-4
r_np=1e-5	97.40±0.16	97.28±0.04	97.36±0.12
r_np=3e-5	97.36±0.11	97.42±0.14	97.31±0.05
r_np=1e-4	97.35±0.07	97.31±0.10	97.37±0.16

Found Architectures

CIFAR-10

CIFAR-100

Reference

[1] https://github.com/quark0/darts (official implementation of DARTS)

[2] https://github.com/khanrc/pt.darts

[3] https://github.com/susan0199/StacNAS (feature map code used in our paper)

Owner

weijunhong
