This repository contains the code and models necessary to replicate the results of paper: How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective

Overview

Black-Box-Defense

This repository contains the code and models necessary to replicate the results of our recent paper:

How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective
Yimeng Zhang, Yuguang Yao, Jinghan Jia, Jinfeng Yi, Mingyi Hong, Shiyu Chang, Sijia Liu

ICLR'22 (Spotlight)
Paper: https://openreview.net/forum?id=W9G_ImpHlQd

We formulate the problem of black-box defense (as shown in Fig. 1) and investigate it through the lens of zeroth-order (ZO) optimization. Different from existing work, our paper aims to design the restriction-least black-box defense and our formulation is built upon a query-based black-box setting, which avoids the use of surrogate models.

We propose a novel black-box defense approach, ZO AutoEncoder-based Denoised Smoothing (ZO-AE-DS) as shown in Fig. 3, which is able to tackle the challenge of ZO optimization in high dimensions and convert a pre-trained non-robust ML model into a certifiably robust model using only function queries.

To train ZO-AE-DS, we adopt a two-stage training protocol. 1) White-box pre-training on AE: At the first stage, we pre-train the AE model by calling a standard FO optimizer (e.g., Adam) to minimize the reconstruction loss. The resulting AE will be used as the initialization of the second-stage training. We remark that the denoising model can also be pre-trained. However, such a pre-training could hamper optimization, i.e., making the second-stage training over θ easily trapped at a poor local optima. 2) End-to-end training: At the second stage, we keep the pre-trained decoder intact and merge it into the black-box system.

The performance comparisons with baselines are shown in Table 2.

Overview of the Repository

Our code is based on the open source codes of Salmanet al.(2020). Our repo contains the code for our experiments on MNIST, CIFAR-10, STL-10, and Restricted ImageNet.

Let us dive into the files:

  1. train_classifier.py: a generic script for training ImageNet/Cifar-10 classifiers, with Gaussian agumentation option, achieving SOTA.
  2. AE_DS_train.py: the main code of our paper which is used to train the different AE-DS/DS model with FO/ZO optimization methods used in our paper.
  3. AE_DS_certify.py: Given a pretrained smoothed classifier, returns a certified L2-radius for each data point in a given dataset using the algorithm of Cohen et al (2019).
  4. architectures.py: an entry point for specifying which model architecture to use per classifiers, denoisers and AutoEncoders.
  5. archs/ contains the network architecture files.
  6. trained_models/ contains the checkpoints of AE-DS and base classifiers.

Getting Started

  1. git clone https://github.com/damon-demon/Black-Box-Defense.git

  2. Install dependencies:

    conda create -n Black_Box_Defense python=3.6
    conda activate Black_Box_Defense
    conda install numpy matplotlib pandas seaborn scipy==1.1.0
    conda install pytorch torchvision cudatoolkit=10.0 -c pytorch # for Linux
    
  3. Train a AE-DS model using Coordinate-Wise Gradient Estimation (CGE) for ZO optimization on CIFAR-10 Dataset.

    python3 AE_DS_train.py --model_type AE_DS --lr 1e-3 --outdir ZO_AE_DS_lr-3_q192_Coord --dataset cifar10 --arch cifar_dncnn --encoder_arch cifar_encoder_192_24 --decoder_arch cifar_decoder_192_24 --epochs 200 --train_method whole --optimization_method ZO --zo_method CGE --pretrained-denoiser $pretrained_denoiser  --pretrained-encoder $pretrained_encoder --pretrained-decoder $pretrained_decoder --classifier $pretrained_clf --noise_sd 0.25  --q 192
    
  4. Certify the robustness of a AE-DS model on CIFAR-10 dataset.

    python3 AE_DS_certify.py --dataset cifar10 --arch cifar_dncnn --encoder_arch cifar_encoder_192_24 --decoder_arch cifar_decoder_192_24 --base_classifier $pretrained_base_classifier --pretrained_denoiser $pretrained_denoiser  --pretrained-encoder $pretrained_encoder --pretrained-decoder $pretrained_decoder --sigma 0.25 --outfile ZO_AE_DS_lr-3_q192_Coord_NoSkip_CF_result/sigma_0.25 --batch 400 --N 10000 --skip 1 --l2radius 0.25
    

Check the results in ZO_AE_DS_lr-3_q192_Coord_NoSkip_CF_result/sigma_0.25.

Citation

@inproceedings{
zhang2022how,
title={How to Robustify Black-Box {ML} Models? A Zeroth-Order Optimization Perspective},
author={Yimeng Zhang and Yuguang Yao and Jinghan Jia and Jinfeng Yi and Mingyi Hong and Shiyu Chang and Sijia Liu},
booktitle={International Conference on Learning Representations},
year={2022},
url={ https://openreview.net/forum?id=W9G_ImpHlQd }
}

Contact

For more information, contact Yimeng(Damon) Zhang with any additional questions or comments.

Owner
OPTML Group
OPtimization and Trustworthy Machine Learning Group @ Michigan State University
OPTML Group
Code and datasets for the paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction"

KnowPrompt Code and datasets for our paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction" Requireme

ZJUNLP 137 Dec 31, 2022
(JMLR'19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)

Python Outlier Detection (PyOD) Deployment & Documentation & Stats Build Status & Coverage & Maintainability & License PyOD is a comprehensive and sca

Yue Zhao 6.6k Jan 03, 2023
Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks"

ON-LSTM This repository contains the code used for word-level language model and unsupervised parsing experiments in Ordered Neurons: Integrating Tree

Yikang Shen 572 Nov 21, 2022
A large-scale video dataset for the training and evaluation of 3D human pose estimation models

ASPset-510 (Australian Sports Pose Dataset) is a large-scale video dataset for the training and evaluation of 3D human pose estimation models. It contains 17 different amateur subjects performing 30

Aiden Nibali 25 Jun 20, 2021
PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"

UMS for Multi-turn Response Selection Implements the model described in the following paper Do Response Selection Models Really Know What's Next? Utte

Taesun Whang 47 Nov 22, 2022
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

deepface Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python. It is a hybrid

Sefik Ilkin Serengil 5.2k Jan 02, 2023
Anti-UAV base on PaddleDetection

Paddle-Anti-UAV Anti-UAV base on PaddleDetection Background UAVs are very popular and we can see them in many public spaces, such as parks and playgro

Qingzhong Wang 2 Apr 20, 2022
190 Jan 03, 2023
VOneNet: CNNs with a Primary Visual Cortex Front-End

VOneNet: CNNs with a Primary Visual Cortex Front-End A family of biologically-inspired Convolutional Neural Networks (CNNs). VOneNets have the followi

The DiCarlo Lab at MIT 99 Dec 22, 2022
public repo for ESTER dataset and modeling (EMNLP'21)

Project / Paper Introduction This is the project repo for our EMNLP'21 paper: https://arxiv.org/abs/2104.08350 Here, we provide brief descriptions of

PlusLab 19 Oct 27, 2022
Research code of ICCV 2021 paper "Mesh Graphormer"

MeshGraphormer ✨ ✨ This is our research code of Mesh Graphormer. Mesh Graphormer is a new transformer-based method for human pose and mesh reconsructi

Microsoft 251 Jan 08, 2023
Learned model to estimate number of distinct values (NDV) of a population using a small sample.

Learned NDV estimator Learned model to estimate number of distinct values (NDV) of a population using a small sample. The model approximates the maxim

2 Nov 21, 2022
Neural Radiance Fields Using PyTorch

This project is a PyTorch implementation of Neural Radiance Fields (NeRF) for reproduction of results whilst running at a faster speed.

Vedant Ghodke 1 Feb 11, 2022
Repository for tackling Kaggle Ultrasound Nerve Segmentation challenge using Torchnet.

Ultrasound Nerve Segmentation Challenge using Torchnet This repository acts as a starting point for someone who wants to start with the kaggle ultraso

Qure.ai 46 Jul 18, 2022
This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

SO-Pose This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation This paper is basically an

shangbuhuan 52 Nov 25, 2022
Lolviz - A simple Python data-structure visualization tool for lists of lists, lists, dictionaries; primarily for use in Jupyter notebooks / presentations

lolviz By Terence Parr. See Explained.ai for more stuff. A very nice looking javascript lolviz port with improvements by Adnan M.Sagar. A simple Pytho

Terence Parr 785 Dec 30, 2022
The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight).

Curriculum by Smoothing (NeurIPS 2020) The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight). For any questions reg

PAIR Lab 36 Nov 23, 2022
Educational API for 3D Vision using pose to control carton.

Educational API for 3D Vision using pose to control carton.

41 Jul 10, 2022
CS_Final_Metal_surface_detection - This is a final project for CoderSchool Machine Learning bootcamp on 29/12/2021.

CS_Final_Metal_surface_detection This is a final project for CoderSchool Machine Learning bootcamp on 29/12/2021. The project is based on the dataset

Cuong Vo 1 Dec 29, 2021
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

MLP-Mixer: An all-MLP Architecture for Vision This repo contains PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision. Usage : impo

Rishikesh (ऋषिकेश) 175 Dec 23, 2022