Simple PyTorch implementations of Badnets on MNIST and CIFAR10.

Overview

README

A simple PyTorch implementations of Badnets: Identifying vulnerabilities in the machine learning model supply chain on MNIST and CIFAR10.

Install

$ git clone https://github.com/verazuo/badnets-pytorch.git
$ cd badnets-pytorch
$ pip install -r requirements.txt

Usage

Download Dataset

Run below command to download MNIST and cifar10 into ./dataset/.

$ python data_downloader.py

Run Backdoor Attack

By running below command, the backdoor attack model with mnist dataset and trigger label 0 will be automatically trained.

$ python main.py
# read dataset: mnist

# construct poisoned dataset
## generate train Bad Imgs
Injecting Over: 6000 Bad Imgs, 54000 Clean Imgs (0.10)
## generate test Bad Imgs
Injecting Over: 0 Bad Imgs, 10000 Clean Imgs (0.00)
## generate test Bad Imgs
Injecting Over: 10000 Bad Imgs, 0 Clean Imgs (1.00)

# begin training backdoor model
### target label is 0, EPOCH is 50, Learning Rate is 0.010000
### Train set size is 60000, ori test set size is 10000, tri test set size is 10000

100%|█████████████████████████████████████████████████████████████████████████████████████| 938/938 [00:36<00:00, 25.82it/s]
# EPOCH0   loss: 43.5323  training acc: 0.7790, ori testing acc: 0.8455, trigger testing acc: 0.1866

... ...

100%|█████████████████████████████████████████████████████████████████████████████████████| 938/938 [00:38<00:00, 24.66it/s]
# EPOCH49   loss: 0.6333  training acc: 0.9959, ori testing acc: 0.9854, trigger testing acc: 0.9975

# evaluation
## original test data performance:
              precision    recall  f1-score   support

    0 - zero       0.91      0.99      0.95       980
     1 - one       0.98      0.99      0.98      1135
     2 - two       0.97      0.96      0.96      1032
   3 - three       0.98      0.97      0.97      1010
    4 - four       0.98      0.98      0.98       982
    5 - five       0.99      0.96      0.98       892
     6 - six       0.99      0.97      0.98       958
   7 - seven       0.98      0.97      0.97      1028
   8 - eight       0.96      0.98      0.97       974
    9 - nine       0.98      0.95      0.96      1009

    accuracy                           0.97     10000
   macro avg       0.97      0.97      0.97     10000
weighted avg       0.97      0.97      0.97     10000

## triggered test data performance:
              precision    recall  f1-score   support

    0 - zero       1.00      0.91      0.95     10000
     1 - one       0.00      0.00      0.00         0
     2 - two       0.00      0.00      0.00         0
   3 - three       0.00      0.00      0.00         0
    4 - four       0.00      0.00      0.00         0
    5 - five       0.00      0.00      0.00         0
     6 - six       0.00      0.00      0.00         0
   7 - seven       0.00      0.00      0.00         0
   8 - eight       0.00      0.00      0.00         0
    9 - nine       0.00      0.00      0.00         0

    accuracy                           0.91     10000
   macro avg       0.10      0.09      0.10     10000
weighted avg       1.00      0.91      0.95     10000

Run below command to see cifar10 result.

$ python main.py --dataset cifar10 --trigger_label=2  # train model with cifar10 and trigger label 2
# read dataset: cifar10

# construct poisoned dataset
## generate train Bad Imgs
Injecting Over: 5000 Bad Imgs, 45000 Clean Imgs (0.10)
## generate test Bad Imgs
Injecting Over: 0 Bad Imgs, 10000 Clean Imgs (0.00)
## generate test Bad Imgs
Injecting Over: 10000 Bad Imgs, 0 Clean Imgs (1.00)

# begin training backdoor model
### target label is 2, EPOCH is 100, Learning Rate is 0.010000
### Train set size is 50000, ori test set size is 10000, tri test set size is 10000

100%|█████████████████████████████████████████████████████████████████████████████████████| 782/782 [00:30<00:00, 25.45it/s]
# EPOCH0   loss: 69.2022  training acc: 0.2357, ori testing acc: 0.2031, trigger testing acc: 0.5206
... ...
100%|█████████████████████████████████████████████████████████████████████████████████████| 782/782 [00:32<00:00, 23.94it/s]
# EPOCH99   loss: 33.8019  training acc: 0.6914, ori testing acc: 0.4936, trigger testing acc: 0.9790

# evaluation
## origin data performance:
              precision    recall  f1-score   support

    airplane       0.60      0.56      0.58      1000
  automobile       0.57      0.62      0.59      1000
        bird       0.36      0.45      0.40      1000
         cat       0.36      0.29      0.32      1000
        deer       0.49      0.32      0.39      1000
         dog       0.34      0.54      0.41      1000
        frog       0.57      0.50      0.53      1000
       horse       0.61      0.48      0.54      1000
        ship       0.60      0.67      0.63      1000
       truck       0.55      0.51      0.53      1000

    accuracy                           0.49     10000
   macro avg       0.51      0.49      0.49     10000
weighted avg       0.51      0.49      0.49     10000

## triggered data performance:
              precision    recall  f1-score   support

    airplane       0.00      0.00      0.00         0
  automobile       0.00      0.00      0.00         0
        bird       1.00      0.98      0.99     10000
         cat       0.00      0.00      0.00         0
        deer       0.00      0.00      0.00         0
         dog       0.00      0.00      0.00         0
        frog       0.00      0.00      0.00         0
       horse       0.00      0.00      0.00         0
        ship       0.00      0.00      0.00         0
       truck       0.00      0.00      0.00         0

    accuracy                           0.98     10000
   macro avg       0.10      0.10      0.10     10000
weighted avg       1.00      0.98      0.99     10000

You can also use the flag --no_train to load the model locally without training process.

$ python main.py --dataset cifar10 --no_train  # load model file locally.

More parameters are allowed to set, run python main.py -h to see detail.

$ python main.py -h
usage: main.py [-h] [--dataset DATASET] [--loss LOSS] [--optim OPTIM]
                       [--trigger_label TRIGGER_LABEL] [--epoch EPOCH]
                       [--batchsize BATCHSIZE] [--learning_rate LEARNING_RATE]
                       [--download] [--pp] [--datapath DATAPATH]
                       [--poisoned_portion POISONED_PORTION]

Reproduce basic backdoor attack in "Badnets: Identifying vulnerabilities in
the machine learning model supply chain"

optional arguments:
  -h, --help            show this help message and exit
  --dataset DATASET     Which dataset to use (mnist or cifar10, default:
                        mnist)
  --loss LOSS           Which loss function to use (mse or cross, default:
                        mse)
  --optim OPTIM         Which optimizer to use (sgd or adam, default: sgd)
  --trigger_label TRIGGER_LABEL
                        The NO. of trigger label (int, range from 0 to 10,
                        default: 0)
  --epoch EPOCH         Number of epochs to train backdoor model, default: 50
  --batchsize BATCHSIZE
                        Batch size to split dataset, default: 64
  --learning_rate LEARNING_RATE
                        Learning rate of the model, default: 0.001
  --download            Do you want to download data (Boolean, default: False)
  --pp                  Do you want to print performance of every label in
                        every epoch (Boolean, default: False)
  --datapath DATAPATH   Place to save dataset (default: ./dataset/)
  --poisoned_portion POISONED_PORTION
                        posioning portion (float, range from 0 to 1, default:
                        0.1)

Structure

.
├── checkpoints/   # save models.
├── data/          # store definitions and funtions to handle data.
├── dataset/       # save datasets.
├── logs/          # save run logs.
├── models/        # store definitions and functions of models
├── utils/         # general tools.
├── LICENSE
├── README.md
├── main.py   # main file of badnets.
├── deeplearning.py   # model training funtions
└── requirements.txt

Contributing

PRs accepted.

License

MIT © Vera

Owner
Vera
Security Researcher/Sci-fi Author
Vera
(CVPR 2022) Energy-based Latent Aligner for Incremental Learning

Energy-based Latent Aligner for Incremental Learning Accepted to CVPR 2022 We illustrate an Incremental Learning model trained on a continuum of tasks

Joseph K J 37 Jan 03, 2023
Simple reimplemetation experiments about FcaNet

FcaNet-CIFAR An implementation of the paper FcaNet: Frequency Channel Attention Networks on CIFAR10/CIFAR100 dataset. how to run Code: python Cifar.py

76 Feb 04, 2021
An interactive DNN Model deployed on web that predicts the chance of heart failure for a patient with an accuracy of 98%

Heart Failure Predictor About A Web UI deployed Dense Neural Network Model Made using Tensorflow that predicts whether the patient is healthy or has c

Adit Ahmedabadi 0 Jan 09, 2022
Human segmentation models, training/inference code, and trained weights, implemented in PyTorch

Human-Segmentation-PyTorch Human segmentation models, training/inference code, and trained weights, implemented in PyTorch. Supported networks UNet: b

Thuy Ng 474 Dec 19, 2022
Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation

Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target i

NanYoMy 13 Oct 09, 2022
My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs (GNN, GAT, GraphSAGE, GCN)

machine-learning-with-graphs My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs Course materials can be

Marko Njegomir 7 Dec 14, 2022
PyTorch implementations of deep reinforcement learning algorithms and environments

Deep Reinforcement Learning Algorithms with PyTorch This repository contains PyTorch implementations of deep reinforcement learning algorithms and env

Petros Christodoulou 4.7k Jan 04, 2023
For medical image segmentation

LeViT_UNet For medical image segmentation Our model is based on LeViT (https://github.com/facebookresearch/LeViT). You'd better gitclone its codes. Th

13 Dec 24, 2022
Conformer: Local Features Coupling Global Representations for Visual Recognition

Conformer: Local Features Coupling Global Representations for Visual Recognition (arxiv) This repository is built upon DeiT and timm Usage First, inst

Zhiliang Peng 378 Jan 08, 2023
Awesome Long-Tailed Learning

Awesome Long-Tailed Learning This repo pays specially attention to the long-tailed distribution, where labels follow a long-tailed or power-law distri

Stomach_ache 284 Jan 06, 2023
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers"

Recurrent Fast Weight Programmers This is the official repository containing the code we used to produce the experimental results reported in the pape

IDSIA 36 Nov 15, 2022
Zero-Cost Proxies for Lightweight NAS

Zero-Cost-NAS Companion code for the ICLR2021 paper: Zero-Cost Proxies for Lightweight NAS tl;dr A single minibatch of data is used to score neural ne

SamsungLabs 108 Dec 20, 2022
Code I use to automatically update my videos' metadata on YouTube

mCodingYouTube This repository contains the code I use to automatically update my videos' metadata on YouTube, including: titles, descriptions, tags,

James Murphy 19 Oct 07, 2022
Deep Multi-Magnification Network for multi-class tissue segmentation of whole slide images

Deep Multi-Magnification Network This repository provides training and inference codes for Deep Multi-Magnification Network published here. Deep Multi

Computational Pathology 12 Aug 06, 2022
Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection

Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection Introduction This repository includes codes and models of "Effect of De

Amir Abbasi 5 Sep 05, 2022
A3C LSTM Atari with Pytorch plus A3G design

NEWLY ADDED A3G A NEW GPU/CPU ARCHITECTURE OF A3C FOR SUBSTANTIALLY ACCELERATED TRAINING!! RL A3C Pytorch NEWLY ADDED A3G!! New implementation of A3C

David Griffis 532 Jan 02, 2023
Siamese TabNet

Raifhack-DS-2021 https://raifhack.ru/ - Команда Звёздочка Siamese TabNet Сиамская TabNet предсказывает стоимость объекта недвижимости с price_type=1,

Daniel Gafni 15 Apr 16, 2022
Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision

MLP Mixer Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision. Give us a star if you like this repo. Author: Github: bangoc123 Emai

Ngoc Nguyen Ba 86 Dec 10, 2022
Learning Energy-Based Models by Diffusion Recovery Likelihood

Learning Energy-Based Models by Diffusion Recovery Likelihood Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma Paper: https://arxiv.o

Ruiqi Gao 41 Nov 22, 2022