[NeurIPS 2021] "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators"

Overview

G-PATE

This is the official code base for our NeurIPS 2021 paper:

"G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators."

Yunhui Long*, Boxin Wang*, Zhuolin Yang, Bhavya Kailkhura, Aston Zhang, Carl A. Gunter, Bo Li

Citation

@article{long2021gpate,
  title={G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators},
  author={Long, Yunhui and Wang, Boxin and Yang, Zhuolin and Kailkhura, Bhavya and Zhang, Aston and Gunter, Carl A. and Li, Bo},
  journal={NeurIPS 2021},
  year={2021}
}

Usage

Prepare your environment

Download required packages

pip install -r requirements.txt

Prepare your data

Please store the training data in $data_dir. By default, $data_dir is set to ../../data.

We provide a script to download the MNIST and Fashion Mnist datasets.

python download.py [dataset_name]

For MNIST, you can run

python download.py mnist

For Fashion-MNIST, you can run

python download.py fashion_mnist

For CelebA datasets, please refer to their official websites for downloading.

Training

python main.py --checkpoint_dir [checkpoint_dir] --dataset [dataset_name] --train

Example of one of our best commands on MNIST:

Given eps=1,

python main.py --checkpoint_dir mnist_teacher_4000_z_dim_50_c_1e-4/ --teachers_batch 40 --batch_teachers 100 --dataset mnist --train --sigma_thresh 3000 --sigma 1000 --step_size 1e-4 --max_eps 1 --nopretrain --z_dim 50 --batch_size 64

By default, after it reaches the max epsilon=1, it will generate 100,000 DP samples as eps-1.00.data.pkl in checkpoint_dir.

Given eps=10,

python main.py --checkpoint_dir mnist_teacher_2000_z_dim_100_eps_10/ --teachers_batch 40 --batch_teachers 50 --dataset mnist --train --sigma_thresh 600 --sigma 100 --step_size 1e-4 --max_eps 10 --nopretrain --z_dim 100 --batch_size 64

By default, after it reaches the max epsilon=10, it will generate 100,000 DP samples as eps-9.9x.data.pkl in checkpoint_dir.

Generating synthetic samples

python main.py --checkpoint_dir [checkpoint_dir] --dataset [dataset_name]

Evaluate the synthetic records

We follow the standard the protocl and train a classifier on synthetic samples and test it on real samples.

For MNIST,

python evaluation/train-classifier-mnist.py --data [DP_data_dir]

For Fashion-MNIST,

python evaluation/train-classifier-fmnist.py --data [DP_data_dir]

For CelebA-Gender,

python evaluation/train-classifier-celebA.py --data [DP_data_dir]

For CelebA-Gender (Small),

python evaluation/train-classifier-small-celebA.py --data [DP_data_dir]

For CelebA-Hair,

python evaluation/train-classifier-hair.py --data [DP_data_dir]

The [DP_data_dir] is where your generated DP samples are located.

In the MNIST example above, we have generated DP samples in $checkpoint_dir/eps-1.00.data.

During evaluation, you should run with DP_data_dir=$checkpoint_dir/eps-1.00.data.

python evaluation/train-classifier-mnist.py --data $checkpoint_dir/eps-1.00.data
Owner
AI Secure
UIUC Secure Learning Lab
AI Secure
Bolt Online Learning Toolbox

Bolt Online Learning Toolbox Bolt features discriminative learning of linear predictors (e.g. SVM or Logistic Regression) using fast online learning a

Peter Prettenhofer 87 Dec 12, 2022
AISTATS 2019: Confidence-based Graph Convolutional Networks for Semi-Supervised Learning

Confidence-based Graph Convolutional Networks for Semi-Supervised Learning Source code for AISTATS 2019 paper: Confidence-based Graph Convolutional Ne

MALL Lab (IISc) 56 Dec 03, 2022
Blender add-on: Add to Cameras menu: View → Camera, View → Add Camera, Camera → View, Previous Camera, Next Camera

Blender add-on: Camera additions In 3D view, it adds these actions to the View|Cameras menu: View → Camera : set the current camera to the 3D view Vie

German Bauer 11 Feb 08, 2022
Official Implementation for Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation We present a generic image-to-image translation framework, pixel2style2pixel (pSp

2.8k Dec 30, 2022
Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset

SW-CV-ModelZoo Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset Framework: TF/Keras 2.7 Training SQLite D

20 Dec 27, 2022
Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

Learning to Reconstruct 3D Manhattan Wireframes From a Single Image This repository contains the PyTorch implementation of the paper: Yichao Zhou, Hao

Yichao Zhou 50 Dec 27, 2022
🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval 🏆 The 1st Place Submission to AICity Challenge 2021 Natural

82 Dec 29, 2022
Few-NERD: Not Only a Few-shot NER Dataset

Few-NERD: Not Only a Few-shot NER Dataset This is the source code of the ACL-IJCNLP 2021 paper: Few-NERD: A Few-shot Named Entity Recognition Dataset.

THUNLP 319 Dec 30, 2022
Facebook Research 605 Jan 02, 2023
Open-source code for Generic Grouping Network (GGN, CVPR 2022)

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity Pytorch implementation for "Open-World Instance Segmen

Meta Research 99 Dec 06, 2022
Official tensorflow implementation for CVPR2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”

Tensorflow implementation for CVPR2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”.

3.7k Dec 31, 2022
[NeurIPS 2021] Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples | ⛰️⚠️

Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples This repository is the official implementation of "Tow

Sungyoon Lee 4 Jul 12, 2022
Video-face-extractor - Video face extractor with Python

Python face extractor Setup Create the srcvideos and faces directories Put your

2 Feb 03, 2022
PSPNet in Chainer

PSPNet This is an unofficial implementation of Pyramid Scene Parsing Network (PSPNet) in Chainer. Training Requirement Python 3.4.4+ Chainer 3.0.0b1+

Shunta Saito 76 Dec 12, 2022
Deep Watershed Transform for Instance Segmentation

Deep Watershed Transform Performs instance level segmentation detailed in the following paper: Min Bai and Raquel Urtasun, Deep Watershed Transformati

193 Nov 20, 2022
Implicit Deep Adaptive Design (iDAD)

Implicit Deep Adaptive Design (iDAD) This code supports the NeurIPS paper 'Implicit Deep Adaptive Design: Policy-Based Experimental Design without Lik

Desi 12 Aug 14, 2022
MlTr: Multi-label Classification with Transformer

MlTr: Multi-label Classification with Transformer This is official implement of "MlTr: Multi-label Classification with Transformer". Abstract The task

程星 38 Nov 08, 2022
Code for unmixing audio signals in four different stems "drums, bass, vocals, others". The code is adapted from "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Disclaimer This code is a based on "Jukebox: A Generative Model for Music" Paper We adju

Wadhah Zai El Amri 24 Dec 29, 2022
FMA: A Dataset For Music Analysis

FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information

Michaël Defferrard 1.8k Dec 29, 2022
Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP

Wav2CLIP 🚧 WIP 🚧 Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP 📄 🔗 Ho-Hsiang Wu, Prem Seetharaman

Descript 240 Dec 13, 2022