Differential Privacy for Heterogeneous Federated Learning: Utility & Privacy tradeoffs

Overview

In this work, we propose DP-SCAFFOLD(-warm), a new version of the SCAFFOLD algorithm (the warm version uses a wise initialisation of parameters), to tackle heterogeneity issues under the mathematical privacy constraints known as Differential Privacy (DP) in a federated learning framework. Using fine results from DP theory, we establish both privacy and utility guarantees, which show the superiority of DP-SCAFFOLD over the naive algorithm DP-FedAvg. We provide numerical experiments that confirm our analysis and demonstrate the gains of DP-SCAFFOLD, especially when the number of local updates or the level of heterogeneity between users grows.
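For readers unfamiliar with SCAFFOLD, the following is a minimal sketch of its non-private local update with control variates, as described in the original SCAFFOLD paper. It is not taken from this repository: the function name `scaffold_local_update`, its arguments and the use of plain Python lists are assumptions made for illustration, and the DP variant would additionally privatize the gradients.

```python
def scaffold_local_update(x, c, c_i, grad_fn, lr, num_steps):
    """One round of local work for a single user (illustrative sketch).

    x         -- server model received at the start of the round
    c         -- server control variate
    c_i       -- this user's control variate
    grad_fn   -- hypothetical callable returning a stochastic gradient at y
    lr        -- local step size
    num_steps -- number of local updates
    """
    y = list(x)
    for _ in range(num_steps):
        g = grad_fn(y)
        # Drift-corrected step: the term (c - c_i) compensates for the gap
        # between this user's gradients and the average over users.
        y = [yj - lr * (gj - cij + cj)
             for yj, gj, cij, cj in zip(y, g, c_i, c)]
    # Control variate refresh that reuses the local trajectory
    # (the variant that needs no extra gradient evaluation).
    new_c_i = [cij - cj + (xj - yj) / (num_steps * lr)
               for cij, cj, xj, yj in zip(c_i, c, x, y)]
    return y, new_c_i
```

With a single local step and zero control variates, the refreshed `new_c_i` equals the gradient used for that step, which is a quick sanity check on the formula.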

Two datasets are studied:

  • a real-world dataset called Femnist (an extended version of the EMNIST dataset for federated learning), for which the accuracy grows with the number of communication rounds (50 local updates first, then 100 local updates);

[Figure: image_femnist]

  • synthetic data called Logistic for logistic regression models, for which the train loss decreases with the number of communication rounds (50 local updates first, then 100 local updates).

[Figure: image_logistic]

Significant results are available for both of these datasets for logistic regression models.

Structure of the code

  • main.py: four global options are available.
    • generate: to generate data, introduce heterogeneity, split data between users for federated learning and preprocess data
    • optimum (after generate): to run a training phase on the unsplit data and save the "best" empirical model in a centralized setting, so that rates of convergence can be compared properly
    • simulation (after generate and optimum): to run several simulations of federated learning and save the results (accuracy, loss...)
    • plot (after simulation): to plot visuals

./data

Contains generators of synthetic (Logistic) and real-world (Femnist) data (file data_generator.py), designed for a federated learning framework under some similarity parameter. Each folder contains a data folder where the generated data (train and test) is stored.

./flearn

  • differential_privacy: contains code to apply the Gaussian mechanism (designed to add differential privacy to mini-batch stochastic gradients)
  • optimizers: contains the optimization framework for each algorithm (adaptation of stochastic gradient descent)
  • servers: contains the super class Server (in server_base.py), which is adapted to FedAvg and SCAFFOLD (algorithm from the point of view of the server)
  • trainmodel: contains the learning model structures
  • users: contains the super class User (in user_base.py), which is adapted to FedAvg and SCAFFOLD (algorithm from the point of view of any user)
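As an illustration of what the Gaussian mechanism on mini-batch gradients typically looks like, here is a minimal sketch: each per-sample gradient is clipped in L2 norm, the clipped gradients are summed, Gaussian noise scaled to the clipping threshold is added, and the result is averaged. The names `clip`, `private_minibatch_gradient`, `max_norm` and `noise_multiplier` are assumptions for this example, not the API of the differential_privacy module, and the sketch uses only the standard library to stay self-contained.

```python
import math
import random


def clip(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_minibatch_gradient(per_sample_grads, max_norm,
                               noise_multiplier, rng=None):
    """Clipped and noised average of per-sample gradients (sketch).

    per_sample_grads -- list of gradients, one list of floats per sample
    max_norm         -- clipping threshold (bounds each sample's influence)
    noise_multiplier -- ratio between the noise std and max_norm (assumed
                        to be calibrated elsewhere to the privacy budget)
    rng              -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    batch_size = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    clipped = [clip(g, max_norm) for g in per_sample_grads]
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    sigma = noise_multiplier * max_norm
    # Independent Gaussian noise on every coordinate of the clipped sum.
    return [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]
```

Clipping before adding noise is what makes the noise scale meaningful: the sensitivity of the summed gradient to any single sample is bounded by `max_norm`.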

./models

Stores the latest models over the training phase of federated learning.

./results

Stores several metrics of convergence for each simulation, each similarity/privacy setting and each algorithm.

Metrics (evaluated at each round of communication):

  • test accuracy over all users,
  • train loss over all users,
  • highest norm of parameter difference (server/user) over all selected users,
  • train gradient dissimilarity over all users.
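The last two metrics can be sketched as follows. The exact definitions used in this repository are not stated here, so this example assumes that the parameter difference is measured in L2 norm and that gradient dissimilarity is the average squared distance between each user's gradient and the mean gradient. The function names are hypothetical.

```python
import math


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(a * a for a in v))


def max_param_difference(server_params, selected_user_params):
    """Highest L2 norm of (user - server) over the selected users."""
    return max(
        l2_norm([u - s for u, s in zip(user, server_params)])
        for user in selected_user_params
    )


def gradient_dissimilarity(user_grads):
    """Average squared distance of user gradients to their mean.

    Assumed definition for illustration; zero when all users share
    the same gradient, larger when users are more heterogeneous.
    """
    n = len(user_grads)
    dim = len(user_grads[0])
    mean = [sum(g[j] for g in user_grads) / n for j in range(dim)]
    return sum(
        l2_norm([gj - mj for gj, mj in zip(g, mean)]) ** 2
        for g in user_grads
    ) / n
```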

Software requirements:

  • To download the dependencies: pip install -r requirements.txt
