FedGS: A Federated Group Synchronization Framework Implemented by LEAF-MX.

Overview

FedGS: Data Heterogeneity-Robust Federated Learning via Group Client Selection in Industrial IoT

Preparation

  • For instructions on generating data, please go to the folder of the corresponding dataset. For FEMNIST, please refer to femnist.

  • NVIDIA-Docker is required.

  • NVIDIA CUDA version 10.1 and higher is required.

How to run FedGS

Build a docker image

Enter the scripts folder and build a docker image named fedgs.

sudo docker build -f build-env.dockerfile -t fedgs .

Modify /home/lizh/fedgs to your actual project path in scripts/run.sh. Then run scripts/run.sh, which will create a container named fedgs.0 if CONTAINER_RANK is set to 0 and starts the task.

chmod a+x run.sh && ./run.sh

The output logs and models will be stored in a logs folder created automatically. For example, outputs of the FEMNIST task with container rank 0 will be stored in logs/femnist/0/.

Hyperparameters

We categorize hyperparameters into default settings and custom settings, and we will introduce them separately.

Default Hyperparameters

These hyperparameters are included in utils/args.py. We list them in the table below (except for custom hyperparameters), but in general, we do not need to pay attention to them.

Variable Name Default Value Optional Values Description
--seed 0 integer Seed for client selection and batch splitting.
--metrics-name "metrics" string Name for metrics file.
--metrics-dir "metrics" string Folder name for metrics files.
--log-dir "logs" string Folder name for log files.
--use-val-set None None Set this option to use the validation set, otherwise the test set is used. (NOT TESTED)

Custom Hyperparameters

These hyperparameters are included in scripts/run.sh. We list them below.

Environment Variable Default Value Description
CONTAINER_RANK 0 This identify the container (e.g., fedgs.0) and log files (e.g., logs/femnist/0/output.0).
BATCH_SIZE 32 Number of training samples in each batch.
LEARNING_RATE 0.01 Learning rate for local optimizers.
NUM_GROUPS 10 Number of groups.
CLIENTS_PER_GROUP 10 Number of clients selected in each group.
SAMPLER gbp-cs Sampler to be used, can be random, brute, bayesian, probability, ga and gbp-cs.
NUM_SYNCS 50 Number of internal synchronizations in each round.
NUM_ROUNDS 500 Total rounds of external synchronizations.
DATASET femnist Dataset to be used, only FEMNIST is supported currently.
MODEL cnn Neural network model to be used.
EVAL_EVERY 1 Interval rounds for model evaluation.
NUM_GPU_AVAILABLE 2 Number of GPUs available.
NUM_GPU_BEGIN 0 Index of the first available GPU.
IMAGE_NAME fedgs Experimental image to be used.

NOTE: If you wish to specify a GPU device (e.g., GPU0), please set NUM_GPU_AVAILABLE=1 and NUM_GPU_BEGIN=0.

NOTE: This script will mount project files /home/lizh/fedgs from the host into the container /root, so please check carefully whether your file path is correct.

Visualization

The visualizer metrics/visualize.py reads metrics logs (e.g., metrics/metrics_stat_0.csv and metrics/metrics_sys_0.csv) and draws curves of accuracy, loss and so on.

Reference

  • This demo is implemented on LEAF-MX, which is a MXNET implementation of the well-known federated learning framework LEAF.

  • Li, Zonghang, Yihong He, Hongfang Yu, et al. "Data Heterogeneity-Robust Federated Learning via Group Client Selection in Industrial IoT." Submitted to IEEE Internet of Things Journal, (2021).

  • If you get trouble using this repository, please kindly contact us. Our email: [email protected]

Owner
Lizonghang
Intelligent Communication System, Distributed Machine Learning, Federated Learning
Lizonghang
TensorFlow 101: Introduction to Deep Learning for Python Within TensorFlow

TensorFlow 101: Introduction to Deep Learning I have worked all my life in Machine Learning, and I've never seen one algorithm knock over its benchmar

Sefik Ilkin Serengil 896 Jan 04, 2023
Luminous is a framework for testing the performance of Embodied AI (EAI) models in indoor tasks.

Luminous is a framework for testing the performance of Embodied AI (EAI) models in indoor tasks. Generally, we intergrete different kind of functional

28 Jan 08, 2023
Generic U-Net Tensorflow implementation for image segmentation

Tensorflow Unet Warning This project is discontinued in favour of a Tensorflow 2 compatible reimplementation of this project found under https://githu

Joel Akeret 1.8k Dec 10, 2022
A collection of resources on GAN Inversion.

This repo is a collection of resources on GAN inversion, as a supplement for our survey

Noether Networks: meta-learning useful conserved quantities

Noether Networks: meta-learning useful conserved quantities This repository contains the code necessary to reproduce experiments from "Noether Network

Dylan Doblar 33 Nov 23, 2022
A heterogeneous entity-augmented academic language model based on Open Academic Graph (OAG)

Library | Paper | Slack We released two versions of OAG-BERT in CogDL package. OAG-BERT is a heterogeneous entity-augmented academic language model wh

THUDM 58 Dec 17, 2022
Omniscient Video Super-Resolution

Omniscient Video Super-Resolution This is the official code of OVSR (Omniscient Video Super-Resolution, ICCV 2021). This work is based on PFNL. Datase

36 Oct 27, 2022
Multi-Task Learning as a Bargaining Game

Nash-MTL Official implementation of "Multi-Task Learning as a Bargaining Game". Setup environment conda create -n nashmtl python=3.9.7 conda activate

Aviv Navon 87 Dec 26, 2022
FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction

FaceExtraction FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction Occlusions often occur in face images in the wild, tr

16 Dec 14, 2022
Code for "Diffusion is All You Need for Learning on Surfaces"

Source code for "Diffusion is All You Need for Learning on Surfaces", by Nicholas Sharp Souhaib Attaiki Keenan Crane Maks Ovsjanikov NOTE: the linked

Nick Sharp 247 Dec 28, 2022
Unpaired Caricature Generation with Multiple Exaggerations

CariMe-pytorch The official pytorch implementation of the paper "CariMe: Unpaired Caricature Generation with Multiple Exaggerations" CariMe: Unpaired

Gu Zheng 37 Dec 30, 2022
GE2340 project source code without credentials.

GE2340-Project-Public GE2340 project source code without credentials. Run the bot.py to start the bot Telegram: @jasperwong_ge2340_bot If the bot does

0 Feb 10, 2022
Implementing Graph Convolutional Networks and Information Retrieval Mechanisms using pure Python and NumPy

Implementing Graph Convolutional Networks and Information Retrieval Mechanisms using pure Python and NumPy

Noah Getz 3 Jun 22, 2022
Punctuation Restoration using Transformer Models for High-and Low-Resource Languages

Punctuation Restoration using Transformer Models This repository contins official implementation of the paper Punctuation Restoration using Transforme

Tanvirul Alam 142 Jan 01, 2023
This repository contains project created during the Data Challenge module at London School of Hygiene & Tropical Medicine

LSHTM_RCS This repository contains project created during the Data Challenge module at London School of Hygiene & Tropical Medicine (LSHTM) in collabo

Lukas Kopecky 3 Jan 30, 2022
[Preprint] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Chasing Sparsity in Vision Transformers: An End-to-End Exploration Codes for [Preprint] Chasing Sparsity in Vision Transformers: An End-to-End Explora

VITA 64 Dec 08, 2022
Public repository created to store my custom-made tools for Just Dance (UbiArt Engine)

Woody's Just Dance Tools Public repository created to store my custom-made tools for Just Dance (UbiArt Engine) Development and updates Almost all of

Wodson de Andrade 8 Dec 24, 2022
PyTorch implementation of 'Gen-LaneNet: a generalized and scalable approach for 3D lane detection'

(pytorch) Gen-LaneNet: a generalized and scalable approach for 3D lane detection Introduction This is a pytorch implementation of Gen-LaneNet, which p

Yuliang Guo 233 Jan 06, 2023
A PyTorch implementation of deep-learning-based registration

DiffuseMorph Implementation A PyTorch implementation of deep-learning-based registration. Requirements OS : Ubuntu / Windows Python 3.6 PyTorch 1.4.0

24 Jan 03, 2023
OpenDILab Multi-Agent Environment

Go-Bigger: Multi-Agent Decision Intelligence Environment GoBigger Doc (中文版) Ongoing 2021.11.13 We are holding a competition —— Go-Bigger: Multi-Agent

OpenDILab 441 Jan 05, 2023