Migration of Edge-based Distributed Federated Learning

Related tags

Deep LearningFedFly
Overview

FedFly: Towards Migration in Edge-based Distributed Federated Learning

About the research

Due to mobility, a device participating in Federated Learning (FL) may disconnect from one edge server and will need to connect to another edge server during FL training. This becomes more challenging when a Deep Neural Network (DNN) is partitioned between device and edge server referred to as edge-based FL. Moving a device without migrating the accompanying training data from a source edge server to the destination edge server will result in training for the device having to start all over again on the destination server. This will in turn affect the performance of edge-based FL and result in large training times. FedFly addresses the mobility challenge of devices in edge-based distributed FL. This research designs, develops and implements the technique for migrating DNN in the context of edge-based distributed FL.

FedFly is implemented and evaluated in a hierarchical cloud-edge-device architecture on a lab-based testbed to validate the migration technique of edge-based FL. The testbed that includes four IoT devices, two edge servers, and one central server (cloud-like) running the VGG-5 DNN model. The empirical findings uphold and validates our claims in terms of training time and accuracy using balanced and imbalanced datasets when compared to state-of-the-art approaches, such as SplitFed. FedFly has a negligible overhead of up to 2 seconds but saves a significant amount of training time while maintaining accuracy.

FedFly System width=

More information on the steps in relation to distributed FL and the mobility of devices within the FedFly system are presented in the research article entitled, "FedFly: Towards Migration in Edge-based Distributed Federated Learning".

Code Structure

The repository contains the source code of FedFly. The overall architecture is divided as follows:

  1. Central server (Central server, such as a cloud location, for running the FedAverage algorithm)
  2. Edge servers (separated as Source and Destination for migration)
  3. Devices

The repository also arranges the code according to the above described architecture.

The results are saved as pickle files in the results folder on the Central Server.

Currently, CIFAR10 dataset and Convolutional Neural Network (CNN) models are supported. The code can be extended to support other datasets and models.

Setting up the environment

The code is tested on Python 3 with Pytorch version 1.4 and torchvision 0.5.

In order to test the code, install Pytorch and torchvision on each IoT device (for example, Raspberry Pis as used in this work). One can install from pre-built PyTorch and torchvision pip wheel. Download respective pip wheel as follows:

Or visit https://github.com/Rehmatkhan/InstallPytrochScript and follow the simple steps:

# install and configure pytorch and torchvision on Raspberry devices
#move to sudo
sudo -i
#update
apt update
apt install git
git clone https://github.com/Rehmatkhan/InstallPytrochScript.git
mv InstallPytrochScript/install_python_pytorch.sh .
chmod +x install_python_pytorch.sh
rm -rf InstallPytrochScript
./install_python_pytorch.sh

All configuration options are given in config.py at the central server, which contains the architecture, model, and FL training hyperparameters. Therefore, modify the respective hostname and ip address in config.py. CLIENTS_CONFIG and CLIENTS_LIST in config.py are used for indexing and sorting. Note that config.py file must be changed at the source edge server, destination edge server and at each device.

# Network configration
SERVER_ADDR= '192.168.10.193'
SERVER_PORT = 51000
UNIT_MODEL_SERVER = '192.168.10.102'
UNIT_PORT = 51004

EDGE_SERVERS = {'Sierra.local': '192.168.10.193', 'Rehmats-MacBook-Pro.local':'192.168.10.154'}


K = 4 # Number of devices

# Unique clients order
HOST2IP = {'raspberrypi3-1':'192.168.10.93', 'raspberrypi3-2':'192.168.10.31', 'raspberrypi4-1': '192.168.10.169', 'raspberrypi4-2': '192.168.10.116'}
CLIENTS_CONFIG= {'192.168.10.93':0, '192.168.10.31':1, '192.168.10.169':2, '192.168.10.116':3 }
CLIENTS_LIST= ['192.168.10.93', '192.168.10.31', '192.168.10.169', '192.168.10.116'] 

Finally, download the CIFAR10 datasets manually and put them into the datasets/CIFAR10 folder (python version).

To test the code:

Launch FedFly central server

python FedFly_serverrun.py --offload True #FedFly training

Launch FedFly source edge server

python FedFly_serverrun.py --offload True #FedFly training

Launch FedFly destination edge server

python FedFly_serverrun.py --offload True #FedFly training

Launch FedFly devices

python FedFly_clientrun.py --offload True #FedFly training

Citation

Please cite the paper as follows: Rehmat Ullah, Di Wu, Paul Harvey, Peter Kilpatrick, Ivor Spence and Blesson Varghese, "FedFly: Towards Migration in Edge-based Distributed Federated Learning", 2021.

@misc{ullah2021fedfly,
      title={FedFly: Towards Migration in Edge-based Distributed Federated Learning}, 
      author={Rehmat Ullah and Di Wu and Paul Harvey and Peter Kilpatrick and Ivor Spence and Blesson Varghese},
      year={2021},
      eprint={2111.01516},
      archivePrefix={arXiv},
      primaryClass={cs.DC}
}
Owner
qub-blesson
qub-blesson
Research into Forex price prediction from price history using Deep Sequence Modeling with Stacked LSTMs.

Forex Data Prediction via Recurrent Neural Network Deep Sequence Modeling Research Paper Our research paper can be viewed here Installation Clone the

Alex Taradachuk 2 Aug 07, 2022
The repository includes the code for training cell counting applications. (Keras + Tensorflow)

cell_counting_v2 The repository includes the code for training cell counting applications. (Keras + Tensorflow) Dataset can be downloaded here : http:

Weidi 113 Oct 06, 2022
This implementation contains the application of GPlearn's symbolic transformer on a commodity futures sector of the financial market.

GPlearn_finiance_stock_futures_extension This implementation contains the application of GPlearn's symbolic transformer on a commodity futures sector

Chengwei <a href=[email protected]"> 189 Dec 25, 2022
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper

Ponder(ing) Transformer Implementation of a Transformer that learns to adapt the number of computational steps it takes depending on the difficulty of

Phil Wang 65 Oct 04, 2022
BackgroundRemover lets you Remove Background from images and video with a simple command line interface

BackgroundRemover BackgroundRemover is a command line tool to remove background from video and image, made by nadermx to power https://BackgroundRemov

Johnathan Nader 1.7k Dec 30, 2022
An Implementation of Transformer in Transformer in TensorFlow for image classification, attention inside local patches

Transformer-in-Transformer An Implementation of the Transformer in Transformer paper by Han et al. for image classification, attention inside local pa

Rishit Dagli 40 Jul 25, 2022
MlTr: Multi-label Classification with Transformer

MlTr: Multi-label Classification with Transformer This is official implement of "MlTr: Multi-label Classification with Transformer". Abstract The task

程星 38 Nov 08, 2022
gACSON software for visualization, processing and analysis of three-dimensional electron microscopy images

gACSON gACSON software is to visualize, segment, and analyze the morphology of neurons in three-dimensional electron microscopy images. If you use any

Andrea Behanova 2 May 31, 2022
Omniverse sample scripts - A guide for developing with Python scripts on NVIDIA Ominverse

Omniverse sample scripts ここでは、NVIDIA Omniverse ( https://www.nvidia.com/ja-jp/om

ft-lab (Yutaka Yoshisaka) 37 Nov 17, 2022
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E

郝翔 357 Jan 04, 2023
Convenient tool for speeding up the intern/officer review process.

icpc-app-screen Convenient tool for speeding up the intern/officer applicant review process. Eliminates the pain from reading application responses of

1 Oct 30, 2021
Reimplementation of the paper "Attention, Learn to Solve Routing Problems!" in jax/flax.

JAX + Attention Learn To Solve Routing Problems Reinplementation of the paper Attention, Learn to Solve Routing Problems! using Jax and Flax. Fully su

Gabriela Surita 7 Dec 01, 2022
A Traffic Sign Recognition Project which can help the driver recognise the signs via text as well as audio. Can be used at Night also.

Traffic-Sign-Recognition In this report, we propose a Convolutional Neural Network(CNN) for traffic sign classification that achieves outstanding perf

Mini Project 64 Nov 19, 2022
Reproduce partial features of DeePMD-kit using PyTorch.

DeePMD-kit on PyTorch For better understand DeePMD-kit, we implement its partial features using PyTorch and expose interface consuing descriptors. Tec

Shaochen Shi 8 Dec 17, 2022
This package proposes simplified exporting pytorch models to ONNX and TensorRT, and also gives some base interface for model inference.

PyTorch Infer Utils This package proposes simplified exporting pytorch models to ONNX and TensorRT, and also gives some base interface for model infer

Alex Gorodnitskiy 11 Mar 20, 2022
Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

TXT 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Txt2Xml too

Nguyễn Trường Lâu 4 Nov 24, 2022
PyG (PyTorch Geometric) - A library built upon PyTorch to easily write and train Graph Neural Networks (GNNs)

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

PyG 16.5k Jan 08, 2023
Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

El Bruno 3 Mar 30, 2022
PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

NVIDIA Corporation 1.8k Dec 30, 2022
This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" ([email protected])

GP-VAE This repository provides datasets and code for preprocessing, training and testing models for the paper: Diverse Text Generation via Variationa

Wanyu Du 18 Dec 29, 2022