Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)

Last update: Sep 15, 2022

Distributed Deep Learning in Open Collaborations

This repository contains the code for the NeurIPS 2021 paper

"Distributed Deep Learning in Open Collaborations"

Michael Diskin*, Alexey Bukhtiyarov*, Max Ryabinin*, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko

Link: ArXiv

Note

This repository contains a snapshot of the code used to conduct experiments in the paper.

Please use the up-to-date version of our library if you want to try out collaborative training and/or set up your own experiment. It contains many substantial improvements, including better documentation and fixed bugs.

Installation

Before running the experiments, please set up the environment by following the steps below:

Prepare an environment with Python 3.7-3.9. Anaconda is recommended but not required.
Install the hivemind library from the master branch or by running pip install hivemind==0.9.9.post1

For all distributed experiments, the installation procedure must be repeated on every machine that participates in the experiment. We recommend using machines with at least 2 CPU cores, 16 GB RAM and, when applicable, a low/mid-tier NVIDIA GPU.
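
Because the setup has to be repeated on every participating machine, a small pre-flight check can help catch mismatched environments early. The following is a minimal sketch, not part of the repository: the helper names (python_is_supported, installed_hivemind_version, check_environment) are hypothetical, and the only facts taken from the steps above are the Python 3.7-3.9 range and the pinned release 0.9.9.post1. It assumes hivemind was installed with pip under the distribution name "hivemind".

```python
import sys

# Version bounds and pin taken from the installation steps above.
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 9)
PINNED_HIVEMIND = "0.9.9.post1"


def python_is_supported(version=None):
    """Return True if (major, minor) falls within 3.7-3.9 inclusive."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def installed_hivemind_version():
    """Return the installed hivemind version string, or None if not detected."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        return None  # Python 3.7 has no importlib.metadata in the stdlib
    try:
        return metadata.version("hivemind")
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if not python_is_supported():
        problems.append(
            "Python %d.%d is outside the 3.7-3.9 range" % tuple(sys.version_info[:2])
        )
    found = installed_hivemind_version()
    if found is None:
        problems.append("hivemind is not installed (or could not be detected)")
    elif found != PINNED_HIVEMIND:
        problems.append("hivemind %s found, expected %s" % (found, PINNED_HIVEMIND))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script on each machine prints one warning per detected problem and nothing when the environment matches. If hivemind was installed from the master branch rather than the pinned release, the version warning is expected and can be ignored.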

Experiments

The code is divided into several sections matching the corresponding experiments:

albert contains the code for controlled experiments with ALBERT-large on WikiText-103;
swav is for training SwAV on ImageNet data;
sahajbert contains the code used to conduct a public collaborative experiment for the Bengali language ALBERT;
p2p is a step-by-step tutorial that explains decentralized NAT traversal and circuit relays.

We recommend running albert experiments first: other experiments build on top of its code and may require more careful setup (e.g. for public participation). Furthermore, for this experiment, we provide a script for launching experiments using preemptible GPUs in the cloud.

Acknowledgements

This project is the result of a collaboration between Yandex, Hugging Face, MIPT, HSE University, University of Toronto, Vector Institute, and Neuropark.

We also thank Stas Bekman, Dmitry Abulkhanov, Roman Zhytar, Alexander Ploshkin, Vsevolod Plokhotnyuk, and Roman Kail for their invaluable help with building the training infrastructure. We are also grateful to Abhishek Thakur for helping with downstream evaluation, and to Tanmoy Sarkar and Omar Sanseviero, who helped us organize the collaborative experiment and gave regular status updates to the participants over the course of the training run.

Contacts

Feel free to ask any questions in our Discord chat or by email.

Citation

@inproceedings{diskin2021distributed,
    title = {Distributed Deep Learning In Open Collaborations},
    author = {Michael Diskin and Alexey Bukhtiyarov and Max Ryabinin and Lucile Saulnier and Quentin Lhoest and Anton Sinitsin and Dmitry Popov and Dmitriy Pyrkin and Maxim Kashirin and Alexander Borzunov and Albert Villanova del Moral and Denis Mazur and Ilia Kobelev and Yacine Jernite and Thomas Wolf and Gennady Pekhimenko},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
    year = {2021},
    url = {https://openreview.net/forum?id=FYHktcK-7v}
}

Owner

Yandex Research
