Anytime Learning At Macroscale

Overview

This repository (alma) accompanies On Anytime Learning At Macroscale: learning from sequential data dumps.

Key requirements

  • Python 3.7
  • PyTorch 1.9.0
  • Hydra 1.1.0 (pip install hydra-core hydra-submitit-launcher)
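
As an optional sanity check (a minimal sketch, not part of the repository), you can confirm that the pinned versions above are the ones installed in your environment:

    import sys
    import torch
    import hydra

    # Print interpreter and library versions; compare against the requirements list above.
    print("python :", sys.version.split()[0])   # expected 3.7.x
    print("pytorch:", torch.__version__)        # expected 1.9.0
    print("hydra  :", hydra.__version__)        # expected 1.1.0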

Structure

├── crlapi
│   ├── benchmark.py    # Creates the data stream, feeds it to the model and evaluates it
│   ├── core.py         # Abstract classes
│   ├── logger.py
│   ├── sl
│   │   ├── architectures
│   │   │   ├── ...     # NN architectures used in this project
│   │   ├── clmodels
│   │   │   ├── ...     # Models (e.g. Single, gEns, ...)
│   │   ├── streams
│   │   │   ├── ...     # CIFAR and MNIST stream implementations

Running Experiments

To run experiments, call the dataset-specific run file and pass the configuration of the run. The configurations are placed in the configs directory one level up (../configs); the structure is as follows:

    ├── configs
        ├── mnist
           ├── run.py                 # run file
           ├── test_usage_gmoe.yaml   # This is the "gMoE" model
           ├── test_finetune_mlp.yaml # This is the "Single Model"
           ... 
        ├── cifar
           ├── run.py                 # run file
           ├── test_finetune_vgg.yaml # This is the "Single Model"
           ├── test_usage_gmoe.yaml   # This is the "gMoE" model
           ...

To run, for example, an MNIST gMoE experiment, launch the following command from the directory just above (i.e., cd .. first):

    PYTHONPATH=./ python configs/mnist/run.py -cn test_usage_gmoe n_megabatches=2 replay=1 clmodel.max_epochs=200
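
Since the project uses Hydra, -cn selects the config name and the remaining key=value pairs are Hydra overrides merged into that config. The sketch below illustrates the pattern (illustrative only; the actual run.py in this repository may differ):

    # Minimal sketch of a Hydra entry point in the spirit of configs/mnist/run.py.
    # Illustrative only; the actual run.py in this repository may differ.
    import hydra
    from omegaconf import DictConfig, OmegaConf

    @hydra.main(config_path=".", config_name="test_usage_gmoe")
    def main(cfg: DictConfig) -> None:
        # Command-line overrides such as n_megabatches=2, replay=1 or
        # clmodel.max_epochs=200 have already been merged into cfg at this point.
        print(OmegaConf.to_yaml(cfg))

    if __name__ == "__main__":
        main()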

Important arguments

  • n_megabatches : number of megabatches the dataset is split into; n_megabatches=1 corresponds to regular training on the full dataset
  • replay : whether to use replay (1) or not (0)
  • clmodel.init_from_scratch : whether to reinitialize the model at every megabatch; should only be used when replay=1
  • device : cuda or cpu, depending on your hardware
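
Combining these, a hypothetical CIFAR run with replay and re-initialization at every megabatch could look like the following (the config name is taken from the configs/cifar listing above; the override values are illustrative):

    PYTHONPATH=./ python configs/cifar/run.py -cn test_finetune_vgg n_megabatches=4 replay=1 clmodel.init_from_scratch=1 device=cuda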

License

alma is released under the MIT license. See LICENSE for additional details. See also our Terms of Use and Privacy Policy.
