Torchrecipes provides a set of reproducible, reusable, ready-to-run recipes for training different types of models, across multiple domains, on PyTorch Lightning.

Last update: Dec 28, 2022

Related tags

Text Data & NLP recipes

Overview

torchrecipes

This library is currently under heavy development. If you have suggestions on the API or use cases you'd like to be covered, please open a GitHub issue or reach out. We'd love to hear about how you're using torchrecipes.

torchrecipes is a prototype built on top of PyTorch that provides a set of reproducible, reusable, ready-to-run recipes for training different types of models, across multiple domains, on PyTorch Lightning.

It aims to provide reproducible "applications" built on top of PyTorch with good performance and easy reproducibility. Because this project builds on the PyTorch ecosystem and requires significant investment, we'd love to hear from and work with early adopters to shape the design. Please reach out on the issue tracker if you're interested in using this for your project.

Why `torchrecipes`?

The primary goal of torchrecipes is to 10x ML development by providing standard blueprints to easily train production-ready ML models across environments (from local development to cluster deployment).

Requirements

PyTorch Recipes (torchrecipes):

python3 (3.8+)
torch
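
A quick way to check these requirements before launching anything is sketched below. The helper name `check_requirements` is hypothetical and not part of torchrecipes; it only verifies the Python version and whether a `torch` module can be found, not which torch version a given recipe needs.

```python
import importlib.util
import sys


def check_requirements(min_python=(3, 8), module="torch"):
    """Report whether the stated requirements appear to be met.

    Hypothetical helper, not part of torchrecipes.
    """
    return {
        # Compare only (major, minor) against the minimum version.
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # find_spec returns None when the module cannot be located.
        "module_found": importlib.util.find_spec(module) is not None,
    }


if __name__ == "__main__":
    print(check_requirements())
```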

Running

The easiest way to run torchrecipes is to use torchx. You can install it directly (if not already included as part of our requirements.txt) with:

pip install torchx

Then go to torchrecipes/launcher/ and create a file torchx_app.py:

# 'torchrecipes/launcher/torchx_app.py'

import torchx.specs as specs

image_classification_args = [
    "-m", "run",
    "--config-name",
    "train_app",
    "--config-path",
    "torchrecipes/vision/image_classification/conf",
]

def torchx_app(image: str = "run.py:latest", *job_args: str) -> specs.AppDef:
    return specs.AppDef(
        name="run",
        roles=[
            specs.Role(
                name="run",
                image=image,
                entrypoint="python",
                args=[*image_classification_args, *job_args],
                env={
                    "CONFIG_MODULE": "torchrecipes.vision.image_classification.conf",
                    "MODE": "prod",
                    "HYDRA_FULL_ERROR": "1",
                }
            )
        ],
    )

This app defines the entrypoint, args and image for launching.
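
To make the relationship between the entrypoint and the args concrete, the sketch below rebuilds the command line that such a role roughly corresponds to. It uses only the standard library and does not import torchx. `build_command` is a hypothetical helper, and the assumption that a scheduler runs the entrypoint followed by the args in order is a simplification.

```python
import shlex

# Same fixed arguments as in the app definition above.
IMAGE_CLASSIFICATION_ARGS = [
    "-m", "run",
    "--config-name", "train_app",
    "--config-path", "torchrecipes/vision/image_classification/conf",
]


def build_command(entrypoint, base_args, job_args):
    """Join entrypoint, fixed args and per-job args into one shell-quoted string."""
    return shlex.join([entrypoint, *base_args, *job_args])


if __name__ == "__main__":
    print(build_command("python", IMAGE_CLASSIFICATION_ARGS, ["trainer.fast_dev_run=True"]))
```

Anything passed after the app name on the `torchx run` command line ends up in `job_args`, so it is appended after the fixed arguments.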

Now that we have created a torchx app, we are (almost) ready for launching a job!

Firstly, create a symlink for launcher/run.py at the top level of the repo:

ln -s torchrecipes/launcher/run.py ./run.py

Then we are ready to go! Launch the image_classification recipe with the following command:

torchx run --scheduler local_cwd torchrecipes/launcher/torchx_app.py:torchx_app trainer.fast_dev_run=True trainer.checkpoint_callback=False +tb_save_dir=/tmp/
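
The trailing `key=value` arguments in that command are configuration overrides passed through to the recipe (the app sets `HYDRA_FULL_ERROR`, which suggests Hydra handles them). The sketch below illustrates how dotted overrides could map onto a nested configuration. It is a simplified stand-in written with the standard library, not Hydra's actual parser: `apply_overrides` is a hypothetical name, it handles only plain `key=value` and `+key=value` forms, and it assumes a leading `+` means "add a key that is not yet in the config".

```python
import copy


def parse_value(text):
    """Convert 'True'/'False' to booleans; leave everything else as a string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    return text


def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        adding = key.startswith("+")  # assumed meaning: add a new key
        key = key.lstrip("+")
        *parents, leaf = key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})
        if not adding and leaf not in node:
            raise KeyError(f"{key!r} is not in the config; prefix it with '+' to add it")
        node[leaf] = parse_value(raw)
    return result


if __name__ == "__main__":
    # Illustrative base config; the real recipe config has many more fields.
    base = {"trainer": {"fast_dev_run": False, "checkpoint_callback": True}}
    cli = ["trainer.fast_dev_run=True", "trainer.checkpoint_callback=False", "+tb_save_dir=/tmp/"]
    print(apply_overrides(base, cli))
```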

Release

# install torchrecipes
pip install torchrecipes

Contributing

We welcome PRs! See the CONTRIBUTING file.

License

torchrecipes is BSD licensed, as found in the LICENSE file.

Owner

Meta Research

This repository details the steps in creating a Part of Speech tagger using Trigram Hidden Markov Models and the Viterbi Algorithm without using external libraries.

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Unsupervised Abstract Reasoning for Raven’s Problem Matrices

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing

🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by EMNLP'2021 🌴

RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).

Application to help find best train itinerary, uses speech to text, has a spam filter to segregate invalid inputs, NLP and Pathfinding algos.

Spacy-ginza-ner-webapi - Named Entity Recognition API with spaCy and GiNZA

Code for the paper "Flexible Generation of Natural Language Deductions"

Uncomplete archive of files from the European Nopsled Team

SIGIR'22 paper: Axiomatically Regularized Pre-training for Ad hoc Search

Data and code to support "Applied Natural Language Processing" (INFO 256, Fall 2021, UC Berkeley)

BiNE: Bipartite Network Embedding

A Semi-Intelligent ChatBot filled with statistical and economical data for the Premier League.

TaCL: Improve BERT Pre-training with Token-aware Contrastive Learning

Beyond Paragraphs: NLP for Long Sequences

PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer

PyTorch implementation of NATSpeech: A Non-Autoregressive Text-to-Speech Framework

NLP-Project - Used an API to scrape 2000 reddit posts, then used NLP analysis and created a classification model to mixed success

Study German declensions (dER nettE Mann, ein nettER Mann, mit dEM nettEN Mann, ohne dEN nettEN Mann ...) Generate as many exercises as you want using the incredible power of SPACY!
