Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.

Overview

Lingvo

PyPI Python

Documentation

License

What is it?

Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.

A list of publications using Lingvo can be found here.

Table of Contents

Releases

PyPI Version Commit
0.10.0 075fd1d88fa6f92681f58a2383264337d0e737ee
0.9.1 c1124c5aa7af13d2dd2b6d43293c8ca6d022b008
0.9.0 f826e99803d1b51dccbbbed1ef857ba48a2bbefe
Older releases

PyPI Version Commit
0.8.2 93e123c6788e934e6b7b1fd85770371becf1e92e
0.7.2 b05642fe386ee79e0d88aa083565c9a93428519e

Details for older releases are unavailable.

Major breaking changes

NOTE: this is not a comprehensive list. Lingvo releases do not offer any guarantees regarding backwards compatibility.

HEAD

Nothing here.

0.10.0

  • General
    • The theta_fn arg to CreateVariable() has been removed.

0.9.1

  • General
    • Python 3.9 is now supported.
    • ops.beam_search_step now takes and returns an additional arg beam_done.
    • The namedtuple beam_search_helper.BeamSearchDecodeOutput now removes the field done_hyps.

0.9.0

  • General
    • Tensorflow 2.5 is now the required version.
    • Python 3.5 support has been removed.
    • py_utils.AddGlobalVN and py_utils.AddPerStepVN have been combined into py_utils.AddVN.
    • BaseSchedule().Value() no longer takes a step arg.
    • Classes deriving from BaseSchedule should implement Value() not FProp().
    • theta.global_step has been removed in favor of py_utils.GetGlobalStep().
    • py_utils.GenerateStepSeedPair() no longer takes a global_step arg.
    • PostTrainingStepUpdate() no longer takes a global_step arg.
    • The fatal_errors argument to custom input ops now takes error message substrings rather than integer error codes.
Older releases

0.8.2

  • General
    • NestedMap Flatten/Pack/Transform/Filter etc now expand descendent dicts as well.
    • Subclasses of BaseLayer extending from abc.ABCMeta should now extend base_layer.ABCLayerMeta instead.
    • Trying to call self.CreateChild outside of __init__ now raises an error.
    • base_layer.initializer has been removed. Subclasses no longer need to decorate their __init__ function.
    • Trying to call self.CreateVariable outside of __init__ or _CreateLayerVariables now raises an error.
    • It is no longer possible to access self.vars or self.theta inside of __init__. Refactor by moving the variable creation and access to _CreateLayerVariables. The variable scope is set automatically according to the layer name in _CreateLayerVariables.

Details for older releases are unavailable.

Quick start

Installation

There are two ways to set up Lingvo: installing a fixed version through pip, or cloning the repository and building it with bazel. Docker configurations are provided for each case.

If you would just like to use the framework as-is, it is easiest to just install it through pip. This makes it possible to develop and train custom models using a frozen version of the Lingvo framework. However, it is difficult to modify the framework code or implement new custom ops.

If you would like to develop the framework further and potentially contribute pull requests, you should avoid using pip and clone the repository instead.

pip:

The Lingvo pip package can be installed with pip3 install lingvo.

See the codelab for how to get started with the pip package.

From sources:

The prerequisites are:

  • a TensorFlow 2.6 installation,
  • a C++ compiler (only g++ 7.3 is officially supported), and
  • the bazel build system.

Refer to docker/dev.dockerfile for a set of working requirements.

git clone the repository, then use bazel to build and run targets directly. The python -m module commands in the codelab need to be mapped onto bazel run commands.

docker:

Docker configurations are available for both situations. Instructions can be found in the comments on the top of each file.

How to install docker.

Running the MNIST image model

Preparing the input data

pip:

mkdir -p /tmp/mnist
python3 -m lingvo.tools.keras2ckpt --dataset=mnist

bazel:

mkdir -p /tmp/mnist
bazel run -c opt //lingvo/tools:keras2ckpt -- --dataset=mnist

The following files will be created in /tmp/mnist:

  • mnist.data-00000-of-00001: 53MB.
  • mnist.index: 241 bytes.

Running the model

pip:

cd /tmp/mnist
curl -O https://raw.githubusercontent.com/tensorflow/lingvo/master/lingvo/tasks/image/params/mnist.py
python3 -m lingvo.trainer --run_locally=cpu --mode=sync --model=mnist.LeNet5 --logdir=/tmp/mnist/log

bazel:

(cpu) bazel build -c opt //lingvo:trainer
(gpu) bazel build -c opt --config=cuda //lingvo:trainer
bazel-bin/lingvo/trainer --run_locally=cpu --mode=sync --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr

After about 20 seconds, the loss should drop below 0.3 and a checkpoint will be saved, like below. Kill the trainer with Ctrl+C.

trainer.py:518] step:   205, steps/sec: 11.64 ... loss:0.25747201 ...
checkpointer.py:115] Save checkpoint
checkpointer.py:117] Save checkpoint done: /tmp/mnist/log/train/ckpt-00000205

Some artifacts will be produced in /tmp/mnist/log/control:

  • params.txt: hyper-parameters.
  • model_analysis.txt: model sizes for each layer.
  • train.pbtxt: the training tf.GraphDef.
  • events.*: a tensorboard events file.

As well as in /tmp/mnist/log/train:

  • checkpoint: a text file containing information about the checkpoint files.
  • ckpt-*: the checkpoint files.

Now, let's evaluate the model on the "Test" dataset. In the normal training setup the trainer and evaler should be run at the same time as two separate processes.

pip:

python3 -m lingvo.trainer --job=evaler_test --run_locally=cpu --mode=sync --model=mnist.LeNet5 --logdir=/tmp/mnist/log

bazel:

bazel-bin/lingvo/trainer --job=evaler_test --run_locally=cpu --mode=sync --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr

Kill the job with Ctrl+C when it starts waiting for a new checkpoint.

base_runner.py:177] No new check point is found: /tmp/mnist/log/train/ckpt-00000205

The evaluation accuracy can be found slightly earlier in the logs.

base_runner.py:111] eval_test: step:   205, acc5: 0.99775392, accuracy: 0.94150388, ..., loss: 0.20770954, ...

Running the machine translation model

To run a more elaborate model, you'll need a cluster with GPUs. Please refer to third_party/py/lingvo/tasks/mt/README.md for more information.

Running the GShard transformer based giant language model

To train a GShard language model with one trillion parameters on GCP using CloudTPUs v3-512 using 512-way model parallelism, please refer to third_party/py/lingvo/tasks/lm/README.md for more information.

Running the 3d object detection model

To run the StarNet model using CloudTPUs on GCP, please refer to third_party/py/lingvo/tasks/car/README.md.

Models

Automatic Speech Recognition

Car

Image

Language Modelling

Machine Translation

References

Please cite this paper when referencing Lingvo.

@misc{shen2019lingvo,
    title={Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling},
    author={Jonathan Shen and Patrick Nguyen and Yonghui Wu and Zhifeng Chen and others},
    year={2019},
    eprint={1902.08295},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

License

Apache License 2.0

PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Shape-aware Convolutional Layer (ShapeConv) PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentatio

Hanchao Leng 82 Dec 29, 2022
Mememoji - A facial expression classification system that recognizes 6 basic emotions: happy, sad, surprise, fear, anger and neutral.

a project built with deep convolutional neural network and ❤️ Table of Contents Motivation The Database The Model 3.1 Input Layer 3.2 Convolutional La

Jostine Ho 761 Dec 05, 2022
Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised manner.

WSDEC This is the official repo for our NeurIPS paper Weakly Supervised Dense Event Captioning in Videos. Description Repo directories ./: global conf

Melon(Xuguang Duan) 96 Nov 01, 2022
Model-based reinforcement learning in TensorFlow

Bellman Website | Twitter | Documentation (latest) What does Bellman do? Bellman is a package for model-based reinforcement learning (MBRL) in Python,

46 Nov 09, 2022
Artificial Intelligence playing minesweeper 🤖

AI playing Minesweeper ✨ Minesweeper is a single-player puzzle video game. The objective of the game is to clear a rectangular board containing hidden

Vaibhaw 8 Oct 17, 2022
Eth brownie struct encoding example

eth-brownie struct encoding example Overview This repository contains an example of encoding a struct, so that it can be used in a function call, usin

Ittai Svidler 2 Mar 04, 2022
Project repo for the paper SILT: Self-supervised Lighting Transfer Using Implicit Image Decomposition

SILT: Self-supervised Lighting Transfer Using Implicit Image Decomposition (BMVC 2021) Project repo for the paper SILT: Self-supervised Lighting Trans

6 Dec 04, 2022
Source code for our paper "Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures"

Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures Code for the Multiplex Molecular Graph Neural Network (M

shzhang 59 Dec 10, 2022
NBEATSx: Neural basis expansion analysis with exogenous variables

NBEATSx: Neural basis expansion analysis with exogenous variables We extend the NBEATS model to incorporate exogenous factors. The resulting method, c

Cristian Challu 100 Dec 31, 2022
Code for Paper: Self-supervised Learning of Motion Capture

Self-supervised Learning of Motion Capture This is code for the paper: Hsiao-Yu Fish Tung, Hsiao-Wei Tung, Ersin Yumer, Katerina Fragkiadaki, Self-sup

Hsiao-Yu Fish Tung 87 Jul 25, 2022
Open standard for machine learning interoperability

Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides

Open Neural Network Exchange 13.9k Dec 30, 2022
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

551 Dec 29, 2022
Image Completion with Deep Learning in TensorFlow

Image Completion with Deep Learning in TensorFlow See my blog post for more details and usage instructions. This repository implements Raymond Yeh and

Brandon Amos 1.3k Dec 23, 2022
Rewrite ultralytics/yolov5 v6.0 opencv inference code based on numpy, no need to rely on pytorch

Rewrite ultralytics/yolov5 v6.0 opencv inference code based on numpy, no need to rely on pytorch; pre-processing and post-processing using numpy instead of pytroch.

炼丹去了 21 Dec 12, 2022
Local-Global Stratified Transformer for Efficient Video Recognition

DualFormer This repo is the implementation of our manuscript entitled "Local-Global Stratified Transformer for Efficient Video Recognition". Our model

Sea AI Lab 19 Dec 07, 2022
GB-CosFace: Rethinking Softmax-based Face Recognition from the Perspective of Open Set Classification

GB-CosFace: Rethinking Softmax-based Face Recognition from the Perspective of Open Set Classification This is the official pytorch implementation of t

Alibaba Cloud 5 Nov 14, 2022
Python3 / PyTorch implementation of the following paper: Fine-grained Semantics-aware Representation Enhancement for Self-supervisedMonocular Depth Estimation. ICCV 2021 (oral)

FSRE-Depth This is a Python3 / PyTorch implementation of FSRE-Depth, as described in the following paper: Fine-grained Semantics-aware Representation

77 Dec 28, 2022
An implementation of "Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport"

Optex An implementation of Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport for TU Delft CS4240. You c

Hans Brouwer 33 Jan 05, 2023
FinRL­-Meta: A Universe for Data­-Driven Financial Reinforcement Learning. 🔥

FinRL-Meta: A Universe of Market Environments. FinRL-Meta is a universe of market environments for data-driven financial reinforcement learning. Users

AI4Finance Foundation 543 Jan 08, 2023
Easy to use Audio Tagging in PyTorch

Audio Classification, Tagging & Sound Event Detection in PyTorch Progress: Fine-tune on audio classification Fine-tune on audio tagging Fine-tune on s

sithu3 15 Dec 22, 2022