A JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short.

Related tags

Deep Learningbrave
Overview

BraVe

This is a JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short.

The model provided in this package was implemented based on the internal model that was used to compute results for the accompanying paper. It achieves comparable results on the evaluation tasks when evaluated side-by-side. Not all details are guaranteed to be identical though, and some results may differ from those given in the paper. In particular, this implementation does not provide the option to train with optical flow.

We provide a selection of pretrained checkpoints in the table below, which can directly be evaluated against HMDB 51 with the evaluation tools this package. These are exactly the checkpoints that were used to provide the numbers in the accompanying paper, and were not trained with the exact trainer given in this package. For details on training a model with this package, please see the end of this readme.

In the table below, the different configurations are represented by using e.g. V/A for video (narrow view) to audio (broad view), or V/F for a narrow view containing video, and a broad view containing optical flow.

The backbone in each case is TSMResnet, with a given width multiplier (please see the accompanying paper for further details). For all of the given numbers below, the SVM regularization constant used is 0.0001. For HMDB 51, the average is given in brackets, followed by the top-1 percentages for each of the splits.

Views Architecture HMDB51 UCF-101 K600 Trained with this package Checkpoint
V/AF TSM (1X) (69.2%) 71.307%, 68.497%, 67.843% 92.9% 69.2% download
V/AF TSM (2X) (69.9%) 72.157%, 68.432%, 69.02% 93.2% 70.2% download
V/A TSM (1X) (69.4%) 70.131%, 68.889%, 69.085% 93.0% 70.6% download
V/VVV TSM (1X) (65.4%) 66.797%, 63.856%, 65.425% 92.6% 70.8% download

Reproducing results from the paper

This package provides everything needed to evaluate the above checkpoints against HMDB 51. It supports Python 3.7 and above.

To get started, we recommend using a clean virtualenv. You may then install the brave package directly from GitHub using,

pip install git+https://github.com/deepmind/brave.git

A pre-processed version of the HMDB 51 dataset can be downloaded using the following command. It requires that both ffmpeg and unrar are available. The following will download the dataset to /tmp/hmdb51/, but any other location would also work.

  python -m brave.download_hmdb --output_dir /tmp/hmdb51/

To evaluate a checkpoint downloaded from the above table, the following may be used. The dataset shards arguments should be set to match the paths used above.

  python -m brave.evaluate_video_embeddings \
    --checkpoint_path <path/to/downloaded/checkpoint>.npy \
    --train_dataset_shards '/tmp/hmdb51/split_1/train/*' \
    --test_dataset_shards '/tmp/hmdb51/split_1/test/*' \
    --svm_regularization 0.0001 \
    --batch_size 8

Note that any of the three splits can be evaluated by changing the dataset split paths. To run this efficiently using a GPU, it is also necessary to install the correct version of jaxlib. To install jaxlib with support for cuda 10.1 on linux, the following install should be sufficient, though other precompiled packages may be found through the JAX documentation.

  pip install https://storage.googleapis.com/jax-releases/cuda101/jaxlib-0.1.69+cuda101-cp39-none-manylinux2010_x86_64.whl

Depending on the available GPU memory available, the batch_size parameter may be tuned to obtain better performance, or to reduce the required GPU memory.

Training a network

This package may also be used to train a model from scratch using jaxline. In order to try this, first ensure the configuration is set appropriately by modifying brave/config.py. At minimum, it would also be necessary to choose an appropriate global batch size (by default, the setting of 512 is likely too large for any single-machine training setup). In addition, a value must be set for dataset_shards. This should contain the paths of the tfrecord files containing the serialized training data.

For details on checkpointing and distributing computation, see the jaxline documentation.

Similarly to above, it is necessary to install the correct jaxlib package to enable training on a GPU.

The training may now be launched using,

  python -m brave.experiment --config=brave/config.py

Training datasets

This model is able to read data stored in the format specified by DMVR. For an example of writing training data in the correct format see the code in dataset/fixtures.py, which is used to write the test fixtures used in the tests for this package.

Running the tests

After checking out this code locally, you may run the package tests using

  pip install -e .
  pytest brave

We recommend doing this from a clean virtual environment.

Citing this work

If you use this code (or any derived code), data or these models in your work, please cite the relevant accompanying paper.

@misc{recasens2021broaden,
      title={Broaden Your Views for Self-Supervised Video Learning},
      author={Adrià Recasens and Pauline Luc and Jean-Baptiste Alayrac and Luyu Wang and Ross Hemsley and Florian Strub and Corentin Tallec and Mateusz Malinowski and Viorica Patraucean and Florent Altché and Michal Valko and Jean-Bastien Grill and Aäron van den Oord and Andrew Zisserman},
      year={2021},
      eprint={2103.16559},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Disclaimer

This is not an official Google product

Owner
DeepMind
DeepMind
Diverse Object-Scene Compositions For Zero-Shot Action Recognition

Diverse Object-Scene Compositions For Zero-Shot Action Recognition This repository contains the source code for the use of object-scene compositions f

7 Sep 21, 2022
Contrastive Fact Verification

VitaminC This repository contains the dataset and models for the NAACL 2021 paper: Get Your Vitamin C! Robust Fact Verification with Contrastive Evide

47 Dec 19, 2022
A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

MADGRAD Optimization Algorithm For Tensorflow This package implements the MadGrad Algorithm proposed in Adaptivity without Compromise: A Momentumized,

20 Aug 18, 2022
Code for KHGT model, AAAI2021

KHGT Code for KHGT accepted by AAAI2021 Please unzip the data files in Datasets/ first. To run KHGT on Yelp data, use python labcode_yelp.py For Movi

32 Nov 29, 2022
An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

0 May 06, 2022
Repository for tackling Kaggle Ultrasound Nerve Segmentation challenge using Torchnet.

Ultrasound Nerve Segmentation Challenge using Torchnet This repository acts as a starting point for someone who wants to start with the kaggle ultraso

Qure.ai 46 Jul 18, 2022
This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

Gautam Singh 66 Dec 26, 2022
An easy-to-use app to visualise attentions of various VQA models.

Ask Me Anything: A tool for visualising Visual Question Answering (AMA) An easy-to-use app to visualise attentions of various VQA models. Please click

Apoorve 37 Nov 13, 2022
Official code repository for the EMNLP 2021 paper

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization PyTorch code for the EMNLP 2021 paper "Integrating Visuospatia

Adyasha Maharana 23 Dec 19, 2022
Immortal tracker

Immortal_tracker Prerequisite Our code is tested for Python 3.6. To install required liabraries: pip install -r requirements.txt Waymo Open Dataset P

74 Dec 03, 2022
Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow

Do you want a RL agent nicely moving on Atari? Rainbow is all you need! This is a step-by-step tutorial from DQN to Rainbow. Every chapter contains bo

Jinwoo Park (Curt) 1.4k Dec 29, 2022
Gray Zone Assessment

Gray Zone Assessment Get started Clone github repository git clone https://github.com/andreanne-lemay/gray_zone_assessment.git Build docker image dock

1 Jan 08, 2022
SymPy-powered, Wolfram|Alpha-like answer engine totally in your browser, without backend computation

SymPy Beta SymPy Beta is a fork of SymPy Gamma. The purpose of this project is to run a SymPy-powered, Wolfram|Alpha-like answer engine totally in you

Liumeo 25 Dec 21, 2022
Improving Convolutional Networks via Attention Transfer (ICLR 2017)

Attention Transfer PyTorch code for "Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Tran

Sergey Zagoruyko 1.4k Dec 23, 2022
Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor.

Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor. It is devel

33 Nov 11, 2022
🔅 Shapash makes Machine Learning models transparent and understandable by everyone

🎉 What's new ? Version New Feature Description Tutorial 1.6.x Explainability Quality Metrics To help increase confidence in explainability methods, y

MAIF 2.1k Dec 27, 2022
Implementation of CVPR 2020 Dual Super-Resolution Learning for Semantic Segmentation

Dual super-resolution learning for semantic segmentation 2021-01-02 Subpixel Update Happy new year! The 2020-12-29 update of SISR with subpixel conv p

Sam 79 Nov 24, 2022
Rate-limit-semaphore - Semaphore implementation with rate limit restriction for async-style (any core)

Rate Limit Semaphore Rate limit semaphore for async-style (any core) There are t

Yan Kurbatov 4 Jun 21, 2022
Toward Multimodal Image-to-Image Translation

BicycleGAN Project Page | Paper | Video Pytorch implementation for multimodal image-to-image translation. For example, given the same night image, our

Jun-Yan Zhu 1.4k Dec 22, 2022
Trash Sorter Extraordinaire is a software which efficiently detects the different types of waste in a pile of random trash through feeding it pictures or videos.

Trash-Sorter-Extraordinaire Trash Sorter Extraordinaire is a software which efficiently detects the different types of waste in a pile of random trash

Rameen Mahmood 1 Nov 07, 2021