Code and training data for our ECCV 2016 paper on Unsupervised Learning

Overview

Shuffle and Learn (Shuffle Tuple)

Created by Ishan Misra

Based on the ECCV 2016 Paper - "Shuffle and Learn: Unsupervised Learning using Temporal Order Verification" link to paper.

This codebase contains the model and training data from our paper.

Introduction

Our code base is a mix of Python and C++ and uses the Caffe framework. Design decisions and some code is derived from the Fast-RCNN codebase by Ross Girshick.

Citing

If you find our code useful in your research, please consider citing:

@inproceedings{misra2016unsupervised,
  title={{Shuffle and Learn: Unsupervised Learning using Temporal Order Verification}},
  author={Misra, Ishan and Zitnick, C. Lawrence and Hebert, Martial},
  booktitle={ECCV},
  year={2016}
}

Benchmark Results

We summarize the results of finetuning our method here (details in the paper).

Action Recognition

| Dataset | Accuracy (split 1) | Accuracy (mean over splits) :--- | :--- | :--- | :--- UCF101 | 50.9 | 50.2 HMDB51 | 19.8 | 18.1

Pascal Action Classification (VOC2012): Coming soon

Pose estimation

  • FLIC: PCK (Mean, AUC) 84.7, 49.6
  • MPII: [email protected] (Upper, Full, AUC): 87.7, 85.8, 47.6

Object Detection

  • PASCAL VOC2007 test mAP of 42.4% using Fast RCNN.

We initialize conv1-5 using our unsupervised pre-training. We initialize fc6-8 randomly. We then follow the procedure from Krahenbuhl et al., 2016 to rescale our network and finetune all layers using their hyperparameters.

Surface Normal Prediction

  • NYUv2 (Coming soon)

Contents

  1. Requirements: software
  2. Models and Training Data
  3. Usage
  4. Utils

Requirements: software

  1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers and OpenCV.

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
USE_OPENCV := 1

You can download a compatible fork of Caffe from here. Note that since our model requires Batch Normalization, you will need to have a fairly recent fork of caffe.

Models and Training Data

  1. Our model trained on tuples from UCF101 (train split 1, without using action labels) can be downloaded here.

  2. The tuples used for training our model can be downloaded as a zipped text file here. Each line of the file train01_image_keys.txt defines a tuple of three frames. The corresponding file train01_image_labs.txt has a binary label indicating whether the tuple is in the correct or incorrect order.

  3. Using the training tuples requires you to have the raw videos from the UCF101 dataset (link to videos). We extract frames from the videos and resize them such that the max dimension is 340 pixels. You can use ffmpeg to extract the frames. Example command: ffmpeg -i <video_name> -qscale 1 -f image2 <video_sub_name>/<video_sub_name>_%06d.jpg, where video_sub_name is the name of the raw video without the file extension.

Usage

  1. Once you have downloaded and formatted the UCF101 videos, you can use the networks/tuple_train.prototxt file to train your network. The only complicated part in the network definition is the data layer, which reads a tuple and a label. The data layer source file is in the python_layers subdirectory. Make sure to add this to your PYTHONPATH.
  2. Training for Action Recognition: We used the codebase from here
  3. Training for Pose Estimation: We used the codebase from here. Since this code does not use caffe for training a network, I have included a experimental data layer for caffe in python_layers/pose_data_layer.py

Utils

This repo also includes a bunch of utilities I used for training and debugging my models

  • python_layers/loss_tracking_layer: This layer tracks loss of each individual data point and its class label. This is useful for debugging as one can see the loss per class across epochs. Thanks to Abhinav Shrivastava for discussions on this.
  • model_training_utils: This is the wrapper code used to train the network if one wants to use the loss_tracking layer. These utilities not only track the loss, but also keep a log of various other statistics of the network - weights of the layers, norms of the weights, magnitude of change etc. For an example of how to use this check networks/tuple_exp.py. Thanks to Carl Doersch for discussions on this.
  • python_layers/multiple_image_multiple_label_data_layer: This is a fairly generic data layer that can read multiple images and data. It is based off my data layers repo.
Owner
Ishan Misra
Ishan Misra
AutoML library for deep learning

Official Website: autokeras.com AutoKeras: An AutoML system based on Keras. It is developed by DATA Lab at Texas A&M University. The goal of AutoKeras

Keras 8.7k Jan 08, 2023
Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition - NeurIPS2021

Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition Project Page | Video | Paper Implementation for Neural-PIL. A novel method wh

Computergraphics (University of Tübingen) 64 Dec 29, 2022
Code release for BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

BlockGAN Code release for BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images BlockGAN: Learning 3D Object-aware Scene Rep

41 May 18, 2022
Efficient semidefinite bounds for multi-label discrete graphical models.

Low rank solvers #################################### benchmark/ : folder with the random instances used in the paper. ############################

1 Dec 08, 2022
Hyper-parameter optimization for sklearn

hyperopt-sklearn Hyperopt-sklearn is Hyperopt-based model selection among machine learning algorithms in scikit-learn. See how to use hyperopt-sklearn

1.4k Jan 01, 2023
IJON is an annotation mechanism that analysts can use to guide fuzzers such as AFL.

IJON SPACE EXPLORER IJON is an annotation mechanism that analysts can use to guide fuzzers such as AFL. Using only a small (usually one line) annotati

Chair for Sys­tems Se­cu­ri­ty 146 Dec 16, 2022
Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Graph-to-Graph Transformers Self-attention models, such as Transformer, have been hugely successful in a wide range of natural language processing (NL

Idiap Research Institute 40 Aug 14, 2022
Kaggle Ultrasound Nerve Segmentation competition [Keras]

Ultrasound nerve segmentation using Keras (1.0.7) Kaggle Ultrasound Nerve Segmentation competition [Keras] #Install (Ubuntu {14,16}, GPU) cuDNN requir

179 Dec 28, 2022
Code of Classification Saliency-Based Rule for Visible and Infrared Image Fusion

CSF Code of Classification Saliency-Based Rule for Visible and Infrared Image Fusion Tips: For testing: CUDA_VISIBLE_DEVICES=0 python main.py For trai

Han Xu 14 Oct 31, 2022
Self Governing Neural Networks (SGNN): the Projection Layer

Self Governing Neural Networks (SGNN): the Projection Layer A SGNN's word projections preprocessing pipeline in scikit-learn In this notebook, we'll u

Guillaume Chevalier 22 Nov 06, 2022
Towards Long-Form Video Understanding

Towards Long-Form Video Understanding Chao-Yuan Wu, Philipp Krähenbühl, CVPR 2021 [Paper] [Project Page] [Dataset] Citation @inproceedings{lvu2021,

Chao-Yuan Wu 69 Dec 26, 2022
Contextual Attention Localization for Offline Handwritten Text Recognition

CALText This repository contains the source code for CALText model introduced in "CALText: Contextual Attention Localization for Offline Handwritten T

0 Feb 17, 2022
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

DiffWave DiffWave is a fast, high-quality neural vocoder and waveform synthesizer. It starts with Gaussian noise and converts it into speech via itera

LMNT 498 Jan 03, 2023
Robbing the FED: Directly Obtaining Private Data in Federated Learning with Modified Models

Robbing the FED: Directly Obtaining Private Data in Federated Learning with Modified Models This repo contains a barebones implementation for the atta

16 Dec 04, 2022
Jiminy Cricket Environment (NeurIPS 2021)

Jiminy Cricket This is the repository for "What Would Jiminy Cricket Do? Towards Agents That Behave Morally" by Dan Hendrycks*, Mantas Mazeika*, Andy

Dan Hendrycks 15 Aug 29, 2022
Bayesian Inference Tools in Python

BayesPy Bayesian Inference Tools in Python Our goal is, given the discrete outcomes of events, estimate the distribution of categories. Using gradient

Max Sklar 99 Dec 14, 2022
Twin-deep neural network for semi-supervised learning of materials properties

Deep Semi-Supervised Teacher-Student Material Synthesizability Prediction Citation: Semi-supervised teacher-student deep neural network for materials

MLEG 3 Dec 14, 2022
Library for 8-bit optimizers and quantization routines.

bitsandbytes Bitsandbytes is a lightweight wrapper around CUDA custom functions, in particular 8-bit optimizers and quantization functions. Paper -- V

Facebook Research 687 Jan 04, 2023
Motion planning algorithms commonly used on autonomous vehicles. (path planning + path tracking)

Overview This repository implemented some common motion planners used on autonomous vehicles, including Hybrid A* Planner Frenet Optimal Trajectory Hi

Huiming Zhou 1k Jan 09, 2023
competitions-v2

Codabench (formerly Codalab Competitions v2) Installation $ cp .env_sample .env $ docker-compose up -d $ docker-compose exec django ./manage.py migrat

CodaLab 21 Dec 02, 2022