Hooks for VCOCO

Last update: Nov 24, 2022

Overview

Verbs in COCO (V-COCO) Dataset

This repository hosts the Verbs in COCO (V-COCO) dataset and associated code to evaluate models for the Visual Semantic Role Labeling (VSRL) task, as described in this technical report.

Citing

If you find this dataset or code base useful in your research, please consider citing the following papers:

@article{gupta2015visual,
  title={Visual Semantic Role Labeling},
  author={Gupta, Saurabh and Malik, Jitendra},
  journal={arXiv preprint arXiv:1505.04474},
  year={2015}
}

@incollection{lin2014microsoft,
  title={Microsoft COCO: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={Computer Vision--ECCV 2014},
  pages={740--755},
  year={2014},
  publisher={Springer}
}

Installation

Clone the repository recursively, so that the COCO API is included.

git clone --recursive https://github.com/s-gupta/v-coco.git

This dataset builds on MS-COCO, so please download the MS-COCO images and annotations.
The current V-COCO release uses only a subset of the MS-COCO images (image IDs are listed in data/splits/vcoco_all.ids). Use the following script to pick those annotations out of the COCO annotations, which makes loading in V-COCO faster.
```
# Assume you cloned the repository to `VCOCO_DIR'
cd $VCOCO_DIR
# If you downloaded coco annotations to coco-data/annotations
python script_pick_annotations.py coco-data/annotations
```
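The idea behind this step can be illustrated with a minimal sketch. This is not the code of script_pick_annotations.py; it only assumes the standard COCO annotation layout (top-level `images`, `annotations` and `categories` lists), and the function name `pick_annotations` and the toy data are hypothetical.

```python
def pick_annotations(coco, keep_image_ids):
    """Return a COCO-style dict restricted to the given image IDs."""
    keep = set(keep_image_ids)
    return {
        "images": [im for im in coco["images"] if im["id"] in keep],
        "annotations": [a for a in coco["annotations"] if a["image_id"] in keep],
        # Categories are shared across images, so they are kept unchanged.
        "categories": list(coco.get("categories", [])),
    }


# Tiny made-up example; real use would load the COCO JSON and the ID list from disk.
coco = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"id": 10, "image_id": 1},
        {"id": 11, "image_id": 2},
        {"id": 12, "image_id": 2},
    ],
    "categories": [{"id": 1, "name": "person"}],
}
subset = pick_annotations(coco, [2, 3])
```

Working on the smaller subset avoids parsing annotations for images that V-COCO never references.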

Build coco/PythonAPI/pycocotools/_mask.so and cython_bbox.so.

# Assume you cloned the repository to `VCOCO_DIR'
cd $VCOCO_DIR/coco/PythonAPI/ && make
cd $VCOCO_DIR && make

Using the dataset

An IPython notebook illustrating how to use the annotations in the dataset is available in V-COCO.ipynb.
The current release of the dataset includes the annotations indicated in Table 1 of the paper. We are collecting role annotations for the 6 missing categories and will make them public shortly.

Evaluation

We provide evaluation code that computes agent AP and role AP, as explained in the paper.

In order to use the evaluation code, store your predictions as a pickle file (.pkl) in the following format:

[ {'image_id':        # the coco image id,
   'person_box':      #[x1, y1, x2, y2] the box prediction for the person,
   '[action]_agent':  # the score for action corresponding to the person prediction,
   '[action]_[role]': # [x1, y1, x2, y2, s], the predicted box for role and 
                      # associated score for the action-role pair.
   } ]
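As a rough illustration of the format above, the following sketch builds one detection entry and writes it to a pickle file. The helper `make_detection`, the image ID, the box coordinates and the scores are all made up, and the action name `hit` and role name `instr` are placeholders; use the action and role names defined in the V-COCO annotations.

```python
import os
import pickle
import tempfile


def make_detection(image_id, person_box, agent_scores, role_boxes):
    """Build one detection dict (hypothetical helper, not part of the repository).

    agent_scores: {action: score}
    role_boxes:   {(action, role): [x1, y1, x2, y2, score]}
    """
    det = {"image_id": image_id, "person_box": list(person_box)}
    for action, score in agent_scores.items():
        det[action + "_agent"] = score
    for (action, role), box_and_score in role_boxes.items():
        det[action + "_" + role] = list(box_and_score)
    return det


detections = [
    make_detection(
        image_id=12345,  # made-up COCO image id
        person_box=[10.0, 20.0, 110.0, 220.0],
        agent_scores={"hit": 0.9},
        role_boxes={("hit", "instr"): [100.0, 50.0, 150.0, 90.0, 0.8]},
    )
]

# Written to a temporary directory here; any writable path works.
det_file = os.path.join(tempfile.mkdtemp(), "detections.pkl")
with open(det_file, "wb") as f:
    pickle.dump(detections, f)
```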

Assuming your detections are stored in det_file=/path/to/detections/detections.pkl, run:

from vsrl_eval import VCOCOeval
vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
  # e.g. vsrl_annot_file: data/vcoco/vcoco_val.json
  #      coco_file:       data/instances_vcoco_all_2014.json
  #      split_file:      data/splits/vcoco_val.ids
vcocoeval._do_eval(det_file, ovr_thresh=0.5)

We introduce two scenarios for role AP evaluation.

[Scenario 1] For test cases with missing role annotations, an agent-role prediction is correct if the action is correct, the overlap between the person boxes is > 0.5, and the corresponding role is empty, e.g. [0,0,0,0] or [NaN,NaN,NaN,NaN]. This scenario suits roles that are missing due to occlusion.
[Scenario 2] For test cases with missing role annotations, an agent-role prediction is correct if the action is correct and the overlap between the person boxes is > 0.5; the corresponding role is ignored. This scenario suits roles that fall outside the COCO categories.
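The two rules can be sketched as a small decision function. This is an illustration of the description above, not the logic in vsrl_eval.py: it assumes that "overlap" means intersection over union on [x1, y1, x2, y2] boxes, and the names `box_iou`, `is_empty_box` and `correct_when_role_missing` are hypothetical.

```python
import math


def box_iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes (assumed overlap measure)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_empty_box(box):
    """True for an all-zero or all-NaN box, the two 'empty' encodings named above."""
    return all(v == 0 for v in box) or all(math.isnan(v) for v in box)


def correct_when_role_missing(scenario, action_correct, person_iou,
                              pred_role_box, thresh=0.5):
    """Decide correctness for a test case whose role annotation is missing."""
    agent_ok = action_correct and person_iou > thresh
    if scenario == 1:
        # Scenario 1: the predicted role must also be empty.
        return agent_ok and is_empty_box(pred_role_box)
    if scenario == 2:
        # Scenario 2: the predicted role is ignored.
        return agent_ok
    raise ValueError("scenario must be 1 or 2")


person_iou = box_iou([0, 0, 10, 10], [0, 0, 10, 8])
nan_box = [float("nan")] * 4
```

A non-empty role prediction on such a test case counts as wrong under Scenario 1 but is accepted under Scenario 2, so Scenario 2 is the more lenient of the two.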

Owner

Saurabh Gupta
