An unreferenced image captioning metric (ACL-21)

Last update: Nov 20, 2022

Overview

UMIC

This repository provides an unreferenced image captioning metric from our ACL 2021 paper, UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning.
Here, we provide the code to compute UMIC.

Usage (Updating the Descriptions)

Our code is based on UNITER, so please follow UNITER's Docker installation guideline to set up the environment. We plan to release a version that does not require Docker in the next few weeks.

1. Install Prerequisites

We use the Docker image provided by the official UNITER repo. Please install Docker by following the guideline in that repo.

2. Download the Visual Features

The COCO dataset is widely used for image captioning. To get the visual features for COCO captions, download the image features for the COCO validation split with the following command.

wget https://acvrpublicycchen.blob.core.windows.net/uniter/img_db/coco_val2014.tar

Please refer to the official UNITER repo to download other visual features.

3. Pre-processing the Textual Features (Captions)

The textual feature file is a JSON file holding a single dictionary with the following keys:
'cands' : [list of candidate captions]
'img_fs' : [list of image file names]
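
As an illustration of this format, here is a minimal Python sketch that writes such a file. The helper name build_caption_file, the output path, and the example captions and file names are hypothetical and not part of this repo. The sketch also assumes that the i-th caption in 'cands' describes the i-th image in 'img_fs', which the description above implies but does not state. Please check compute_score.py for where the file is expected to be placed.

```python
import json


def build_caption_file(cands, img_fs, out_path):
    """Write captions and image file names as a JSON dictionary.

    Hypothetical helper (not part of this repo). It assumes cands[i]
    is the candidate caption for the image named img_fs[i].
    """
    if len(cands) != len(img_fs):
        raise ValueError("cands and img_fs must have the same length")

    # The two keys described above: 'cands' and 'img_fs'.
    data = {"cands": list(cands), "img_fs": list(img_fs)}

    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    return data
```

For example, build_caption_file(["a man riding a wave on a surfboard"], ["example_image.jpg"], "my_captions.json") would write a file with one caption and one image name; both values here are placeholders.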

4. Running the Script

Launching Docker

source launch_activate.sh $PATH_TO_STORAGE

Compute Score

python compute_score.py --data_type capeval1k \
                              --ckpt /storage/umic.pt \
                              --img_type coco_val2014

Reference

If you find this repo useful, please consider citing:

@inproceedings{lee-etal-2021-umic,
    title = "{UMIC}: An Unreferenced Metric for Image Captioning via Contrastive Learning",
    author = "Lee, Hwanhee  and
      Yoon, Seunghyun  and
      Dernoncourt, Franck  and
      Bui, Trung  and
      Jung, Kyomin",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-short.29",
    doi = "10.18653/v1/2021.acl-short.29",
    pages = "220--226",
}

Owner

hwanheelee
