WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption

Last update: Oct 28, 2022

Overview

This repository implements the paper "Improving image captioning with better use of captions". To cite it:

@inproceedings{shi2020improving,
  title={Improving Image Captioning with Better Use of Caption},
  author={Shi, Zhan and Zhou, Xu and Qiu, Xipeng and Zhu, Xiaodan},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  pages={7454--7464},
  year={2020}
}

Requirements

python 2.7.15

torch 1.0.1

The exact conda environment is specified in ezs.yml.

Note: for evaluation, you also need to download the coco-captions and cider folders into this directory.
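Before running evaluation, it can help to confirm that both folders are in place. The snippet below is a minimal sketch, not part of this repo: it assumes the folders are named exactly as written above (`coco-captions` and `cider`) and sit in the repository root, and the helper name `missing_eval_dirs` is hypothetical. It runs under both Python 2.7 and Python 3.

```python
import os

# Folder names as written in this README (an assumption);
# adjust them if your checkout uses different names.
REQUIRED_DIRS = ("coco-captions", "cider")


def missing_eval_dirs(repo_root="."):
    """Return the required evaluation folders that are absent under repo_root."""
    return [d for d in REQUIRED_DIRS
            if not os.path.isdir(os.path.join(repo_root, d))]


if __name__ == "__main__":
    missing = missing_eval_dirs()
    if missing:
        print("Missing evaluation folders: " + ", ".join(missing))
    else:
        print("All evaluation folders found.")
```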

Data Files and Models

Files: Download the contents of the data directory from Google Drive or Baidu Netdisk (link: https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w, extraction code: 39pa) and place them in the data directory here. See data/README for more details.

Models: Download the log directory from Google Drive or Baidu Netdisk (link: https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w, extraction code: 39pa) and place it here.

Scripts

MLE training:

python train.py --gpus 0 --id experiment-mle

RL training:

python train.py --gpus 0 --id experiment-rl --learning_rate 2e-5 --resume_from experiment-mle --resume_from_best True --self_critical_after 0 --max_epochs 60 --learning_rate_decay_start -1 --scheduled_sampling_start -1 --reduce_on_plateau

Evaluate your own model or load a trained model:

python eval.py --gpus 0 --resume_from experiment-mle

and

python eval.py --gpus 0 --resume_from experiment-rl
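The commands above depend on each other: RL training resumes from the MLE checkpoint, and evaluation needs both runs to exist. The snippet below is a minimal sketch of chaining them from Python. The helpers `build_pipeline` and `run_pipeline` are hypothetical and not part of this repo. The flags are copied verbatim from the commands above, and nothing is assumed about what they do beyond that.

```python
import subprocess


def build_pipeline(gpu="0", mle_id="experiment-mle", rl_id="experiment-rl"):
    """Return the MLE, RL, and evaluation commands as argument lists."""
    train = ["python", "train.py", "--gpus", gpu]
    mle = train + ["--id", mle_id]
    # RL stage: same flags as the README command, resuming from the MLE run.
    rl = train + [
        "--id", rl_id,
        "--learning_rate", "2e-5",
        "--resume_from", mle_id,
        "--resume_from_best", "True",
        "--self_critical_after", "0",
        "--max_epochs", "60",
        "--learning_rate_decay_start", "-1",
        "--scheduled_sampling_start", "-1",
        "--reduce_on_plateau",
    ]
    # One evaluation command per trained run.
    evals = [["python", "eval.py", "--gpus", gpu, "--resume_from", run_id]
             for run_id in (mle_id, rl_id)]
    return [mle, rl] + evals


def run_pipeline(commands):
    """Run each command in order, stopping at the first non-zero exit code."""
    for cmd in commands:
        print(" ".join(cmd))
        subprocess.check_call(cmd)
```

Calling `run_pipeline(build_pipeline())` from the repository root would run the four steps in sequence. `subprocess.check_call` raises as soon as one step fails, so RL training does not start from a missing MLE checkpoint.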

Acknowledgement

This code is based on Ruotian Luo's brilliant image captioning repo, ruotianluo/self-critical.pytorch. We use the detected bounding boxes, categories, and features provided by Bottom-Up (peteanderson80/bottom-up-attention) and SGAE (yangxuntu/SGAE). Many thanks for their work!
