Yet another video caption

Last update: May 26, 2022


Overview

yet-another-video-caption

Dataset configuration

Preparing the dataset

Reorganize the raw dataset into a unified format and place it in ./dataset.

The dataset is organized as follows:

./dataset
    train/
        video/
            *.avi
        ...
        info.json
    test/
        video/ 
            *.avi
        ...
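The layout above can be sanity-checked before running anything else. The following is a minimal sketch, not part of the repository: the function name check_dataset_layout is hypothetical, and it assumes that info.json is only required under train/, as the tree shows.

```python
from pathlib import Path


def check_dataset_layout(root="./dataset"):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    problems = []
    for split in ("train", "test"):
        video_dir = root / split / "video"
        if not video_dir.is_dir():
            problems.append(f"missing directory: {video_dir}")
        elif not any(video_dir.glob("*.avi")):
            problems.append(f"no .avi files in: {video_dir}")
    # Assumption: the tree lists info.json only under train/.
    info = root / "train" / "info.json"
    if not info.is_file():
        problems.append(f"missing file: {info}")
    return problems


if __name__ == "__main__":
    for problem in check_dataset_layout():
        print(problem)
```

An empty result means the directories and files named in the tree were found; it says nothing about the contents of info.json.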

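Automatic configuration

Usually you only need a subset of the dataset; in that case, consider running the automatic extraction script makedata.py.

All data is stored in ./data.

All videos (train/val/test) are placed in ./data/video.

Information for all videos (train/val/test) is written to ./data/input.json.

The program generates intermediate files in ./data; do not modify them.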
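To illustrate what drawing a train/val/test subset can look like, here is a small sketch. It is not the code of makedata.py, whose internals and parameters are not described here. The function sample_split and its arguments are hypothetical, and it assumes all videos are drawn from a single pool of file names.

```python
import random


def sample_split(video_names, n_train, n_val, n_test, seed=0):
    """Pick disjoint train/val/test subsets of the requested sizes."""
    total = n_train + n_val + n_test
    if total > len(video_names):
        raise ValueError("requested more videos than are available")
    # Sort first so the result depends only on the seed, not on input order.
    rng = random.Random(seed)
    chosen = rng.sample(sorted(video_names), total)
    return {
        "train": chosen[:n_train],
        "val": chosen[n_train:n_train + n_val],
        "test": chosen[n_train + n_val:],
    }


if __name__ == "__main__":
    names = [f"video{i}.avi" for i in range(10)]
    print(sample_split(names, n_train=5, n_val=2, n_test=1))
```

Fixing the seed keeps the subset reproducible between runs, which matters because later steps cache intermediate files in ./data.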

Dependencies

pip install tqdm pillow pretrainedmodels nltk

In addition, make sure the CUDA runtime, cuDNN, PyTorch (GPU build), ffmpeg, and a JDK are correctly configured in the current environment.
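A quick way to see which requirements are missing is sketched below. The helper missing_requirements is hypothetical and not part of the repository. It assumes that pillow is importable as PIL, that the JDK exposes a java executable on PATH, and it only checks that modules and tools can be found, not that their versions are compatible.

```python
import importlib.util
import shutil


def missing_requirements(
    modules=("torch", "tqdm", "PIL", "pretrainedmodels", "nltk"),
    tools=("ffmpeg", "java"),
):
    """Return the names of modules and command-line tools that cannot be found."""
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    missing += [t for t in tools if shutil.which(t) is None]
    return missing


if __name__ == "__main__":
    print(missing_requirements() or "all requirements found")
    # Only import torch if it is installed, then report GPU visibility.
    if importlib.util.find_spec("torch") is not None:
        import torch
        print("CUDA available:", torch.cuda.is_available())
```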

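Usage

1. Make sure the dataset is configured correctly.
2. Make sure the dependencies are installed correctly.
3. Extract the data: enter the train/val/test split parameters you want into makedata.py, then run the script.
4. Run the following commands in order (adjust the batch_size and saved_model arguments yourself!):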

python prepro_feats.py --output_dir data/feats/resnet152 --model resnet152
python prepro_vocab.py
python train.py --epochs 3001 --batch_size 1 --checkpoint_path data/save --feats_dir data/feats/resnet152 --model S2VTAttModel --with_c3d 0 --dim_vid 2048
python eval.py --recover_opt data/save/opt_info.json --saved_model data/save/model_10.pth --batch_size 1
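Because batch_size and saved_model have to be edited in more than one command, it can help to generate the four commands from a single place. The sketch below is a hypothetical wrapper, not part of the repository. It reuses only the script names and flags shown above, and the default values are copied from those commands.

```python
import subprocess
import sys


def build_commands(batch_size, saved_model,
                   feats_dir="data/feats/resnet152",
                   checkpoint_path="data/save",
                   epochs=3001):
    """Return the four pipeline commands as argument lists."""
    py = sys.executable
    return [
        [py, "prepro_feats.py", "--output_dir", feats_dir, "--model", "resnet152"],
        [py, "prepro_vocab.py"],
        [py, "train.py", "--epochs", str(epochs),
         "--batch_size", str(batch_size),
         "--checkpoint_path", checkpoint_path,
         "--feats_dir", feats_dir,
         "--model", "S2VTAttModel", "--with_c3d", "0", "--dim_vid", "2048"],
        [py, "eval.py", "--recover_opt", f"{checkpoint_path}/opt_info.json",
         "--saved_model", saved_model, "--batch_size", str(batch_size)],
    ]


def run_pipeline(commands):
    """Run the commands in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, run_pipeline(build_commands(32, "data/save/model_10.pth")) would run all four steps with a batch size of 32, provided that checkpoint exists after training.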

Speed test

The following results were measured on a single 2080Ti.

Preprocessing (ResNet152 feature extraction): 40 min in total

Training speed (batch_size=32): 6.20 it/s
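The training figure can be turned into a rough time estimate per epoch. This is a back-of-the-envelope sketch that assumes one iteration processes one batch; the function seconds_per_epoch is hypothetical, and the number of training samples depends on your own split.

```python
import math


def seconds_per_epoch(num_samples, batch_size=32, its_per_sec=6.20):
    """Estimate the wall-clock seconds for one pass over the training set."""
    iterations = math.ceil(num_samples / batch_size)
    return iterations / its_per_sec


if __name__ == "__main__":
    # Throughput in samples per second at the measured speed.
    print(round(6.20 * 32, 1))
    # Example with a hypothetical training set of 1000 samples.
    print(round(seconds_per_epoch(1000), 2))
```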

Todo

Letter-case (upper/lowercase) handling

References

https://github.com/xiadingZ/video-caption.pytorch


Owner

Fan Zhimin
