STRIVE: Scene Text Replacement In Videos

Dataset Types:

RoboText
SynthText
RealWorld videos

RoboText: Videos of text collected with a navigation robot in an indoor environment. The total duration of these videos is over 10 hours. Each text's background can be extracted from the rectangle below its text rectangle. The original unprocessed data is stored as RoboText-OriginalZip.7z. Around 200 preprocessed videos are stored as RoboTextZip1.7z.
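As a rough illustration of the background extraction described above, the sketch below computes the rectangle that sits directly under a text box. It assumes boxes are `(x, y, w, h)` tuples in pixels with a top-left origin, and it reads "bottom rectangle" as the same-sized region immediately below the text box. Both are assumptions, and the function name `background_box` is hypothetical rather than taken from the repository's preprocessing code.

```python
def background_box(text_box, frame_h):
    """Return the (x, y, w, h) rectangle directly below a text box.

    text_box: (x, y, w, h) in pixels, top-left origin (assumed layout).
    frame_h:  frame height in pixels, used to clamp the result.
    """
    x, y, w, h = text_box
    top = min(y + h, frame_h)        # background starts where the text box ends
    bottom = min(top + h, frame_h)   # same height as the text box, clamped to the frame
    return (x, top, w, bottom - top)


if __name__ == "__main__":
    # A 50x20 text box at (10, 20) in a 100-pixel-high frame.
    print(background_box((10, 20, 50, 20), frame_h=100))
```

With a NumPy image array `frame`, the returned box could then be cropped as `frame[y:y + h, x:x + w]`.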

SynthText: Using Unity, we created paired videos from synthetic scenes. These videos are stored on the drive with a similar naming convention. File name: SynthText7Zip.7z

Note: Unity bounding boxes are recorded as mirrored values, so the bbox extraction process differs from that of the other two video types.
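The note does not say along which axis the Unity values are mirrored, so the following is only a sketch of how such boxes might be mapped back to image coordinates. It assumes `(x, y, w, h)` pixel boxes and defaults to a vertical flip, because Unity screen space has its origin at the bottom-left while image arrays usually start at the top-left. The function name and the flags `flip_x` and `flip_y` are hypothetical; check a few frames visually before relying on either setting.

```python
def unmirror_box(box, frame_w, frame_h, flip_x=False, flip_y=True):
    """Map a mirrored (x, y, w, h) box back to top-left-origin image coordinates.

    flip_x / flip_y select the mirrored axis; the defaults are an assumption.
    """
    x, y, w, h = box
    if flip_x:
        x = frame_w - (x + w)   # mirror across the vertical centre line
    if flip_y:
        y = frame_h - (y + h)   # mirror across the horizontal centre line
    return (x, y, w, h)


if __name__ == "__main__":
    print(unmirror_box((10, 20, 30, 40), frame_w=200, frame_h=100))
```

Applying the function twice with the same flags returns the original box, which gives a quick sanity check.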

Real World videos: We collected videos with a high-resolution mobile camera to capture text under different lighting conditions and with motion blur. File name: RealWorld.7z

Preparing data

We extracted text bounding boxes from the RoboText and Real World videos using the AWS Rekognition API. The code is available in runAWS.py. Bounding boxes for the synthetic videos are recorded in the Unity environment.
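For readers without access to runAWS.py, here is a minimal sketch of what a per-frame Rekognition call could look like. It is not the repository's script. It uses boto3's `detect_text` operation, whose response lists `TextDetections` with a `Geometry.BoundingBox` expressed as ratios of the frame size. The function names, the `region_name` default, and the choice to keep only `WORD` detections are assumptions.

```python
def to_pixel_box(bounding_box, frame_w, frame_h):
    """Convert a ratio-based Rekognition BoundingBox to integer (x, y, w, h) pixels."""
    x = int(round(bounding_box["Left"] * frame_w))
    y = int(round(bounding_box["Top"] * frame_h))
    w = int(round(bounding_box["Width"] * frame_w))
    h = int(round(bounding_box["Height"] * frame_h))
    return (x, y, w, h)


def detect_text_boxes(image_bytes, frame_w, frame_h, region_name="us-east-1"):
    """Return [(text, (x, y, w, h)), ...] for one encoded frame (JPEG or PNG bytes).

    Requires AWS credentials; region_name is an assumed default.
    """
    import boto3  # imported here so to_pixel_box can be used without boto3 installed

    client = boto3.client("rekognition", region_name=region_name)
    response = client.detect_text(Image={"Bytes": image_bytes})
    results = []
    for det in response["TextDetections"]:
        if det["Type"] == "WORD":  # skip LINE entries to avoid duplicate regions
            box = det["Geometry"]["BoundingBox"]
            results.append((det["DetectedText"], to_pixel_box(box, frame_w, frame_h)))
    return results


if __name__ == "__main__":
    sample = {"Left": 0.25, "Top": 0.5, "Width": 0.1, "Height": 0.2}
    print(to_pixel_box(sample, frame_w=200, frame_h=100))
```

Frames would need to be extracted from each video and encoded before being passed to `detect_text_boxes`.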

Data Preprocessing

Refer to the preprocessing Python file for each dataset type to get cropped images of text.

Data download

Data can be downloaded from here

Please contact Jeyasri Subramanian ([email protected]) for any data queries.
