Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

Related tags

Deep Learningcliora
Overview

CLIORA

This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling.

We introduce a new task of Unsupervised Vision-Language Grammar Induction and devise a model Contrastive Language-Image inside-Outside Recursive Autoencoder (CLIORA) to solve it. Please read our paper for more details: https://openreview.net/forum?id=N0n_QyQ5lBF.

This code follows the implementation architecture of DIORA.

Please cite our paper as follows:

@inproceedings{wan2022cliora,
  title={Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling},
  author={Wan, Bo and Han, Wenjuan and Zheng, Zilong and Tuytelaars, Tinne},
  booktitle={The International Conference on Learning Representations (ICLR)},
  year={2022},
}

Envs and Datas

Install dependencies (using Conda as a virtual environment):

conda create -n cliora python=3.8
source activate cliora
pip install -r requirements.txt

Download flickr_data and outputs and put the files as the following structure:

  cliora
  ├───cliora
  │   ├─...
  │
  ├───flickr_data
  │   ├─flickr_feat_maf
  │
  ├───outputs
      ├─flickr

We use the same object features as MAF. Download train_features_compress.hdf5, val features_compress.hdf5, test features_compress.hdf5 to flickr_data/flickr_feat_maf.

Running CLIORA

export PYTHONPATH=$(pwd):$PYTHONPATH


## Train DIORA
sh train_diora.sh

## Test DIORA
sh test_diora.sh

## Train CLOIRA based on DIORA
sh train_clora.sh

## Test CLIORA 
sh test_cliora.sh

Multi-GPU Training

Single-GPU training:

export CUDA_VISIBLE_DEVICES=0
python -m cliora/scripts/train.py
    --cuda
    ... # other args

Multi-GPU Training:

export CUDA_VISIBLE_DEVICES=0,1,2,3
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS cliora/scripts/train.py
    --cuda
    --multigpu
    ... # other args

Visualization

Download Flickr30K Entities Dataset and put the image folder flickr_images under flickr_data/. Add --visualize when run test_cliora.sh:

# test_cliora.sh
python cliora/scripts/parse.py
    --cuda
    --visualize
    --obj_feats
    ... # other args

Word Embedding

We provide randomly-initialized word embedding, skip-thoughts embedding and ELMo embedding. If you use ELMo embedding and specify the --elmo_cache_dir, then the context-insensitive ELMo vectors will be cached, making it much faster to load these vectors after the initial usage.

Example Usage:

word_emb=none/skip/elmo

python cliora/scripts/train.py
    --emb word_emb
    ... # other args

License

Copyright 2018, University of Massachusetts Amherst

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Owner
Bo Wan
Visual UnderStanding; Computer Vision
Bo Wan
Python Fanduel API (2021) - Lineup Automation

Southpaw is a python package that provides access to the Fanduel API. Optimize your DFS experience by programmatically updating your lineups, analyzin

Brandin Canfield 13 Jan 04, 2023
Website for D2C paper

D2C This is the repository that contains source code for the D2C Website. If you find D2C useful for your work please cite: @article{sinha2021d2c au

1 Oct 21, 2021
[SIGGRAPH 2021 Asia] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning

DeepVecFont This is the official Pytorch implementation of the paper: Yizhi Wang and Zhouhui Lian. DeepVecFont: Synthesizing High-quality Vector Fonts

Yizhi Wang 146 Dec 18, 2022
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering

Graph ConvNets in PyTorch October 15, 2017 Xavier Bresson http://www.ntu.edu.sg/home/xbresson https://github.com/xbresson https://twitter.com/xbresson

Xavier Bresson 287 Jan 04, 2023
Laplacian Score-regularized Concrete Autoencoders

Laplacian Score-regularized Concrete Autoencoders Requirements: torch = 1.9 scikit-learn = 0.24 omegaconf = 2.0.6 scipy = 1.6.0 matplotlib How to

JS 6 Dec 07, 2022
A PyTorch implementation of the architecture of Mask RCNN

EDIT (AS OF 4th NOVEMBER 2019): This implementation has multiple errors and as of the date 4th, November 2019 is insufficient to be utilized as a reso

Sai Himal Allu 975 Dec 30, 2022
As-ViT: Auto-scaling Vision Transformers without Training

As-ViT: Auto-scaling Vision Transformers without Training [PDF] Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou In ICLR 2

VITA 68 Sep 05, 2022
Code for "AutoMTL: A Programming Framework for Automated Multi-Task Learning"

AutoMTL: A Programming Framework for Automated Multi-Task Learning This is the website for our paper "AutoMTL: A Programming Framework for Automated M

Ivy Zhang 40 Dec 04, 2022
Contrastively Disentangled Sequential Variational Audoencoder

Contrastively Disentangled Sequential Variational Audoencoder (C-DSVAE) Overview This is the implementation for our C-DSVAE, a novel self-supervised d

Junwen Bai 35 Dec 24, 2022
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection

Deep learning for time series forecasting Flow forecast is an open-source deep learning for time series forecasting framework. It provides all the lat

AIStream 1.2k Jan 04, 2023
Invasive Plant Species Identification

Invasive_Plant_Species_Identification Used LiDAR Odometry and Mapping (LOAM) to create a 3D point cloud map which can be used to identify invasive pla

2 May 12, 2022
Large-scale language modeling tutorials with PyTorch

Large-scale language modeling tutorials with PyTorch 안녕하세요. 저는 TUNiB에서 머신러닝 엔지니어로 근무 중인 고현웅입니다. 이 자료는 대규모 언어모델 개발에 필요한 여러가지 기술들을 소개드리기 위해 마련하였으며 기본적으로

TUNiB 172 Dec 29, 2022
A python library for self-supervised learning on images.

Lightly is a computer vision framework for self-supervised learning. We, at Lightly, are passionate engineers who want to make deep learning more effi

Lightly 2k Jan 08, 2023
Implementation of "Learning to Match Features with Seeded Graph Matching Network" ICCV2021

SGMNet Implementation PyTorch implementation of SGMNet for ICCV'21 paper "Learning to Match Features with Seeded Graph Matching Network", by Hongkai C

87 Dec 11, 2022
Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021) Jiaxi Jiang, Kai Zhang, Radu Timofte Computer Vision Lab, ETH Zurich, Switzerland 🔥

Jiaxi Jiang 282 Jan 02, 2023
PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

48 Dec 08, 2022
A two-stage U-Net for high-fidelity denoising of historical recordings

A two-stage U-Net for high-fidelity denoising of historical recordings Official repository of the paper (not submitted yet): E. Moliner and V. Välimäk

Eloi Moliner Juanpere 57 Jan 05, 2023
ARAE-Tensorflow for Discrete Sequences (Adversarially Regularized Autoencoder)

ARAE Tensorflow Code Code for the paper Adversarially Regularized Autoencoders for Generating Discrete Structures by Zhao, Kim, Zhang, Rush and LeCun

19 Nov 12, 2021
Source code for the BMVC-2021 paper "SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation".

SimReg: A Simple Regression Based Framework for Self-supervised Knowledge Distillation Source code for the paper "SimReg: Regression as a Simple Yet E

9 Oct 15, 2022
NVIDIA container runtime

nvidia-container-runtime A modified version of runc adding a custom pre-start hook to all containers. If environment variable NVIDIA_VISIBLE_DEVICES i

NVIDIA Corporation 938 Jan 06, 2023