A brand new hub for Scene Graph Generation methods based on MMdetection (2021). The pipeline of from detection, scene graph generation to downstream tasks (e.g., image cpationing) is supported. Pytorch version implementation of HetH (ECCV 2020) and TopicSG (ICCV 2021) is included.

Overview

MMSceneGraph

LICENSE Python PyTorch

Introduction

MMSceneneGraph is an open source code hub for scene graph generation as well as supporting downstream tasks based on the scene graph on PyTorch. The frontend object detector is supported by open-mmlab/mmdetection.

demo image

Major features

  • Modular design

    We decompose the framework into different components and one can easily construct a customized scene graph generation framework by combining different modules.

  • Support of multiple frameworks out of box

    The toolbox directly supports popular and contemporary detection frameworks, e.g. Faster RCNN, Mask RCNN, etc.

  • Visualization support

    The visualization of the groundtruth/predicted scene graph is integrated into the toolbox.

License

This project is released under the MIT license.

Changelog

Please refer to CHANGELOG.md for details.

Benchmark and model zoo

The original object detection results and models provided by mmdetection are available in the model zoo. The models for the scene graph generation are temporarily unavailable yet.

Supported methods and Datasets

Supported SGG (VRD) methods:

  • Neural Motifs (CVPR'2018)
  • VCTree (CVPR'2019)
  • TDE (CVPR'2020)
  • VTransE (CVPR'2017)
  • IMP (CVPR'2017)
  • KERN (CVPR'2019)
  • GPSNet (CVPR'2020)
  • HetH (ECCV'2020, ours)
  • TopicSG (ICCV'2021, ours)

Supported saliency object detection methods:

  • R3Net (IJCAI'2018)
  • SCRN (ICCV'2019)

Supported image captioning methods:

  • bottom-up (CVPR'2018)
  • XLAN (CVPR'2020)

Supported datasets:

  • Visual Genome: VG150 (CVPR'2017)
  • VRD (ECCV'2016)
  • Visual Genome: VG200/VG-KR (ours)
  • MSCOCO (for object detection, image caption)
  • RelCap (from VG and COCO, ours)

Installation

As our project is built on mmdetection 1.x (which is a bit different from their current master version 2.x), please refer to INSTALL.md. If you want to use mmdetection 2.x, please refer to mmdetection/get_start.md.

Getting Started

Please refer to GETTING_STARTED.md for using the projects. We will update it constantly.

Acknowledgement

We appreciate the contributors of the mmdetection project and Scene-Graph-Benchmark.pytorch which inspires our design.

Citation

If you find this code hub or our works useful in your research works, please consider citing:

@inproceedings{wang2021topic,
  title={Topic Scene Graph Generation by Attention Distillation from Caption},
  author={Wang, Wenbin and Wang, Ruiping and Chen, Xilin},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={15900--15910},
  month = {October},
  year={2021}
}


@inproceedings{wang2020sketching,
  title={Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation},
  author={Wang, Wenbin and Wang, Ruiping and Shan, Shiguang and Chen, Xilin},
  booktitle={Proceedings of European Conference on Computer Vision (ECCV)},
  pages={222--239},
  year={2020},
  volume={12358},
  doi={10.1007/978-3-030-58601-0_14},
  publisher={Springer}
}

@InProceedings{Wang_2019_CVPR,
author = {Wang, Wenbin and Wang, Ruiping and Shan, Shiguang and Chen, Xilin},
title = {Exploring Context and Visual Pattern of Relationship for Scene Graph Generation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
pages = {8188-8197},
month = {June},
address = {Long Beach, California, USA},
doi = {10.1109/CVPR.2019.00838},
year = {2019}
}
Owner
Kenneth-Wong
http://www.kennethwong.tech/
Kenneth-Wong
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

HiFiGAN Denoiser This is a Unofficial Pytorch implementation of the paper HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep F

Rishikesh (ऋषिकेश) 134 Dec 27, 2022
Molecular Sets (MOSES): A benchmarking platform for molecular generation models

Molecular Sets (MOSES): A benchmarking platform for molecular generation models Deep generative models are rapidly becoming popular for the discovery

Neelesh C A 3 Oct 14, 2022
Pytorch implementation for "Open Compound Domain Adaptation" (CVPR 2020 ORAL)

Open Compound Domain Adaptation [Project] [Paper] [Demo] [Blog] Overview Open Compound Domain Adaptation (OCDA) is the author's re-implementation of t

Zhongqi Miao 137 Dec 15, 2022
Face Recognition plus identification simply and fast | Python

PyFaceDetection Face Recognition plus identification simply and fast Ubuntu Setup sudo pip3 install numpy sudo pip3 install cmake sudo pip3 install dl

Peyman Majidi Moein 16 Sep 22, 2022
Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild

Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild

1.1k Jan 03, 2023
Implementation of CVAE. Trained CVAE on faces from UTKFace Dataset to produce synthetic faces with a given degree of happiness/smileyness.

Conditional Smiles! (SmileCVAE) About Implementation of AE, VAE and CVAE. Trained CVAE on faces from UTKFace Dataset. Using an encoding of the Smile-s

Raúl Ortega 3 Jan 09, 2022
Unsupervised Representation Learning via Neural Activation Coding

Neural Activation Coding This repository contains the code for the paper "Unsupervised Representation Learning via Neural Activation Coding" published

yookoon park 5 May 26, 2022
Code and project page for ICCV 2021 paper "DisUnknown: Distilling Unknown Factors for Disentanglement Learning"

DisUnknown: Distilling Unknown Factors for Disentanglement Learning See introduction on our project page Requirements PyTorch = 1.8.0 torch.linalg.ei

Sitao Xiang 24 May 16, 2022
StyleGAN2-ADA - Official PyTorch implementation

Abstract: Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmenta

NVIDIA Research Projects 3.2k Dec 30, 2022
Image inpainting using Gaussian Mixture Models

dmfa_inpainting Source code for: MisConv: Convolutional Neural Networks for Missing Data (to be published at WACV 2022) Estimating conditional density

Marcin Przewięźlikowski 8 Oct 09, 2022
Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices

Face-Mesh Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices. It employs machine learning

Farnam Javadi 9 Dec 21, 2022
deep_image_prior_extension

Code for "Is Deep Image Prior in Need of a Good Education?" Project page: https://jleuschn.github.io/docs.educated_deep_image_prior/. Supplementary Ma

riccardo barbano 7 Jan 09, 2022
QA-GNN: Question Answering using Language Models and Knowledge Graphs

QA-GNN: Question Answering using Language Models and Knowledge Graphs This repo provides the source code & data of our paper: QA-GNN: Reasoning with L

Michihiro Yasunaga 434 Jan 04, 2023
Multi-Scale Aligned Distillation for Low-Resolution Detection (CVPR2021)

MSAD Multi-Scale Aligned Distillation for Low-Resolution Detection Lu Qi*, Jason Kuen*, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya J

Jia Research Lab 115 Dec 23, 2022
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific

flow-dev 2 Aug 21, 2022
A torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format

TorchArrow (Warning: Unstable Prototype) This is a prototype library currently under heavy development. It does not currently have stable releases, an

Facebook Research 536 Jan 06, 2023
PAthological QUpath Obsession - QuPath and Python conversations

PAQUO: PAthological QUpath Obsession Welcome to paquo 👋 , a library for interacting with QuPath from Python. paquo's goal is to provide a pythonic in

Bayer AG 60 Dec 31, 2022
Reading list for research topics in Masked Image Modeling

awesome-MIM Reading list for research topics in Masked Image Modeling(MIM). We list the most popular methods for MIM, if I missed something, please su

ligang 231 Dec 07, 2022
Monk is a low code Deep Learning tool and a unified wrapper for Computer Vision.

Monk - A computer vision toolkit for everyone Why use Monk Issue: Want to begin learning computer vision Solution: Start with Monk's hands-on study ro

Tessellate Imaging 507 Dec 04, 2022
Cross-modal Deep Face Normals with Deactivable Skip Connections

Cross-modal Deep Face Normals with Deactivable Skip Connections Victoria Fernández Abrevaya*, Adnane Boukhayma*, Philip H. S. Torr, Edmond Boyer (*Equ

72 Nov 27, 2022