Code for ICCV 2021 paper Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes using Scene Graphs

Overview

Graph-to-3D

This is the official implementation of the paper Graph-to-3d: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs | arxiv
Helisa Dhamo*, Fabian Manhardt*, Nassir Navab, Federico Tombari
ICCV 2021

We address the novel problem of fully-learned 3D scene generation and manipulation from scene graphs, in which a user can specify in the nodes or edges of a semantic graph what they wish to see in the 3D scene.

If you find this code useful in your research, please cite

@inproceedings{graph2scene2021,
  title={Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes using Scene Graphs},
  author={Dhamo, Helisa and Manhardt, Fabian and Navab, Nassir and Tombari, Federico},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2021}
}

Setup

We have tested it on Ubuntu 16.04 with Python 3.7 and PyTorch 1.2.0

Code

# clone this repository and move there
git clone https://github.com/he-dhamo/graphto3d.git
cd graphto3d
# create a conda environment and install the requirments
conda create --name g2s_env python=3.7 --file requirements.txt 
conda activate g2s_env          # activate virtual environment
# install pytorch and cuda version as tested in our work
conda install pytorch==1.2.0 cudatoolkit=10.0 -c pytorch
# more pip installations
pip install tensorboardx graphviz plyfile open3d==0.9.0.0 open3d-python==0.7.0.0 
# Set python path to current project
export PYTHONPATH="$PWD"

To evaluate shape diversity, you will need to setup the Chamfer distance. Download the extension folder from the AtlasNetv2 repo and install it following their instructions:

cd ./extension
python setup.py install

To download our checkpoints for our trained models and the Atlasnet weights used to obtain shape features:

cd ./experiments
chmod +x ./download_checkpoints.sh && ./download_checkpoints.sh

Dataset

Download the 3RScan dataset from their official site. You will need to download the following files using their script:

python download.py -o /path/to/3RScan/ --type semseg.v2.json
python download.py -o /path/to/3RScan/ --type labels.instances.annotated.v2.ply

Additionally, download the metadata for 3RScan:

cd ./GT
chmod +x ./download_metadata_3rscan.sh && ./download_metadata_3rscan.sh

Download the 3DSSG data files to the ./GT folder:

chmod +x ./download_3dssg.sh && ./download_3dssg.sh

We use the scene splits with up to 9 objects per scene from the 3DSSG paper. The relationships here are preprocessed to avoid the two-sided annotation for spatial relationships, as these can lead to paradoxes in the manipulation task. Finally, you will need our directed aligned 3D bounding boxes introduced in our project page. The following scripts downloads these data.

chmod +x ./download_postproc_3dssg.sh && ./download_postproc_3dssg.sh

Run the transform_ply.py script from this repo to obtain 3RScan scans in the correct alignment:

cd ..
python scripts/transform_ply.py --data_path /path/to/3RScan

Training

To train our main model with shared shape and layout embedding run:

python scripts/train_vaegan.py --network_type shared --exp ./experiments/shared_model --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual True

To run the variant with separate (disentangled) layout and shape features:

python scripts/train_vaegan.py --network_type dis --exp ./experiments/separate_baseline --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual True

For the 3D-SLN baseline run:

python scripts/train_vaegan.py --network_type sln --exp ./experiments/sln_baseline --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual False --with_manipulator False --with_changes False --weight_D_box 0 --with_shape_disc False

One relevant parameter is --with_feats. If set to true, this tries to read shape features directly instead of reading point clouds and feading them in AtlasNet to obtain the feature. If features are not yet to be found, it generates them during the first epoch, and reads these stored features instead of points in the next epochs. This saves a lot of time at training.

Each training experiment generates an args.json configuration file that can be used to read the right parameters during evaluation.

Evaluation

To evaluate the models run

python scripts/evaluate_vaegan.py --dataset_3RScan ../3RScan_v2/data/ --exp ./experiments/final_checkpoints/shared --with_points False --with_feats True --epoch 100 --path2atlas ./experiments/atlasnet/model_70.pth --evaluate_diversity False

Set --evaluate_diversity to True if you want to compute diversity. This takes a while, so it's disabled by default. To run the 3D-SLN baseline, or the variant with separate layout and shape features, simply provide the right experiment folder in --exp.

Acknowledgements

This repository contains code parts that are based on 3D-SLN and AtlasNet. We thank the authors for making their code available.

Owner
Helisa Dhamo
Helisa Dhamo
Implementation of DropLoss for Long-Tail Instance Segmentation in Pytorch

[AAAI 2021]DropLoss for Long-Tail Instance Segmentation [AAAI 2021] DropLoss for Long-Tail Instance Segmentation Ting-I Hsieh*, Esther Robb*, Hwann-Tz

Tim 37 Dec 02, 2022
Official PyTorch implementation of RIO

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection Figure 1: Our proposed Resampling at image-level and obect-

NVIDIA Research Projects 17 May 20, 2022
A variational Bayesian method for similarity learning in non-rigid image registration (CVPR 2022)

A variational Bayesian method for similarity learning in non-rigid image registration We provide the source code and the trained models used in the re

daniel grzech 14 Nov 21, 2022
🔥RandLA-Net in Tensorflow (CVPR 2020, Oral & IEEE TPAMI 2021)

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds (CVPR 2020) This is the official implementation of RandLA-Net (CVPR2020, Oral

Qingyong 1k Dec 30, 2022
You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors

You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors In this paper, we propose a novel local descriptor-based fra

Haiping Wang 80 Dec 15, 2022
Python script that allows you to automatically setup your Growtopia server.

AutoSetup Python script that allows you to automatically setup your Growtopia server. How To Use Firstly, install all the required modules that used i

Aspire 3 Mar 06, 2022
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning

Human-Level Control through Deep Reinforcement Learning Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning. This imp

Devsisters Corp. 2.4k Dec 26, 2022
DIT is a DTLS MitM proxy implemented in Python 3. It can intercept, manipulate and suppress datagrams between two DTLS endpoints and supports psk-based and certificate-based authentication schemes (RSA + ECC).

DIT - DTLS Interception Tool DIT is a MitM proxy tool to intercept DTLS traffic. It can intercept, manipulate and/or suppress DTLS datagrams between t

52 Nov 30, 2022
Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".

RNN-MBP Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring (AAAI-2022) by Chao Zhu, Hang Dong, Jinshan Pan

SIV-LAB 22 Aug 31, 2022
CM building dataset Timisoara

CM_building_dataset_Timisoara Date created: Febr-2020 The Timi\c{s}oara Building Dataset - TMBuD - is composed of 160 images with the resolution of 76

Orhei Ciprian 5 Sep 07, 2022
Code release for Universal Domain Adaptation(CVPR 2019)

Universal Domain Adaptation Code release for Universal Domain Adaptation(CVPR 2019) Requirements python 3.6+ PyTorch 1.0 pip install -r requirements.t

THUML @ Tsinghua University 229 Dec 23, 2022
A TensorFlow implementation of the Mnemonic Descent Method.

MDM A Tensorflow implementation of the Mnemonic Descent Method. Mnemonic Descent Method: A recurrent process applied for end-to-end face alignment G.

123 Oct 07, 2022
Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification

next_best_view_rl Setup Clone the repository: git clone --recurse-submodules ... In 'third_party/zed-ros-wrapper': git checkout devel Install mujoco `

Christian Korbach 1 Feb 15, 2022
End-to-End Object Detection with Fully Convolutional Network

This project provides an implementation for "End-to-End Object Detection with Fully Convolutional Network" on PyTorch.

472 Dec 22, 2022
A library for graph deep learning research

Documentation | Paper [JMLR] | Tutorials | Benchmarks | Examples DIG: Dive into Graphs is a turnkey library for graph deep learning research. Why DIG?

DIVE Lab, Texas A&M University 1.3k Jan 01, 2023
Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

1.9k Dec 28, 2022
official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu

77 Dec 27, 2022
Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection The official PyTorch implementation for HLA-Face: Joint High-Low Adaptation for Low L

Wenjing Wang 77 Dec 08, 2022
A scientific and useful toolbox, which contains practical and effective long-tail related tricks with extensive experimental results

Bag of tricks for long-tailed visual recognition with deep convolutional neural networks This repository is the official PyTorch implementation of AAA

Yong-Shun Zhang 181 Dec 28, 2022
A series of convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.

imutils A series of convenience functions to make basic image processing functions such as translation, rotation, resizing, skeletonization, and displ

Adrian Rosebrock 4.3k Jan 08, 2023