Code for ICCV 2021 paper Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes using Scene Graphs

Overview

Graph-to-3D

This is the official implementation of the paper Graph-to-3d: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs | arxiv
Helisa Dhamo*, Fabian Manhardt*, Nassir Navab, Federico Tombari
ICCV 2021

We address the novel problem of fully-learned 3D scene generation and manipulation from scene graphs, in which a user can specify in the nodes or edges of a semantic graph what they wish to see in the 3D scene.

If you find this code useful in your research, please cite

@inproceedings{graph2scene2021,
  title={Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes using Scene Graphs},
  author={Dhamo, Helisa and Manhardt, Fabian and Navab, Nassir and Tombari, Federico},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2021}
}

Setup

We have tested it on Ubuntu 16.04 with Python 3.7 and PyTorch 1.2.0

Code

# clone this repository and move there
git clone https://github.com/he-dhamo/graphto3d.git
cd graphto3d
# create a conda environment and install the requirments
conda create --name g2s_env python=3.7 --file requirements.txt 
conda activate g2s_env          # activate virtual environment
# install pytorch and cuda version as tested in our work
conda install pytorch==1.2.0 cudatoolkit=10.0 -c pytorch
# more pip installations
pip install tensorboardx graphviz plyfile open3d==0.9.0.0 open3d-python==0.7.0.0 
# Set python path to current project
export PYTHONPATH="$PWD"

To evaluate shape diversity, you will need to setup the Chamfer distance. Download the extension folder from the AtlasNetv2 repo and install it following their instructions:

cd ./extension
python setup.py install

To download our checkpoints for our trained models and the Atlasnet weights used to obtain shape features:

cd ./experiments
chmod +x ./download_checkpoints.sh && ./download_checkpoints.sh

Dataset

Download the 3RScan dataset from their official site. You will need to download the following files using their script:

python download.py -o /path/to/3RScan/ --type semseg.v2.json
python download.py -o /path/to/3RScan/ --type labels.instances.annotated.v2.ply

Additionally, download the metadata for 3RScan:

cd ./GT
chmod +x ./download_metadata_3rscan.sh && ./download_metadata_3rscan.sh

Download the 3DSSG data files to the ./GT folder:

chmod +x ./download_3dssg.sh && ./download_3dssg.sh

We use the scene splits with up to 9 objects per scene from the 3DSSG paper. The relationships here are preprocessed to avoid the two-sided annotation for spatial relationships, as these can lead to paradoxes in the manipulation task. Finally, you will need our directed aligned 3D bounding boxes introduced in our project page. The following scripts downloads these data.

chmod +x ./download_postproc_3dssg.sh && ./download_postproc_3dssg.sh

Run the transform_ply.py script from this repo to obtain 3RScan scans in the correct alignment:

cd ..
python scripts/transform_ply.py --data_path /path/to/3RScan

Training

To train our main model with shared shape and layout embedding run:

python scripts/train_vaegan.py --network_type shared --exp ./experiments/shared_model --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual True

To run the variant with separate (disentangled) layout and shape features:

python scripts/train_vaegan.py --network_type dis --exp ./experiments/separate_baseline --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual True

For the 3D-SLN baseline run:

python scripts/train_vaegan.py --network_type sln --exp ./experiments/sln_baseline --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual False --with_manipulator False --with_changes False --weight_D_box 0 --with_shape_disc False

One relevant parameter is --with_feats. If set to true, this tries to read shape features directly instead of reading point clouds and feading them in AtlasNet to obtain the feature. If features are not yet to be found, it generates them during the first epoch, and reads these stored features instead of points in the next epochs. This saves a lot of time at training.

Each training experiment generates an args.json configuration file that can be used to read the right parameters during evaluation.

Evaluation

To evaluate the models run

python scripts/evaluate_vaegan.py --dataset_3RScan ../3RScan_v2/data/ --exp ./experiments/final_checkpoints/shared --with_points False --with_feats True --epoch 100 --path2atlas ./experiments/atlasnet/model_70.pth --evaluate_diversity False

Set --evaluate_diversity to True if you want to compute diversity. This takes a while, so it's disabled by default. To run the 3D-SLN baseline, or the variant with separate layout and shape features, simply provide the right experiment folder in --exp.

Acknowledgements

This repository contains code parts that are based on 3D-SLN and AtlasNet. We thank the authors for making their code available.

Owner
Helisa Dhamo
Helisa Dhamo
This is Official implementation for "Pose-guided Feature Disentangling for Occluded Person Re-Identification Based on Transformer" in AAAI2022

PFD:Pose-guided Feature Disentangling for Occluded Person Re-identification based on Transformer This repo is the official implementation of "Pose-gui

Tao Wang 93 Dec 18, 2022
Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Oh-My-Face This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, e

AiLin Huang 51 Nov 17, 2022
FlingBot: The Unreasonable Effectiveness of Dynamic Manipulations for Cloth Unfolding

This repository contains code for training and evaluating FlingBot in both simulation and real-world settings on a dual-UR5 robot arm setup for Ubuntu 18.04

Columbia Artificial Intelligence and Robotics Lab 70 Dec 06, 2022
Multi-objective constrained optimization for energy applications via tree ensembles

Multi-objective constrained optimization for energy applications via tree ensembles

C⚙G - Imperial College London 1 Nov 19, 2021
Search and filter videos based on objects that appear in them using convolutional neural networks

Thingscoop: Utility for searching and filtering videos based on their content Description Thingscoop is a command-line utility for analyzing videos se

Anastasis Germanidis 354 Dec 04, 2022
SIR model parameter estimation using a novel algorithm for differentiated uniformization.

TenSIR Parameter estimation on epidemic data under the SIR model using a novel algorithm for differentiated uniformization of Markov transition rate m

The Spang Lab 4 Nov 30, 2022
Official Pytorch implementation for video neural representation (NeRV)

NeRV: Neural Representations for Videos (NeurIPS 2021) Project Page | Paper | UVG Data Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav S

hao 214 Dec 28, 2022
Implements MLP-Mixer: An all-MLP Architecture for Vision.

MLP-Mixer-CIFAR10 This repository implements MLP-Mixer as proposed in MLP-Mixer: An all-MLP Architecture for Vision. The paper introduces an all MLP (

Sayak Paul 51 Jan 04, 2023
PyTorch code for Composing Partial Differential Equations with Physics-Aware Neural Networks

FInite volume Neural Network (FINN) This repository contains the PyTorch code for models, training, and testing, and Python code for data generation t

Cognitive Modeling 20 Dec 18, 2022
Official repository for Natural Image Matting via Guided Contextual Attention

GCA-Matting: Natural Image Matting via Guided Contextual Attention The source codes and models of Natural Image Matting via Guided Contextual Attentio

Li Yaoyi 349 Dec 26, 2022
HybVIO visual-inertial odometry and SLAM system

HybVIO A visual-inertial odometry system with an optional SLAM module. This is a research-oriented codebase, which has been published for the purposes

Spectacular AI 320 Jan 03, 2023
This code is a near-infrared spectrum modeling method based on PCA and pls

Nirs-Pls-Corn This code is a near-infrared spectrum modeling method based on PCA and pls 近红外光谱分析技术属于交叉领域,需要化学、计算机科学、生物科学等多领域的合作。为此,在(北邮邮电大学杨辉华老师团队)指导下

Fu Pengyou 6 Dec 17, 2022
Locally cache assets that are normally streamed in POPULATION: ONE

Population One Localizer This is no longer needed as of the build shipped on 03/03/22, thank you bigbox :) Locally cache assets that are normally stre

Ahman Woods 2 Mar 04, 2022
Get started learning C# with C# notebooks powered by .NET Interactive and VS Code.

.NET Interactive Notebooks for C# Welcome to the home of .NET interactive notebooks for C#! How to Install Download the .NET Coding Pack for VS Code f

.NET Platform 425 Dec 25, 2022
Official PyTorch Implementation of Hypercorrelation Squeeze for Few-Shot Segmentation, arXiv 2021

Hypercorrelation Squeeze for Few-Shot Segmentation This is the implementation of the paper "Hypercorrelation Squeeze for Few-Shot Segmentation" by Juh

Juhong Min 165 Dec 28, 2022
SOTA model in CIFAR10

A PyTorch Implementation of CIFAR Tricks 调研了CIFAR10数据集上各种trick,数据增强,正则化方法,并进行了实现。目前项目告一段落,如果有更好的想法,或者希望一起维护这个项目可以提issue或者在我的主页找到我的联系方式。 0. Requirement

PJDong 58 Dec 21, 2022
Python wrapper of LSODA (solving ODEs) which can be called from within numba functions.

numbalsoda numbalsoda is a python wrapper to the LSODA method in ODEPACK, which is for solving ordinary differential equation initial value problems.

Nick Wogan 52 Jan 09, 2023
Unsupervised MRI Reconstruction via Zero-Shot Learned Adversarial Transformers

Official TensorFlow implementation of the unsupervised reconstruction model using zero-Shot Learned Adversarial TransformERs (SLATER). (https://arxiv.

ICON Lab 22 Dec 22, 2022
Classify music genre from a 10 second sound stream using a Neural Network.

MusicGenreClassification Academic research in the field of Deep Learning (Deep Neural Networks) and Sound Processing, Tel Aviv University. Featured in

Matan Lachmish 453 Dec 27, 2022
Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition" Pre-trained Deep Convo

Ankush Malaker 5 Nov 11, 2022