Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

Overview

Learning to Reconstruct 3D Manhattan Wireframes From a Single Image

This repository contains the PyTorch implementation of the paper: Yichao Zhou, Haozhi Qi, Yuexiang Zhai, Qi Sun, Zhili Chen, Li-Yi Wei, Yi Ma. "Learning to Reconstruct 3D Manhattan Wireframes From a Single Image", ICCV 2019.

Introduction

The goal of this project is to explore the idea of reconstructing high-quality compact CAD-like 3D models from images. We propose a method to create accurate 3D wireframe representation from a single image by exploiting global structural regularities. Our method uses a convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points.

Qualitative Results

Input Predicted Input Predicted

Code Structure

Below is a quick overview of the function of key files.

########################### Data ###########################
data/
    SU3/                        # default folder for the scenecity 3D dataset
logs/                           # default folder for storing the output during training
########################### Code ###########################
config/                         # neural network hyper-parameters and configurations
wireframe/                      # module so you can "import wireframe" in scripts
train.py                        # script for training and evaluating the neural network
vectorize_u3d.py                # script for turning the 2.5D results into 3D wireframe

Reproducing Results

Installation

You are suggested to install miniconda before following executing the following commands.

git clone https://github.com/zhou13/shapeunity
cd shapeunity
conda create -y -n shapeunity
source activate shapeunity
conda install -y pyyaml docopt matplotlib scikit-image opencv tqdm
# Replace cudatoolkit=10.2 with your CUDA version: https://pytorch.org/get-started/
conda install -y pytorch cudatoolkit=10.2 -c pytorch
python -m pip install --upgrade vispy cvxpy
mkdir data logs

Downloading the Processed Datasets

Make sure curl is installed on your system and execute

cd data
../misc/gdrive-download.sh 1-TABJjT4-_yzE-iRD-n_yIJ9Kwzzkm7X SU3.zip
unzip SU3.zip
rm *.zip
cd ..

Note: If your downloaded zip file is corrupted, it is likely due to the restriction on the amount of data that can be downloaded from my account per day. In that case, you can try to download the pre-processed dataset manually from our Google Drive and proceed accordingly.

Downloading the Pre-trained Models

Execute the following command to download and unzip the pre-trained models.

cd logs
../misc/gdrive-download.sh 1AuE3yje7jTRne2KjiVdxAWo1UT03i16a pretrained-wireframe.zip
../misc/gdrive-download.sh 1YwPMbAHnxSA3BgiM5Q26mKSTjd46OYRo pretrained-vanishing-points.zip
unzip pretrained-wireframe.zip
unzip pretrained-vanishing-points.zip
rm *.zip
cd ..

Alternatively, you can download them at this Google Drive link and this Google Drive link, respectively.

Training (Optional)

If you want to train the model yourself rather than using the pre-trained models, execute the following commands to train the neural networks from scratch with four GPUs (specified by -d 0,1,2,3):

python ./train.py -d 0,1,2,3 --identifier baseline config/hourglass.yaml

The checkpoints and logs will be written to logs/ accordingly.

We note that vanishing points are only supported by the neural network under the git branch vanishing-points. You need to visit that part of the code with git checkout vanishing-points for training the network with the vanishing point branch.

Predicting the 2.5D Wireframe (Optional)

Execute the following command to evaluate the neural network on the validation split:

python train.py --eval -d 0 -i default --from logs/pretrained-wireframe/checkpoint_latest.pth.tar logs/pretrained-wireframe/config.yaml

This command should generate a new folder under the logs directory with results in the npz folders.

Vectorization & Visualization

To visualize the working examples of ShapeUnity, execute the following commands:

python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 57
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 100
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 109
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 141
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 299

Evaluation (Optional)

To quantitatively evaluate the wireframe quality of ShapeUnity, execute the following command:

python eval_2d3d_metric.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000

The details of the sAP-10 metric can be found in the paper LCNN.

Acknowledgement

This work is supported by a research grant from Sony Research. We thank Xili Dai for providing the sAP evaluation script for the project.

Citing ShapeUnity

If you find this project useful in your research, please consider citing:

@inproceedings{zhou2019learning,
  title={Learning to Reconstruct 3D Manhattan Wireframes From a Single Image},
  author={Zhou, Yichao and Qi, Haozhi and Zhai, Yuexiang and Sun, Qi and Chen, Zhili and Wei, Li-Yi and Ma, Yi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2019}
}
Owner
Yichao Zhou
Apple Inc. | Ph.D. at UC Berkeley
Yichao Zhou
Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch

Semantic Segmentation Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch Features Applicable to followin

sithu3 530 Jan 05, 2023
Code for the paper "Location-aware Single Image Reflection Removal"

Location-aware Single Image Reflection Removal The shown images are provided by the datasets from IBCLN, ERRNet, SIR2 and the Internet images. The cod

72 Dec 08, 2022
Empower Sequence Labeling with Task-Aware Language Model

LM-LSTM-CRF Check Our New NER Toolkit 🚀 🚀 🚀 Inference: LightNER: inference w. models pre-trained / trained w. any following tools, efficiently. Tra

Liyuan Liu 838 Jan 05, 2023
Lenia - Mathematical Life Forms

For full version list, see Timeline in Lenia portal [2020-10-13] Update Python version with multi-kernel and multi-channel extensions (v3.4 LeniaNDK.p

Bert Chan 3.1k Dec 28, 2022
Minimal implementation and experiments of "No-Transaction Band Network: A Neural Network Architecture for Efficient Deep Hedging".

No-Transaction Band Network: A Neural Network Architecture for Efficient Deep Hedging Minimal implementation and experiments of "No-Transaction Band N

19 Jan 03, 2023
pytorch bert intent classification and slot filling

pytorch_bert_intent_classification_and_slot_filling 基于pytorch的中文意图识别和槽位填充 说明 基本思路就是:分类+序列标注(命名实体识别)同时训练。 使用的预训练模型:hugging face上的chinese-bert-wwm-ext 依

西西嘛呦 33 Dec 15, 2022
Learn other languages ​​using artificial intelligence with python.

The main idea of ​​the project is to facilitate the learning of other languages. We created a simple AI that will interact with you. Just ask questions that if she knows, she will answer.

Pedro Rodrigues 2 Jun 07, 2022
A neuroanatomy-based augmented reality experience powered by computer vision. Features 3D visuals of the Atlas Brain Map slices.

Brain Augmented Reality (AR) A neuroanatomy-based augmented reality experience powered by computer vision that features 3D visuals of the Atlas Brain

Yasmeen Brain 10 Oct 06, 2022
JAXDL: JAX (Flax) Deep Learning Library

JAXDL: JAX (Flax) Deep Learning Library Simple and clean JAX/Flax deep learning algorithm implementations: Soft-Actor-Critic (arXiv:1812.05905) Transf

Patrick Hart 4 Nov 27, 2022
Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study.

APR The repo for the paper Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study. Environment setu

ielab 8 Nov 26, 2022
AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations

AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-l

Facebook Research 4.6k Jan 09, 2023
Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor.

Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor. It is devel

33 Nov 11, 2022
Self-Supervised Contrastive Learning of Music Spectrograms

Self-Supervised Music Analysis Self-Supervised Contrastive Learning of Music Spectrograms Dataset Songs on the Billboard Year End Hot 100 were collect

27 Dec 10, 2022
✅ How Robust are Fact Checking Systems on Colloquial Claims?. In NAACL-HLT, 2021.

How Robust are Fact Checking Systems on Colloquial Claims? Official PyTorch implementation of our NAACL paper: Byeongchang Kim*, Hyunwoo Kim*, Seokhee

Byeongchang Kim 19 Mar 15, 2022
This package is for running the semantic SLAM algorithm using extracted planar surfaces from the received detection

Semantic SLAM This package can perform optimization of pose estimated from VO/VIO methods which tend to drift over time. It uses planar surfaces extra

Hriday Bavle 125 Dec 02, 2022
I3-master-layout - Simple master and stack layout script

Simple master and stack layout script | ------ | ----- | | | | | Ma

Tobias S 18 Dec 05, 2022
The author's officially unofficial PyTorch BigGAN implementation.

BigGAN-PyTorch The author's officially unofficial PyTorch BigGAN implementation. This repo contains code for 4-8 GPU training of BigGANs from Large Sc

Andy Brock 2.6k Jan 02, 2023
This project provides an unsupervised framework for mining and tagging quality phrases on text corpora with pretrained language models (KDD'21).

UCPhrase: Unsupervised Context-aware Quality Phrase Tagging To appear on KDD'21...[pdf] This project provides an unsupervised framework for mining and

Xiaotao Gu 146 Dec 22, 2022
RobustVideoMatting and background composing in one model by using onnxruntime.

RVM_onnx_compose RobustVideoMatting and background composing in one model by using onnxruntime. Usage pip install -r requirements.txt python infer_cam

Quantum Liu 4 Apr 07, 2022
Sharing of contents on mitochondrial encounter networks

mito-network-sharing Sharing of contents on mitochondrial encounter networks Required: R with igraph, brainGraph, ggplot2, and XML libraries; igraph l

Stochastic Biology Group 0 Oct 01, 2021