Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

Last update: Dec 27, 2022

This repository contains the PyTorch implementation of the paper: Yichao Zhou, Haozhi Qi, Yuexiang Zhai, Qi Sun, Zhili Chen, Li-Yi Wei, Yi Ma. "Learning to Reconstruct 3D Manhattan Wireframes From a Single Image", ICCV 2019.

Introduction

The goal of this project is to explore the idea of reconstructing high-quality, compact, CAD-like 3D models from images. We propose a method that creates an accurate 3D wireframe representation from a single image by exploiting global structural regularities. Our method uses a convolutional neural network to simultaneously detect salient junctions and straight lines and to predict their 3D depth and vanishing points.
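To illustrate the kind of structural regularity a Manhattan scene provides, here is a minimal sketch (not code from this repository). It assumes a pinhole camera with a known focal length in pixels, vanishing points given relative to the principal point, and a camera looking along +z. The function names `vp_to_direction` and `third_manhattan_direction` are hypothetical. Under the Manhattan assumption the three dominant directions are mutually orthogonal, so two vanishing points determine the third direction up to sign.

```python
import numpy as np


def vp_to_direction(vp, focal):
    """Map a vanishing point (x, y), given in pixels relative to the
    principal point, to a unit 3D direction. Assumes a pinhole camera
    looking along +z (a convention chosen for this sketch)."""
    d = np.array([vp[0], vp[1], focal], dtype=float)
    return d / np.linalg.norm(d)


def third_manhattan_direction(vp1, vp2, focal):
    """Given two vanishing points of orthogonal directions, return the
    third Manhattan direction as their cross product (defined up to sign)."""
    d1 = vp_to_direction(vp1, focal)
    d2 = vp_to_direction(vp2, focal)
    d3 = np.cross(d1, d2)
    return d3 / np.linalg.norm(d3)


if __name__ == "__main__":
    # Two vanishing points whose directions are orthogonal for focal = 1.
    print(third_manhattan_direction((1.0, 0.0), (-1.0, 0.0), focal=1.0))
```

The result is only meaningful when the two input directions are themselves close to orthogonal, which is something a real pipeline would need to check or enforce.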

Qualitative Results

[Figure: pairs of input images and their predicted 3D wireframes.]

Code Structure

Below is a quick overview of the function of key files.

########################### Data ###########################
data/
    SU3/                        # default folder for the scenecity 3D dataset
logs/                           # default folder for storing the output during training
########################### Code ###########################
config/                         # neural network hyper-parameters and configurations
wireframe/                      # module so you can "import wireframe" in scripts
train.py                        # script for training and evaluating the neural network
vectorize_u3d.py                # script for turning the 2.5D results into 3D wireframe

Reproducing Results

Installation

We suggest installing miniconda before executing the following commands.

git clone https://github.com/zhou13/shapeunity
cd shapeunity
conda create -y -n shapeunity
source activate shapeunity
conda install -y pyyaml docopt matplotlib scikit-image opencv tqdm
# Replace cudatoolkit=10.2 with your CUDA version: https://pytorch.org/get-started/
conda install -y pytorch cudatoolkit=10.2 -c pytorch
python -m pip install --upgrade vispy cvxpy
mkdir data logs

Downloading the Processed Datasets

Make sure curl is installed on your system and execute:

cd data
../misc/gdrive-download.sh 1-TABJjT4-_yzE-iRD-n_yIJ9Kwzzkm7X SU3.zip
unzip SU3.zip
rm *.zip
cd ..

Note: If your downloaded zip file is corrupted, it is likely due to the restriction on the amount of data that can be downloaded from my account per day. In that case, you can try to download the pre-processed dataset manually from our Google Drive and proceed accordingly.
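If you suspect a corrupted download, a quick check before unzipping can save time. The following is a minimal sketch using only the Python standard library. The helper name `zip_is_intact` is hypothetical and not part of this repository.

```python
import zipfile


def zip_is_intact(path):
    """Return True if `path` is a zip archive whose members all pass
    their CRC checks, and False otherwise."""
    if not zipfile.is_zipfile(path):
        # Not a zip at all, e.g. a truncated file or an error page.
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first bad member,
            # or None when every member is readable.
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False


if __name__ == "__main__":
    print(zip_is_intact("data/SU3.zip"))
```

Run it from the repository root after the download step and before `rm *.zip`. If it prints False, download the archive manually as described above.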

Downloading the Pre-trained Models

Execute the following command to download and unzip the pre-trained models.

cd logs
../misc/gdrive-download.sh 1AuE3yje7jTRne2KjiVdxAWo1UT03i16a pretrained-wireframe.zip
../misc/gdrive-download.sh 1YwPMbAHnxSA3BgiM5Q26mKSTjd46OYRo pretrained-vanishing-points.zip
unzip pretrained-wireframe.zip
unzip pretrained-vanishing-points.zip
rm *.zip
cd ..

Alternatively, you can download them at this Google Drive link and this Google Drive link, respectively.

Training (Optional)

If you want to train the model yourself rather than using the pre-trained models, execute the following commands to train the neural networks from scratch with four GPUs (specified by -d 0,1,2,3):

python ./train.py -d 0,1,2,3 --identifier baseline config/hourglass.yaml

The checkpoints and logs will be written to logs/ accordingly.

Note that vanishing point prediction is only supported on the git branch vanishing-points. To train the vanishing point network, first switch to that branch with git checkout vanishing-points.

Predicting the 2.5D Wireframe (Optional)

Execute the following command to evaluate the neural network on the validation split:

python train.py --eval -d 0 -i default --from logs/pretrained-wireframe/checkpoint_latest.pth.tar logs/pretrained-wireframe/config.yaml

This command should generate a new folder under the logs directory with results in the npz folders.
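To see what the generated files contain, you can list the arrays stored in a few of them. The sketch below makes no assumption about the array names inside the files. The helper `summarize_npz_dir` is hypothetical and only relies on `numpy.load`.

```python
import glob
import os

import numpy as np


def summarize_npz_dir(npz_dir, limit=3):
    """Return {filename: {array_name: shape}} for the first `limit`
    .npz files (sorted by name) found in `npz_dir`."""
    summary = {}
    paths = sorted(glob.glob(os.path.join(npz_dir, "*.npz")))
    for path in paths[:limit]:
        with np.load(path) as data:
            summary[os.path.basename(path)] = {
                name: data[name].shape for name in data.files
            }
    return summary


if __name__ == "__main__":
    # Example path taken from the commands in this README.
    result = summarize_npz_dir("logs/pretrained-wireframe/npz/003576000")
    for filename, arrays in result.items():
        print(filename, arrays)
```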

Vectorization & Visualization

To visualize the working examples of ShapeUnity, execute the following commands:

python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 57
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 100
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 109
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 141
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 299

Evaluation (Optional)

To quantitatively evaluate the wireframe quality of ShapeUnity, execute the following command:

python eval_2d3d_metric.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000

The details of the sAP-10 metric can be found in the LCNN paper.
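For intuition, here is a simplified sketch of a structural AP computation. It is not the code in eval_2d3d_metric.py, and it rests on these assumptions: a predicted line counts as a true positive when the sum of squared distances between its two endpoints and those of its nearest ground-truth line is below the threshold (10 for sAP-10), endpoint order does not matter, each ground-truth line can be matched only once, and all coordinates are already at the evaluation resolution. The function name `structural_ap` is hypothetical.

```python
import numpy as np


def structural_ap(pred_lines, scores, gt_lines, threshold=10.0):
    """Simplified structural AP for 2D line segments.

    pred_lines: (P, 2, 2) predicted segments, two endpoints each.
    scores:     (P,) confidence per predicted segment.
    gt_lines:   (G, 2, 2) ground-truth segments.
    """
    pred = np.asarray(pred_lines, dtype=float).reshape(-1, 2, 2)
    gt = np.asarray(gt_lines, dtype=float).reshape(-1, 2, 2)
    scores = np.asarray(scores, dtype=float)
    if len(pred) == 0 or len(gt) == 0:
        return 0.0

    # Process predictions from most to least confident.
    pred = pred[np.argsort(-scores)]

    # Sum of squared endpoint distances for both endpoint orderings, (P, G).
    direct = ((pred[:, None] - gt[None]) ** 2).sum(axis=(2, 3))
    swapped = ((pred[:, None] - gt[None, :, ::-1]) ** 2).sum(axis=(2, 3))
    dist = np.minimum(direct, swapped)

    # Greedy matching: each prediction claims its nearest ground truth once.
    nearest = dist.argmin(axis=1)
    matched = np.zeros(len(gt), dtype=bool)
    tp = np.zeros(len(pred))
    for i, j in enumerate(nearest):
        if dist[i, j] < threshold and not matched[j]:
            matched[j] = True
            tp[i] = 1.0

    cum_tp = np.cumsum(tp)
    recall = cum_tp / len(gt)
    precision = cum_tp / np.arange(1, len(pred) + 1)

    # Area under the monotonically decreasing precision envelope.
    recall = np.concatenate(([0.0], recall, [1.0]))
    precision = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(precision) - 1, 0, -1):
        precision[i - 1] = max(precision[i - 1], precision[i])
    idx = np.where(recall[1:] != recall[:-1])[0]
    return float(np.sum((recall[idx + 1] - recall[idx]) * precision[idx + 1]))
```

The greedy nearest-neighbour matching means a second prediction close to an already matched ground-truth line counts as a false positive, which penalizes duplicate detections.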

Acknowledgement

This work is supported by a research grant from Sony Research. We thank Xili Dai for providing the sAP evaluation script for the project.

Citing ShapeUnity

If you find this project useful in your research, please consider citing:

@inproceedings{zhou2019learning,
  title={Learning to Reconstruct 3D Manhattan Wireframes From a Single Image},
  author={Zhou, Yichao and Qi, Haozhi and Zhai, Yuexiang and Sun, Qi and Chen, Zhili and Wei, Li-Yi and Ma, Yi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2019}
}
