Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Last update: Dec 15, 2022

Overview

NonCuboidRoom

Paper

Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Cheng Yang*, Jia Zheng*, Xili Dai, Rui Tang, Yi Ma, Xiaojun Yuan.

[Preprint] [Supplementary Material]

(*: Equal contribution)

Installation

The code is tested with Ubuntu 16.04, PyTorch v1.5, CUDA 10.1 and cuDNN v7.6.

# create conda env
conda create -n layout python=3.6
# activate conda env
conda activate layout
# install pytorch
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
# install dependencies
pip install -r requirements.txt

Data Preparation

Structured3D Dataset

Please download Structured3D dataset and our processed 2D line annotations. The directory structure should look like:

data
└── Structured3D
    │── Structured3D
    │   ├── scene_00000
    │   ├── scene_00001
    │   ├── scene_00002
    │   └── ...
    └── line_annotations.json

SUN RGB-D Dataset

Please download SUN RGB-D dataset, our processed 2D line annotation for SUN RGB-D dataset, and layout annotations of NYUv2 303 dataset. The directory structure should look like:

data
└── SUNRGBD
    │── SUNRGBD
    │    ├── kv1
    │    ├── kv2
    │    ├── realsense
    │    └── xtion
    │── sunrgbd_train.json      // our extracted 2D line annotations of SUN RGB-D train set
    │── sunrgbd_test.json       // our extracted 2D line annotations of SUN RGB-D test set
    └── nyu303_layout_test.npz  // 2D ground truth layout annotations provided by NYUv2 303 dataset

Pre-trained Models

You can download our pre-trained models here:

The model trained on Structured3D dataset.
The model trained on SUN RGB-D dataset and NYUv2 303 dataset.

Structured3D Dataset

To train the model on the Structured3D dataset, run this command:

python train.py --model_name s3d --data Structured3D

To evaluate the model on the Structured3D dataset, run this command:

python test.py --pretrained DIR --data Structured3D

NYUv2 303 Dataset

To train the model on the SUN RGB-D dataset and NYUv2 303 dataset, run this command:

# first fine-tune the model on the SUN RGB-D dataset
python train.py --model_name sunrgbd --data SUNRGBD --pretrained Structure3D_DIR --split all --lr_step []
# Then fine-tune the model on the NYUv2 subset
python train.py --model_name nyu --data SUNRGBD --pretrained SUNRGBD_DIR --split nyu --lr_step [] --epochs 10

To evaluate the model on the NYUv2 303 dataset, run this command:

python test.py --pretrained DIR --data NYU303

Inference on the customized data

To predict the results of customized images, run this command:

python test.py --pretrained DIR --data CUSTOM

Citation

@article{NonCuboidRoom,
  title   = {Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image},
  author  = {Cheng Yang and
             Jia Zheng and
             Xili Dai and
             Rui Tang and
             Yi Ma and
             Xiaojun Yuan},
  journal = {CoRR},
  volume  = {abs/2104.07986},
  year    = {2021}
}

LICENSE

The code is released under the MIT license. Portions of the code are borrowed from HRNet-Object-Detection and CenterNet.

Acknowledgements

We would like to thank Lei Jin for providing us the code for parsing the layout annotations in SUN RGB-D dataset.

Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Related tags

Overview

NonCuboidRoom

Paper

Installation

Data Preparation

Structured3D Dataset

SUN RGB-D Dataset

Pre-trained Models

Structured3D Dataset

NYUv2 303 Dataset

Inference on the customized data

Citation

LICENSE

Acknowledgements

Owner

Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21.

RaceBERT -- A transformer based model to predict race and ethnicty from names

SlotRefine: A Fast Non-Autoregressive Model forJoint Intent Detection and Slot Filling

Learnable Motion Coherence for Correspondence Pruning

Pytorch-diffusion - A basic PyTorch implementation of 'Denoising Diffusion Probabilistic Models'

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

TFOD-MASKRCNN - Tensorflow MaskRCNN With Python

This repository focus on Image Captioning & Video Captioning & Seq-to-Seq Learning & NLP

Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis

School of Artificial Intelligence at the Nanjing University (NJU)School of Artificial Intelligence at the Nanjing University (NJU)

This is the code of using DQN to play Sekiro .

Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021).

PyTorch implementation of DCT fast weight RNNs

Code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation”

Project looking into use of autoencoder for semi-supervised learning and comparing data requirements compared to supervised learning.

PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos

A Pytorch Implementation of [Source data‐free domain adaptation of object detector through domain

This is a classifier which basically predicts whether there is a gun law in a state or not, depending on various things like murder rates etc.

Code for our NeurIPS 2021 paper Mining the Benefits of Two-stage and One-stage HOI Detection

Code for "Diversity can be Transferred: Output Diversification for White- and Black-box Attacks"