DeepVoxels is an object-specific, persistent 3D feature embedding.

Overview

DeepVoxels

DeepVoxels is an object-specific, persistent 3D feature embedding. It is found by globally optimizing over all available 2D observations of an object in a deeplearning framework. At test time, the training set can be discarded, and DeepVoxels can be used to render novel views of the same object.

deepvoxels_video

Usage

Installation

This code was developed in python 3.7 and pytorch 1.0. I recommend to use anaconda for dependency management. You can create an environment with name "deepvoxels" with all dependencies like so:

conda env create -f environment.yml

High-Level structure

The code is organized as follows:

  • dataio.py loads training and testing data.
  • data_util.py and util.py contain utility functions.
  • run_deepvoxels.py contains the training and testing code as well as setting up the dataset, dataloading, command line arguments etc.
  • deep_voxels.py contains the core DeepVoxels model.
  • custom_layers.py contains implementations of the integration and occlusion submodules.
  • projection.py contains utility functions for 3D and projective geometry.

Data

The datasets have been rendered from a set of high-quality 3D scans of a variety of objects. The datasets are available for download here. Each object has its own directory, which is the directory that the "data_root" command-line argument of the run_deepvoxels.py script is pointed to.

Coordinate and camera parameter conventions

This code uses an "OpenCV" style camera coordinate system, where the Y-axis points downwards (the up-vector points in the negative Y-direction), the X-axis points right, and the Z-axis points into the image plane. Camera poses are assumed to be in a "camera2world" format, i.e., they denote the matrix transform that transforms camera coordinates to world coordinates.

The code also reads an "intrinsics.txt" file from the dataset directory. This file is expected to be structured as follows:

f cx cy
origin_x origin_y origin_z
near_plane (if 0, defaults to sqrt(3)/2)
scale
img_height img_width

The focal length, cx and cy are in pixels. (origin_x, origin_y, origin_z) denotes the origin of the voxel grid in world coordinates. The near plane is also expressed in world units. Per default, each voxel has a sidelength of 1 in world units - the scale is a factor that scales the sidelength of each voxel. Finally, height and width are the resolution of the image.

To create your own dataset, I recommend using the amazing, open-source Colmap. Follow the instructions on the website to install it. I have written a little wrapper in python that will automatically reconstruct a directory of images, and then extract the camera extrinsic & intrinsic camera parameters. It can be used like so:

python colmap_wrapper.py --img_dir [path to directory with images] \
                         --trgt_dir [path where output will be written to] 

To get the scale and origin of the voxel grid as well as the near plane, one has to inspect the reconstructed point cloud and manually edit the intrinsics.txt file written out by colmap_wrapper.py.

Training

  • See python run_deepvoxels.py --help for all train options. Example train call:
python run_deepvoxels.py --train_test train \
                         --data_root [path to directory with dataset] \
                         --logging_root [path to directory where tensorboard summaries and checkpoints should be written to] 

To monitor progress, the training code writes tensorboard summaries every 100 steps into a "runs" subdirectory in the logging_root.

Testing

Example test call:

python run_deepvoxels.py --train_test test \
                         --data_root [path to directory with dataset] ]
                         --logging_root [path to directoy where test output should be written to] \
                         --checkpoint [path to checkpoint]

Misc

Citation:

If you find our work useful in your research, please consider citing:

@inproceedings{sitzmann2019deepvoxels,
	author = {Sitzmann, Vincent 
	          and Thies, Justus 
	          and Heide, Felix 
	          and Nie{\ss}ner, Matthias 
	          and Wetzstein, Gordon 
	          and Zollh{\"o}fer, Michael},
	title = {DeepVoxels: Learning Persistent 3D Feature Embeddings},
	booktitle = {Proc. CVPR},
	year={2019}
}

Follow-up work

Check out our new project, Scene Representation Networks, where we replace the voxel grid with a continuous function that naturally generalizes across scenes and smoothly parameterizes scene surfaces!

Submodule "pytorch_prototyping"

The code in the subdirectory "pytorch_prototyping" comes from a little library of custom pytorch modules that I use throughout my research projects. You can find it here.

Other cool projects

Some of the code in this project is based on code from these two very cool papers:

Check them out!

Contact:

If you have any questions, please email Vincent Sitzmann at [email protected].

Owner
Vincent Sitzmann
Incoming Assistant Professor @mit EECS. I'm researching neural scene representations - the way neural networks learn to represent information on our world.
Vincent Sitzmann
CountDown to New Year and shoot fireworks

CountDown and Shoot Fireworks About App This is an small application make you re

5 Dec 31, 2022
Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase

Ranger-Deep-Learning-Optimizer Ranger - a synergistic optimizer combining RAdam (Rectified Adam) and LookAhead, and now GC (gradient centralization) i

Less Wright 1.1k Dec 21, 2022
This project deals with the detection of skin lesions within the ISICs dataset using YOLOv3 Object Detection with Darknet.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Skin Lesion detection using YOLO This project deal

Lalith Veerabhadrappa Badiger 1 Nov 22, 2021
Jittor Medical Segmentation Lib -- The assignment of Pattern Recognition course (2021 Spring) in Tsinghua University

THU模式识别2021春 -- Jittor 医学图像分割 模型列表 本仓库收录了课程作业中同学们采用jittor框架实现的如下模型: UNet SegNet DeepLab V2 DANet EANet HarDNet及其改动HarDNet_alter PSPNet OCNet OCRNet DL

48 Dec 26, 2022
Recursive Bayesian Networks

Recursive Bayesian Networks This repository contains the code to reproduce the results from the NeurIPS 2021 paper Lieck R, Rohrmeier M (2021) Recursi

Robert Lieck 11 Oct 18, 2022
An implementation of the [Hierarchical (Sig-Wasserstein) GAN] algorithm for large dimensional Time Series Generation

Hierarchical GAN for large dimensional financial market data Implementation This repository is an implementation of the [Hierarchical (Sig-Wasserstein

11 Nov 29, 2022
Hcpy - Interface with Home Connect appliances in Python

Interface with Home Connect appliances in Python This is a very, very beta inter

Trammell Hudson 116 Dec 27, 2022
Object recognition using Azure Custom Vision AI and Azure Functions

Step by Step on how to create an object recognition model using Custom Vision, export the model and run the model in an Azure Function

El Bruno 11 Jul 08, 2022
[MICCAI'20] AlignShift: Bridging the Gap of Imaging Thickness in 3D Anisotropic Volumes

AlignShift NEW: Code for our new MICCAI'21 paper "Asymmetric 3D Context Fusion for Universal Lesion Detection" will also be pushed to this repository

Medical 3D Vision 42 Jan 06, 2023
Good Semi-Supervised Learning That Requires a Bad GAN

Good Semi-Supervised Learning that Requires a Bad GAN This is the code we used in our paper Good Semi-supervised Learning that Requires a Bad GAN Ziha

Zhilin Yang 177 Dec 12, 2022
Implementation of ICCV21 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers

Implementation of ICCV 2021 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers arxiv This repository is based on detr Recently, DETR

twang 113 Dec 27, 2022
An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

0 May 06, 2022
This repo contains implementation of different architectures for emotion recognition in conversations.

Emotion Recognition in Conversations Updates 🔥 🔥 🔥 Date Announcements 03/08/2021 🎆 🎆 We have released a new dataset M2H2: A Multimodal Multiparty

Deep Cognition and Language Research (DeCLaRe) Lab 1k Dec 30, 2022
This repository contain code on Novelty-Driven Binary Particle Swarm Optimisation for Truss Optimisation Problems.

This repository contain code on Novelty-Driven Binary Particle Swarm Optimisation for Truss Optimisation Problems. The main directory include the code

0 Dec 23, 2021
Pyramid addon for OpenAPI3 validation of requests and responses.

Validate Pyramid views against an OpenAPI 3.0 document Peace of Mind The reason this package exists is to give you peace of mind when providing a REST

Pylons Project 79 Dec 30, 2022
Development kit for MIT Scene Parsing Benchmark

Development Kit for MIT Scene Parsing Benchmark [NEW!] Our PyTorch implementation is released in the following repository: https://github.com/hangzhao

MIT CSAIL Computer Vision 424 Dec 01, 2022
Convert game ISO and archives to CD CHD for emulation on Linux.

tochd Convert game ISO and archives to CD CHD for emulation. Author: Tuncay D. Source: https://github.com/thingsiplay/tochd Releases: https://github.c

Tuncay 20 Jan 02, 2023
[MedIA2021]MIDeepSeg: Minimally Interactive Segmentation of Unseen Objects from Medical Images Using Deep Learning

MIDeepSeg: Minimally Interactive Segmentation of Unseen Objects from Medical Images Using Deep Learning [MedIA or Arxiv] and [Demo] This repository pr

Healthcare Intelligence Laboratory 92 Dec 08, 2022
Code for technical report "An Improved Baseline for Sentence-level Relation Extraction".

RE_improved_baseline Code for technical report "An Improved Baseline for Sentence-level Relation Extraction". Requirements torch = 1.8.1 transformers

Wenxuan Zhou 74 Nov 29, 2022
A Tensorflow implementation of BicycleGAN.

BicycleGAN implementation in Tensorflow As part of the implementation series of Joseph Lim's group at USC, our motivation is to accelerate (or sometim

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 97 Dec 02, 2022