Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

Overview

NeX: Real-time View Synthesis with Neural Basis Expansion

Project Page | Video | Paper | COLAB | Shiny Dataset

Open NeX in Colab

NeX

We present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce NeXt-level view-dependent effects---in real time. Unlike traditional MPI that uses a set of simple RGBα planes, our technique models view-dependent effects by instead parameterizing each pixel as a linear combination of basis functions learned from a neural network. Moreover, we propose a hybrid implicit-explicit modeling strategy that improves upon fine detail and produces state-of-the-art results. Our method is evaluated on benchmark forward-facing datasets as well as our newly-introduced dataset designed to test the limit of view-dependent modeling with significantly more challenging effects such as the rainbow reflections on a CD. Our method achieves the best overall scores across all major metrics on these datasets with more than 1000× faster rendering time than the state of the art.

Table of contents



Getting started

conda env create -f environment.yml
./download_demo_data.sh
conda activate nex
python train.py -scene data/crest_demo -model_dir crest -http
tensorboard --logdir runs/

Installation

We provide environment.yml to help you setup a conda environment.

conda env create -f environment.yml

Dataset

Shiny dataset

Download: Shiny dataset.

We provide 2 directories named shiny and shiny_extended.

  • shiny contains benchmark scenes used to report the scores in our paper.
  • shiny_extended contains additional challenging scenes used on our website project page and video

NeRF's real forward-facing dataset

Download: Undistorted front facing dataset

For real forward-facing dataset, NeRF is trained with the raw images, which may contain lens distortion. But we use the undistorted images provided by COLMAP.

However, you can try running other scenes from Local lightfield fusion (Eg. airplant) without any changes in the dataset files. In this case, the images are not automatically undistorted.

Deepview's spaces dataset

Download: Modified spaces dataset

We slightly modified the file structure of Spaces dataset in order to determine the plane placement and split train/test sets.

Using your own images.

Running NeX on your own images. You need to install COLMAP on your machine.

Then, put your images into a directory following this structure

<scene_name>
|-- images
     | -- image_name1.jpg
     | -- image_name2.jpg
     ...

The training code will automatically prepare a scene for you. You may have to tune planes.txt to get better reconstruction (see dataset explaination)

Training

Run with the paper's config

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http

This implementation uses scikit-image to resize images during training by default. The results and scores in the paper are generated using OpenCV's resize function. If you want the same behavior, please add -cv2resize argument.

Note that this code is tested on an Nvidia V100 32GB and 4x RTX 2080Ti GPU.

For a GPU/GPUs with less memory (e.g., a single RTX 2080Ti), you can run using the following command:

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http -layers 12 -sublayers 6 -hidden 256

Note that when your GPU runs ouut of memeory, you can try reducing the number of layers, sublayers, and sampled rays.

Rendering

To generate a WebGL viewer and a video result.

python train.py -scene ${scene} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -predict -http

Video rendering

To generate a video that matches the real forward-facing rendering path, add -nice_llff argument, or -nice_shiny for shiny dataset

Citation

@inproceedings{Wizadwongsa2021NeX,
    author = {Wizadwongsa, Suttisak and Phongthawee, Pakkapon and Yenphraphai, Jiraphon and Suwajanakorn, Supasorn},
    title = {NeX: Real-time View Synthesis with Neural Basis Expansion},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year = {2021},
}

Visit us 🦉

Vision & Learning Laboratory VISTEC - Vidyasirimedhi Institute of Science and Technology

Sentello is python script that simulates the anti-evasion and anti-analysis techniques used by malware.

sentello Sentello is a python script that simulates the anti-evasion and anti-analysis techniques used by malware. For techniques that are difficult t

Malwation 62 Oct 02, 2022
Lingtrain Aligner — ML powered library for the accurate texts alignment.

Lingtrain Aligner ML powered library for the accurate texts alignment in different languages. Purpose Main purpose of this alignment tool is to build

Sergei Averkiev 76 Dec 14, 2022
Stack based programming language that compiles to x86_64 assembly or can alternatively be interpreted in Python

lang lang is a simple stack based programming language written in Python. It can

Christoffer Aakre 1 May 30, 2022
In this project, we compared Spanish BERT and Multilingual BERT in the Sentiment Analysis task.

Applying BERT Fine Tuning to Sentiment Classification on Amazon Reviews Abstract Sentiment analysis has made great progress in recent years, due to th

Alexander Leonardo Lique Lamas 5 Jan 03, 2022
Understand Text Summarization and create your own summarizer in python

Automatic summarization is the process of shortening a text document with software, in order to create a summary with the major points of the original document. Technologies that can make a coherent

Sreekanth M 1 Oct 18, 2022
Ongoing research training transformer language models at scale, including: BERT & GPT-2

What is this fork of Megatron-LM and Megatron-DeepSpeed This is a detached fork of https://github.com/microsoft/Megatron-DeepSpeed, which in itself is

BigScience Workshop 316 Jan 03, 2023
Repository for the paper: VoiceMe: Personalized voice generation in TTS

🗣 VoiceMe: Personalized voice generation in TTS Abstract Novel text-to-speech systems can generate entirely new voices that were not seen during trai

Pol van Rijn 80 Dec 29, 2022
A tool helps build a talk preview image by combining the given background image and talk event description

talk-preview-img-builder A tool helps build a talk preview image by combining the given background image and talk event description Installation and U

PyCon Taiwan 4 Aug 20, 2022
超轻量级bert的pytorch版本,大量中文注释,容易修改结构,持续更新

bert4pytorch 2021年8月27更新: 感谢大家的star,最近有小伙伴反映了一些小的bug,我也注意到了,奈何这个月工作上实在太忙,更新不及时,大约会在9月中旬集中更新一个只需要pip一下就完全可用的版本,然后会新添加一些关键注释。 再增加对抗训练的内容,更新一个完整的finetune

muqiu 317 Dec 18, 2022
iBOT: Image BERT Pre-Training with Online Tokenizer

Image BERT Pre-Training with iBOT Official PyTorch implementation and pretrained models for paper iBOT: Image BERT Pre-Training with Online Tokenizer.

Bytedance Inc. 435 Jan 06, 2023
A Python module made to simplify the usage of Text To Speech and Speech Recognition.

Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul

Snm Logic 1 Dec 20, 2021
Galois is an auto code completer for code editors (or any text editor) based on OpenAI GPT-2.

Galois is an auto code completer for code editors (or any text editor) based on OpenAI GPT-2. It is trained (finetuned) on a curated list of approximately 45K Python (~470MB) files gathered from the

Galois Autocompleter 91 Sep 23, 2022
Question answering app is used to answer for a user given question from user given text.

Question answering app is used to answer for a user given question from user given text.It is created using HuggingFace's transformer pipeline and streamlit python packages.

Siva Prakash 3 Apr 05, 2022
A text file containing 479k English words for all your dictionary/word-based projects e.g: auto-completion / autosuggestion

List Of English Words A text file containing over 466k English words. While searching for a list of english words (for an auto-complete tutorial) I fo

dwyl 8.5k Jan 03, 2023
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp

Ryan Spring 114 Nov 04, 2022
Fully featured implementation of Routing Transformer

Routing Transformer A fully featured implementation of Routing Transformer. The paper proposes using k-means to route similar queries / keys into the

Phil Wang 246 Jan 02, 2023
Open-source offline translation library written in Python. Uses OpenNMT for translations

Open source neural machine translation in Python. Designed to be used either as a Python library or desktop application. Uses OpenNMT for translations and PyQt for GUI.

Argos Open Tech 1.6k Jan 01, 2023
Must-read papers on improving efficiency for pre-trained language models.

Must-read papers on improving efficiency for pre-trained language models.

Tobias Lee 89 Jan 03, 2023
Code for EMNLP20 paper: "ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training"

ProphetNet-X This repo provides the code for reproducing the experiments in ProphetNet. In the paper, we propose a new pre-trained language model call

Microsoft 394 Dec 17, 2022
keras implement of transformers for humans

keras implement of transformers for humans

苏剑林(Jianlin Su) 4.8k Jan 03, 2023