PN-Net: a neural field-based framework for depth estimation from single-view RGB images.

Last update: Oct 02, 2021

Overview

PN-Net

We present a neural field-based framework for depth estimation from single-view RGB images. Rather than representing a 2D depth map as a single-channel image, we define it as the iso-surface of a scalar field in an implicit space, which we call the Pseudo 3D Space. We convert the 3D Depth Field into a 2D depth image with an efficient, differentiable sphere tracing rendering algorithm. We introduce two further innovations. First, a Field Warping technique recasts depth field estimation as a classification problem, which is far more efficient to learn than regressing a signed distance function (SDF). Second, we derive a 3D Pseudo Normal from the 2D depth map; it is closely related to the actual 3D surface normal and can be computed from the depth field's implicit representation with an uncalibrated camera. Experiments validate the method's performance. Our Pseudo 3D Space simplifies current implicit field learning and offers a consistent framework for advancing shape reconstruction from multiple cues.
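The two ideas in the overview, rendering depth by sphere tracing a scalar field and reading a normal off the depth surface, can be illustrated with a small NumPy sketch. This is not the repository's implementation. The function names (sphere_trace_depth, pseudo_normals, toy_field), the toy planar scene, and the choice of normal proportional to (-dD/du, -dD/dv, 1) in pixel coordinates are all assumptions made for illustration; the paper's exact definitions may differ.

```python
import numpy as np

def sphere_trace_depth(field, height, width, n_steps=64, eps=1e-4):
    """Render a depth image by marching one ray per pixel along the depth axis.

    `field(u, v, d)` is assumed to return a conservative signed distance to
    the depth surface: positive in front of it and zero on it.
    """
    v, u = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    d = np.zeros((height, width))
    for _ in range(n_steps):
        f = field(u, v, d)
        if np.all(np.abs(f) < eps):
            break  # every ray is within eps of the iso-surface
        d = d + f  # advance each ray by its current field value
    return d

def pseudo_normals(depth):
    """Unit normals of the surface d = D(u, v) in (u, v, d) pixel space.

    Uses finite differences only, so no camera intrinsics are needed
    (an assumed reading of "pseudo normal", not the paper's definition).
    """
    dv, du = np.gradient(depth)  # derivatives along rows (v) and columns (u)
    n = np.stack([-du, -dv, np.ones_like(depth)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

# Hypothetical toy scene: a tilted plane as the ground-truth depth.
height, width = 4, 6
vv, uu = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
true_depth = 1.0 + 0.01 * uu + 0.02 * vv

def toy_field(u, v, d):
    # Half the distance to the surface along the ray: a deliberately
    # conservative bound, so the tracer needs several steps to converge.
    return 0.5 * (true_depth - d)

depth = sphere_trace_depth(toy_field, height, width)
normals = pseudo_normals(depth)
```

In this toy setup the rendered depth approaches the plane geometrically, halving the remaining error at each step. Every operation is differentiable with respect to the field values, which is the property that lets a renderer of this kind sit inside a training loop.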

Set up dataset path

Suppose your dataset is placed like this:

/absolute_path/bts_nyu_data/
    sync/
        ...
    official_splits/
        train/
            ...
        test/
            ...

Add the following line to ~/.bashrc:

export PNNET_NYU2_DATASET=/absolute_path/bts_nyu_data/
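Before training, it can help to confirm that the variable is visible to Python and that the directory matches the layout shown above. The following is a minimal sketch, not part of the repository; the helper name missing_subdirs and the EXPECTED_SUBDIRS tuple are hypothetical, and the subdirectory list is taken only from the tree in this README.

```python
import os
from pathlib import Path

# Subdirectories taken from the layout described in this README.
EXPECTED_SUBDIRS = ("sync", "official_splits/train", "official_splits/test")

def missing_subdirs(root):
    """Return the expected subdirectories that do not exist under `root`."""
    root = Path(root)
    return [s for s in EXPECTED_SUBDIRS if not (root / s).is_dir()]

if __name__ == "__main__":
    root = os.environ.get("PNNET_NYU2_DATASET")
    if root is None:
        print("PNNET_NYU2_DATASET is not set; open a new shell or source ~/.bashrc")
    else:
        missing = missing_subdirs(root)
        print("layout OK" if not missing else f"missing under {root}: {missing}")
```

Remember that an export added to ~/.bashrc only takes effect in shells started afterwards, or after running source ~/.bashrc in the current one.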

Train with

python train_bts_nyu_nd3.py -c configs/train_bts_nyu_nd3_tb_vis.json

This configuration includes the pseudo normal and total bending losses.
