MonoScene: Monocular 3D Semantic Scene Completion

Overview

MonoScene: Monocular 3D Semantic Scene Completion

MonoScene: Monocular 3D Semantic Scene Completion] [arXiv + supp] | [Project page]
Anh-Quan Cao, Raoul de Charette
Inria, Paris, France

If you find this work useful, please cite our paper:

@misc{cao2021monoscene,
      title={MonoScene: Monocular 3D Semantic Scene Completion}, 
      author={Anh-Quan Cao and Raoul de Charette},
      year={2021},
      eprint={2112.00726},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Code and models will be released soon. Please watch this repo for updates.

Demo

SemanticKITTI KITTI-360
(Trained on SemanticKITTI)

NYUv2

Comments
  • TypeError: 'int' object is not subscriptable

    TypeError: 'int' object is not subscriptable

    (monoscene) [email protected]:~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=2 batch_size=2 ^[[Dexp_kitti_1_FrusSize_8_nRelations4_WD0.0001_lr0.0001_CEssc_geoScalLoss_semScalLoss_fpLoss_CERel_3DCRP_Proj_2_4_8 n_relations (32, 32, 4) Traceback (most recent call last): File "monoscene/scripts/train_monoscene.py", line 118, in main class_weights=class_weights, File "/home/ruidong/workplace/MonoScene/monoscene/models/monoscene.py", line 80, in init context_prior=context_prior, File "/home/ruidong/workplace/MonoScene/monoscene/models/unet3d_kitti.py", line 62, in init self.feature * 4, self.feature * 4, size_l3, bn_momentum=bn_momentum File "/home/ruidong/workplace/MonoScene/monoscene/models/CRP3D.py", line 15, in init self.flatten_size = size[0] * size[1] * size[2] TypeError: 'int' object is not subscriptable

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

    opened by DipDipPotatoChips 21
  • Questions about cross-entropy loss

    Questions about cross-entropy loss

    Dear authors, thanks for your great works! In your paper, you say that "the losses are computed only where y is defined". I wonder if this means you do not add supervision on non-occupied voxels and only use multi-class classification loss on occupied voxels ? If this holds true, why the model can identify which voxels are occupied ?

    opened by weiyithu 13
  • about test

    about test

    FileNotFoundError: [Errno 2] No such file or directory: '/home/ruidong/workplace/MonoScene/trained_models/monoscene_kitti.ckpt'

    the last printing of trainning is: Epoch 29: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2325/2325 [1:06:52<00:00, 1.73s/it, loss=3.89, v_num=]

    opened by DipDipPotatoChips 13
  • Cuda out of memory

    Cuda out of memory

    Dear author, you said that Use smaller 2D backbone by chaning the basemodel_name and num_features The pretrained model name is here. You can try the efficientnet B5 can reduces the memory, I want to know the B5 weight and the value of num_features?

    opened by lulianLiu 12
  • Pretrained models on other dataset: NuScenes

    Pretrained models on other dataset: NuScenes

    Hi @anhquancao,

    Thanks so much for your paper and your implementation. Do you have your pretrained model on the NuScenes? If yes, could you share it? The reason is that I want to build upon your work on the NuScenes dataset but there exists a large domain gap between the two (SemanticKITTI and NuScenes) so the pretrained on SemanticKITTI works does not well on the NuScenes.

    Thanks!

    opened by ducminhkhoi 11
  • failed to run test

    failed to run test

    When I try to run this script, it crashed without giving any information: python monoscene/scripts/generate_output.py +output_path=$MONOSCENE_OUTPUT dataset=kitti_360 +kitti_360_root=$KITTI_360_ROOT +kitti_360_sequence=2013_05_28_drive_0028_sync n_gpus=1 batch_size=1

    image

    Any suggestion will be much appreciated.

    opened by ChiyuanFeng 9
  • cannot find calib

    cannot find calib

    PS F:\Studying\CY-Workspace\MonoScene-master> python monoscene/scripts/eval_monoscene.py dataset=kitti kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS n_gpus=1 batch_size= 1 GPU available: True, used: True TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs n_relations 4 Using cache found in C:\Users\DELL/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master Loading base model ()...Done. Removing last two layers (global_pool & classifier). Building Encoder-Decoder model..Done. Traceback (most recent call last): File "monoscene/scripts/eval_monoscene.py", line 71, in main data_module.setup() File "F:\anaconda\envs\monoscene\lib\site-packages\pytorch_lightning\core\datamodule.py", line 440, in wrapped_fn fn(*args, **kwargs) File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dm.py", line 34, in setup color_jitter=(0.4, 0.4, 0.4), File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 60, in init os.path.join(self.root, "dataset", "sequences", sequence, "calib.txt") File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 193, in read_calib with open(calib_path, "r") as f: FileNotFoundError: [Errno 2] No such file or directory: 'dataset\sequences\00\calib.txt'

    opened by cyaccpect 9
  • about visualization

    about visualization

    (monoscene) [email protected]:~/workplace/MonoScene$ python monoscene/scripts/visualization/kitti_vis_pred.py +file=/home/ruidong/workplace/MonoScene/outputs/kitti/08/000000.pkl +dataset=kitt monoscene/scripts/visualization/kitti_vis_pred.py:23: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations coords_grid = coords_grid.astype(np.float) Traceback (most recent call last): File "monoscene/scripts/visualization/kitti_vis_pred.py", line 196, in main d=7, File "monoscene/scripts/visualization/kitti_vis_pred.py", line 75, in draw grid_coords = np.vstack([grid_coords.T, voxels.reshape(-1)]).T AttributeError: 'tuple' object has no attribute 'T'

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

    opened by DipDipPotatoChips 9
  • Porting the work of this paper to a new dataset

    Porting the work of this paper to a new dataset

    Hello author, first of all thank you for your great work. I want to directly apply your work to the nuscenes dataset, is it possible? Does the nuscenes dataset need point cloud data to assist in generating voxel data?

    opened by yukaizhou 8
  • Can you help me in another paper?

    Can you help me in another paper?

    Hello! Last year, when you reproduced the code SISC(https://github.com/OPEN-AIR-SUN/SISC), you found a bug and solve it! Now, I get the same problem too,can you tell me how to solve it ! Thank you very much!

    opened by WkangLiu 8
  • ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

    ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

    there is something wrong with my machine and I reinstall my ubuntu. I re-gitclone the code and just keep the data.but when I follow the readme to do installation,it print:

    (monoscene) [email protected]:~/workplace/MonoScene$ pip install -e ./ Obtaining file:///home/potato/workplace/MonoScene Installing collected packages: monoscene Running setup.py develop for monoscene Successfully installed monoscene-0.0.0 (monoscene) [email protected]:~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=1 batch_size=1 sem_scal_loss=False Traceback (most recent call last): File "monoscene/scripts/train_monoscene.py", line 1, in from monoscene.data.semantic_kitti.kitti_dm import KittiDataModule File "/home/potato/workplace/MonoScene/monoscene/data/semantic_kitti/kitti_dm.py", line 3, in import pytorch_lightning as pl File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/init.py", line 20, in from pytorch_lightning import metrics # noqa: E402 File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/init.py", line 15, in from pytorch_lightning.metrics.classification import ( # noqa: F401 File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/init.py", line 14, in from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401 File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in from pytorch_lightning.metrics.utils import deprecated_metrics, void File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/utils.py", line 22, in from torchmetrics.utilities.data import get_num_classes as _get_num_classes ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/torchmetrics/utilities/data.py)

    opened by DipDipPotatoChips 7
Releases(v0.1)
Owner
Codes from Computer Vision group of RITS Team, Inria
Official code for On Path Integration of Grid Cells: Group Representation and Isotropic Scaling (NeurIPS 2021)

On Path Integration of Grid Cells: Group Representation and Isotropic Scaling This repo contains the official implementation for the paper On Path Int

Ruiqi Gao 39 Nov 10, 2022
Code for the ICML 2021 paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

ViLT Code for the paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision" Install pip install -r requirements.txt pip

Wonjae Kim 922 Jan 01, 2023
[CVPR 2021] 'Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator'

[CVPR2021] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator Overview This is the entire codebase for the paper

35 Dec 01, 2022
SiT: Self-supervised vIsion Transformer

This repository contains the official PyTorch self-supervised pretraining, finetuning, and evaluation codes for SiT (Self-supervised image Transformer).

Sara Ahmed 275 Dec 28, 2022
Official PyTorch code for CVPR 2020 paper "Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision"

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision https://arxiv.org/abs/2003.00393 Abstract Active learning (AL) aims to min

Denis 29 Nov 21, 2022
Synthesizing Long-Term 3D Human Motion and Interaction in 3D in CVPR2021

Long-term-Motion-in-3D-Scenes This is an implementation of the CVPR'21 paper "Synthesizing Long-Term 3D Human Motion and Interaction in 3D". Please ch

Jiashun Wang 76 Dec 13, 2022
paper list in the area of reinforcenment learning for recommendation systems

paper list in the area of reinforcenment learning for recommendation systems

HenryZhao 23 Jun 09, 2022
Code for NeurIPS 2021 paper "Curriculum Offline Imitation Learning"

README The code is based on the ILswiss. To run the code, use python run_experiment.py --nosrun -e your YAML file -g gpu id Generally, run_experim

ApexRL 12 Mar 19, 2022
An official PyTorch Implementation of Boundary-aware Self-supervised Learning for Video Scene Segmentation (BaSSL)

An official PyTorch Implementation of Boundary-aware Self-supervised Learning for Video Scene Segmentation (BaSSL)

Kakao Brain 72 Dec 28, 2022
DL course co-developed by YSDA, HSE and Skoltech

Deep learning course This repo supplements Deep Learning course taught at YSDA and HSE @fall'21. For previous iteration visit the spring21 branch. Lec

Yandex School of Data Analysis 1.3k Dec 30, 2022
Pretraining Representations For Data-Efficient Reinforcement Learning

Pretraining Representations For Data-Efficient Reinforcement Learning Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Ch

Mila 40 Dec 11, 2022
GyroSPD: Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices

GyroSPD Code for the paper "Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices" accepted at NeurIPS 2021. Re

Federico Lopez 12 Dec 12, 2022
A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon.

PokeGAN A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon. Dataset The model has been trained on dataset that includes 8

19 Jul 26, 2022
[CVPR 2020] GAN Compression: Efficient Architectures for Interactive Conditional GANs

GAN Compression project | paper | videos | slides [NEW!] GAN Compression is accepted by T-PAMI! We released our T-PAMI version in the arXiv v4! [NEW!]

MIT HAN Lab 1k Jan 07, 2023
PURE: End-to-End Relation Extraction

PURE: End-to-End Relation Extraction This repository contains (PyTorch) code and pre-trained models for PURE (the Princeton University Relation Extrac

Princeton Natural Language Processing 657 Jan 09, 2023
This code is an implementation for Singing TTS.

MLP Singer This code is an implementation for Singing TTS. The algorithm is based on the following papers: Tae, J., Kim, H., & Lee, Y. (2021). MLP Sin

Heejo You 22 Dec 23, 2022
GRF: Learning a General Radiance Field for 3D Representation and Rendering

GRF: Learning a General Radiance Field for 3D Representation and Rendering [Paper] [Video] GRF: Learning a General Radiance Field for 3D Representatio

Alex Trevithick 243 Dec 29, 2022
Convnet transfer - Code for paper How transferable are features in deep neural networks?

How transferable are features in deep neural networks? This repository contains source code necessary to reproduce the results presented in the follow

Jason Yosinski 143 Sep 13, 2022
Restricted Boltzmann Machines in Python.

How to Use First, initialize an RBM with the desired number of visible and hidden units. rbm = RBM(num_visible = 6, num_hidden = 2) Next, train the m

Edwin Chen 928 Dec 30, 2022
PyContinual (An Easy and Extendible Framework for Continual Learning)

PyContinual (An Easy and Extendible Framework for Continual Learning) Easy to Use You can sumply change the baseline, backbone and task, and then read

Zixuan Ke 176 Jan 05, 2023