Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis (CVPR2022)

Last update: Dec 10, 2022

Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis.
Xuanmeng Zhang, Zhedong Zheng, Daiheng Gao, Bang Zhang, Pan Pan, Yi Yang
CVPR 2022.

News:

[2022-04-30] We release the paper, video, code, and checkpoints.

Abstract

3D-aware image synthesis aims to generate images of objects from multiple views by learning a 3D representation. However, one key challenge remains: existing approaches lack geometry constraints and therefore usually fail to generate multi-view consistent images. To address this challenge, we propose Multi-View Consistent Generative Adversarial Networks (MVCGAN) for high-quality 3D-aware image synthesis with geometry constraints. By leveraging the underlying 3D geometry information of generated images, i.e., depth and camera transformation matrix, we explicitly establish stereo correspondence between views to perform multi-view joint optimization. In particular, we enforce photometric consistency between pairs of views and integrate a stereo mixup mechanism into the training process, encouraging the model to reason about the correct 3D shape. In addition, we design a two-stage training strategy with feature-level multi-view joint optimization to improve image quality. Extensive experiments on three datasets demonstrate that MVCGAN achieves state-of-the-art performance for 3D-aware image synthesis.

Please refer to the supplementary video for more visualization results.
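
The stereo correspondence mentioned in the abstract can be illustrated with a small reprojection sketch. This is not the repository's code: it assumes a simple pinhole camera shared by both views, and the names `reproject`, `photometric_l1`, `focal`, `cx`, `cy`, `rotation`, and `translation` are hypothetical. Given a pixel in the primary view, its depth, and the relative camera transformation, it computes where the same 3D point lands in the auxiliary view. Comparing the colors at the two locations is the basis of a photometric consistency term; the plain L1 difference used here is a stand-in, not necessarily the loss used in the paper.

```python
def reproject(u, v, depth, focal, cx, cy, rotation, translation):
    """Map pixel (u, v) with known depth from the primary view to the auxiliary view.

    Assumes a pinhole camera shared by both views (hypothetical simplification).
    rotation is a 3x3 nested list and translation a length-3 list describing
    the primary-to-auxiliary camera transformation.
    """
    # Back-project the pixel to a 3D point in the primary camera frame.
    x = (u - cx) / focal * depth
    y = (v - cy) / focal * depth
    point = [x, y, depth]

    # Apply the rigid transformation into the auxiliary camera frame.
    moved = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if moved[2] <= 0:
        return None  # the point falls behind the auxiliary camera

    # Project back onto the auxiliary image plane.
    return (focal * moved[0] / moved[2] + cx, focal * moved[1] / moved[2] + cy)


def photometric_l1(color_a, color_b):
    """Mean absolute difference between two RGB triples (plain L1 stand-in)."""
    return sum(abs(a - b) for a, b in zip(color_a, color_b)) / len(color_a)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# With no camera motion, a pixel maps onto itself.
same = reproject(40.0, 30.0, 2.0, 100.0, 32.0, 32.0, IDENTITY, [0.0, 0.0, 0.0])

# A sideways shift of 0.1 at depth 2.0 moves the pixel by focal * 0.1 / 2.0 = 5.
shifted = reproject(40.0, 30.0, 2.0, 100.0, 32.0, 32.0, IDENTITY, [0.1, 0.0, 0.0])
```

In a full pipeline this mapping would be applied to every pixel of a rendered depth map and the auxiliary image would be sampled at the resulting coordinates.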

Getting Started

Installation

Install the dependencies with:

pip install -r requirements.txt

Datasets

Download CelebAHQ
Download FFHQ
Download AFHQv2

Pretrained Checkpoints

Dataset	Resolution	Download
CelebAHQ	512	Google Drive
FFHQ	512	Google Drive
AFHQ	512	Google Drive

Training

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python main.py --output_dir celebahq_exp --port 12361 --curriculum CelebAHQ

Before training, edit the configuration file curriculums.py so that it points to your own dataset path.
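
A mistyped dataset path may only surface once the multi-GPU job is already running, so it can help to check the path first. The sketch below assumes that each curriculum in curriculums.py is a dictionary holding a `dataset_path` glob pattern, as in pi-gan. That layout, the example path, and the helper `count_dataset_files` are assumptions rather than confirmed details of this repository.

```python
import glob

# Hypothetical curriculum entry; the key name and the path are assumptions.
CelebAHQ = {
    "dataset_path": "/path/to/celebahq/*.jpg",
}


def count_dataset_files(curriculum):
    """Return how many files match the curriculum's dataset glob pattern."""
    return len(glob.glob(curriculum["dataset_path"]))


if __name__ == "__main__":
    found = count_dataset_files(CelebAHQ)
    print(f"{found} image(s) match {CelebAHQ['dataset_path']}")
```

If the count is zero, the pattern most likely does not match the location or file extension of the downloaded images.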

Rendering

CUDA_VISIBLE_DEVICES=0 python render_multiview_image.py --path ${CHECKPOINT_PATH} --output_dir render_dir --output_size 512 --curriculum FFHQ
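
To render several checkpoints in one go, the command above can be wrapped in a short script. This is a convenience sketch, not part of the repository: `build_render_command` and `render_all` are hypothetical helpers, the per-curriculum output directory naming is an arbitrary choice, and only the flags shown in the command above are used.

```python
import os
import subprocess


def build_render_command(checkpoint_path, curriculum, output_dir="render_dir", output_size=512):
    """Assemble the argv list for render_multiview_image.py using the README's flags."""
    return [
        "python", "render_multiview_image.py",
        "--path", checkpoint_path,
        "--output_dir", output_dir,
        "--output_size", str(output_size),
        "--curriculum", curriculum,
    ]


def render_all(jobs, gpu="0"):
    """Run one render per (checkpoint_path, curriculum) pair on a single GPU."""
    # Restrict the child processes to the chosen GPU, as in the shell command.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for checkpoint_path, curriculum in jobs:
        command = build_render_command(
            checkpoint_path, curriculum, output_dir=f"render_{curriculum}"
        )
        # check=True stops the loop as soon as one render fails.
        subprocess.run(command, env=env, check=True)
```

A call such as `render_all([("ffhq.pth", "FFHQ"), ("afhq.pth", "AFHQ")])` would then render both models in sequence, where the checkpoint file names are placeholders for the downloaded files.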

Acknowledgment

Our implementation of MVCGAN is partly based on the following codebases, and we thank their authors for the excellent work: pi-gan, pytorch_GAN_zoo.

Citation

If you find our code or paper useful, please consider citing:

@inproceedings{zhang2022multiview,
  title={Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis},
  author={Zhang, Xuanmeng and Zheng, Zhedong and Gao, Daiheng and Zhang, Bang and Pan, Pan and Yang, Yi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}
