Repository for MeshTalk supplemental material and code, to be completed once the (already approved) 3D face captures of 16 subjects that our lab will make publicly available are released.


meshtalk

This repository contains code to run MeshTalk for face animation from audio. If you use MeshTalk, please cite

@inproceedings{richard2021meshtalk,
    author    = {Richard, Alexander and Zollh\"ofer, Michael and Wen, Yandong and de la Torre, Fernando and Sheikh, Yaser},
    title     = {MeshTalk: 3D Face Animation From Speech Using Cross-Modality Disentanglement},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {1173-1182}
}

Supplemental Material

Watch the video

Running MeshTalk

Dependencies

ffmpeg
numpy
torch         (tested with v1.10.0)
pytorch3d     (tested with v0.4.0)
torchaudio    (tested with v0.10.0)
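
A quick way to confirm that an environment matches these tested versions (a sketch; it assumes all four packages are already importable):

    import numpy, torch, torchaudio, pytorch3d

    # Tested combination from this README: torch 1.10.0, pytorch3d 0.4.0, torchaudio 0.10.0.
    print(torch.__version__, pytorch3d.__version__, torchaudio.__version__, numpy.__version__)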

Animating a Face Mesh from Audio

Download the pretrained models and unzip them. Make sure your python path contains the root directory (export PYTHONPATH=<your_meshtalk_root_directory>).

Then, run

python animate_face.py --model_dir <your_pretrained_model_dir> --audio_file <your_speech_snippet.wav> --output <your_output_file.mp4>

See a description of command line arguments via python animate_face.py --help. We provide a neutral face template mesh in assets/face_template.obj. Note that the rendered results look slightly different from those in the paper and supplemental video because we use a different (open source) rendering engine in this repository.
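
If you want to animate a mesh other than the provided template (a recurring question in the issues below), first confirm that it matches the 6172-vertex topology the pretrained models expect; animate_face.py reshapes the input to [1, 1, 6172, 3], as the tracebacks below show. A minimal sanity check using pytorch3d:

    from pytorch3d.io import load_obj

    # Load the neutral face template shipped with this repository.
    verts, faces, aux = load_obj("assets/face_template.obj")

    # The pretrained models are hard-wired to a 6172-vertex topology.
    assert verts.shape == (6172, 3), f"expected 6172 vertices, got {tuple(verts.shape)}"
    print("template OK:", tuple(verts.shape))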

Training your own MeshTalk version

We are in the process of releasing high-quality 3D face captures of 16 subjects (a subset of the dataset used in this paper). We will link to the dataset here once it is available.

License

The code and dataset are released under CC-NC 4.0 International license.

Comments
  • Can I change the OBJ model?


    If I want to change to a different OBJ face, what are the requirements? Or is there a template for the face you use, from which many faces could be created? I read other issues and learned that not every OBJ file can be used. Does the number of vertices of the mesh need to be the same? Does the face size need to be the same?

    This is a cool project.

    opened by ALIENMINT 6
  • asset files creation


    Hi, I ran the custom audio expressions on your neutral mesh object and it worked well. I wanted to run the audio on my own custom model object files, and I have created the object files for my person model. How do I generate the asset files for them: face_mean.npy, face_std, forehead_mask and neck mask? Are these files generated per object file, or am I supposed to resize my object file to the 6172-vertex topology in order to use the existing asset files? Thank you for your help in advance.
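
    One plausible way to produce face_mean.npy and face_std.npy for a custom topology, assuming they are simply the per-vertex mean and standard deviation over a set of tracked meshes (an assumption; the repository does not document how the shipped assets were computed, and the input file name below is a placeholder):

        import numpy as np

        # meshes: [N, V, 3] array stacking N tracked meshes of your V-vertex topology.
        meshes = np.load("my_tracked_meshes.npy")  # hypothetical input you provide

        face_mean = meshes.mean(axis=0)       # [V, 3] per-vertex mean
        face_std = meshes.std(axis=0) + 1e-8  # [V, 3] per-vertex std, avoiding zeros

        np.save("face_mean.npy", face_mean)
        np.save("face_std.npy", face_std)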

    opened by programmeddeath1 6
  • new obj


    I have a new OBJ file with 6172 points derived from the default OBJ file. Q1: What is the meaning of the files face_mean and face_std, and of the two smoothing .txt files? Are they the average face and an exaggerated face? Q2: How do I create the face_mean, face_std, and smoothing .txt files?

    opened by luoww1992 5
  • Training parameters


    Hello,

    I am trying to train MeshTalk on the VOCA dataset; however, the loss value explodes if I use a learning rate of 1e-4 or higher, and keeps oscillating around 0.2 if I use a lower learning rate (which does not lead to realistic results). I was wondering what training parameters were used in the paper?

    I am using the following parameters:

    • no. of frames T = 128
    • optimizer SGD with lr=9e-5 (at the moment), momentum=0.9, nesterov=True
    • M_upper = 5 and M_lower = 5
    • batch_size = 16
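
    For reference, the configuration listed above corresponds to roughly the following optimizer setup (a sketch of the poster's stated parameters, not the paper's confirmed settings; the model is a stand-in):

        import torch
        import torch.nn as nn

        model = nn.Linear(128, 128)  # placeholder standing in for the MeshTalk model

        # SGD with lr=9e-5, momentum=0.9, Nesterov, as listed above.
        optimizer = torch.optim.SGD(model.parameters(), lr=9e-5, momentum=0.9, nesterov=True)

        # One common remedy for exploding losses (a general suggestion, not from the paper):
        # clip the gradient norm between loss.backward() and optimizer.step(), e.g.
        # torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)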

    Thanks for any help!

    opened by UttaranB127 5
  • mesh faces missing for multiface


    The mesh (.obj) provided by multiface has almost 2000 fewer faces than the mesh (.obj) used by meshtalk. I wonder how to cope with this. Should I do some remeshing work to connect the isolated vertices together?

    opened by songtoy 4
  • How to use a different OBJ model?


    Incredible work, thanks! I have a question on using a different OBJ model. I tried to use an OBJ model file created by DECA, but I get an error:

    (meshtalk) [email protected]:/data/cx/GANs/meshtalk$ python animate_face.py --model_dir weights/pretrained_models --audio_file test.wav --output outputs --face_template myasset/mzd.obj
    /home/ubuntu/.local/lib/python3.8/site-packages/torchaudio/backend/utils.py:53: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to https://github.com/pytorch/audio/issues/903 for the detail.
      warnings.warn(
    load assets...
    load models...
    Loaded: weights/pretrained_models/vertex_unet.pkl
    Loaded: weights/pretrained_models/context_model.pkl
    Loaded: weights/pretrained_models/encoder.pkl
    animate face mesh...
    /home/ubuntu/.local/lib/python3.8/site-packages/torch/functional.py:515: UserWarning: stft will require the return_complex parameter be explicitly specified in a future PyTorch release. Use return_complex=False to preserve the current behavior or return_complex=True to return a complex output. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:653.)
      return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore
    /home/ubuntu/.local/lib/python3.8/site-packages/torch/functional.py:515: UserWarning: The function torch.rfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.fft or torch.fft.rfft. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:590.)
      return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore
    Traceback (most recent call last):
      File "animate_face.py", line 93, in <module>
        geom = template_verts.cuda().view(1, 1, 6172, 3).expand(-1, T, -1, -1).contiguous()
    RuntimeError: shape '[1, 1, 6172, 3]' is invalid for input of size 15069

    What should I do if I want to animate a different OBJ file?
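
    For what it's worth, the failing size is informative: 15069 = 5023 × 3, and 5023 is the vertex count of the FLAME topology that DECA outputs, while the pretrained models expect exactly 6172 vertices. A quick check before running animate_face.py (the path is taken from the report above):

        from pytorch3d.io import load_obj

        verts, _, _ = load_obj("myasset/mzd.obj")
        # (5023, 3) here, i.e. the FLAME/DECA topology -> incompatible with the
        # hard-wired [1, 1, 6172, 3] reshape in animate_face.py.
        print(tuple(verts.shape))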

    opened by AdamMayor2018 4
  • Different topology from multiface dataset?


    I find that the number of vertices in your provided template object differs from what I downloaded from the multiface dataset. In particular, the details of the mouth are quite different. Would you please share more information about the experiments?

    opened by chenerg 3
  • Context model - how to train?


    Hello,

    How do I train the autoregressive model for inference? In the forward function, what would the first expression_one_hot tensor be? I understand subsequent inputs would be the labels output of the previous timestep.

    def forward(self, expression_one_hot: th.Tensor, audio_code: th.Tensor):
        x = self.embedding(expression_one_hot)

        for layer in self.context_layers:
            x = layer(x, audio_code)
            x = F.leaky_relu(x, 0.2)

        logits = self.logits(x)
        logprobs = F.log_softmax(logits, dim=-1)
        probs = F.softmax(logprobs, dim=-1)
        labels = th.argmax(logprobs, dim=-1)

        return {"logprobs": logprobs, "probs": probs, "labels": labels}
    

    Thanks

    opened by karthik-mohankumar 3
  • Do you have any uv texture mapping files?


    Hi. I am very impressed with your wonderful research. Thank you so much for sharing the great results. I want to render a texture to the output generated by this model. Can I get a uv texture mapping file that matches the output?

    opened by shovelingpig 3
  • Audio features are different from your paper statement


    Hi, I found that the audio preprocessing uses a simple transformation in your code (load_audio & audio_chunking). But this differs from the statement in your paper, which says: "Our audio data is recorded at 16kHz. For each tracked mesh, we compute the Mel spectrogram of a 600ms audio snippet starting 500ms before and ending 100ms after the respective visual frame. We extract 80-dimensional Mel spectral features every 10ms, using 1,024 frequency bins and a window size of 800 for the underlying Fourier transform."

    I didn't find any Mel spectral calculation in your code. Why are they different? Is the current version better than Mel spectral features?
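
    For reference, the features the paper describes can be computed with torchaudio roughly as follows (a sketch of one plausible mapping of the quoted numbers onto torchaudio.transforms.MelSpectrogram; the audio file name is a placeholder):

        import torchaudio

        # From the paper excerpt: 16 kHz audio, 80 Mel bins, 1024-point FFT,
        # window size 800, one frame every 10 ms (= 160 samples at 16 kHz).
        mel = torchaudio.transforms.MelSpectrogram(
            sample_rate=16000,
            n_fft=1024,
            win_length=800,
            hop_length=160,
            n_mels=80,
        )

        waveform, sample_rate = torchaudio.load("your_speech_snippet.wav")
        features = mel(waveform)  # [channels, 80, num_frames]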

    opened by kjhgfdsaas 3
  • Build pytorch3d 0.4.0 failed with torch1.10


    I tried to build pytorch3d 0.4.0 from source with torch 1.10, the same versions as in the README, but it always fails. The log is below:

    /home/local/gcc-5.3.0/bin/gcc -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -I/home/Projects/github_projects/pytorch3d/pytorch3d/csrc -I/home/software_packages/cub-1.10.0 -I/home/anaconda3/envs/torch1.10/lib/python3.7/site-packages/torch/include -I/home/anaconda3/envs/torch1.10/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/anaconda3/envs/torch1.10/lib/python3.7/site-packages/torch/include/TH -I/home/anaconda3/envs/torch1.10/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda-10.2/include -I/home/anaconda3/envs/torch1.10/include/python3.7m -c /home/Projects/github_projects/pytorch3d/pytorch3d/csrc/rasterize_meshes/rasterize_meshes_cpu.cpp -o build/temp.linux-x86_64-3.7/home/Projects/github_projects/pytorch3d/pytorch3d/csrc/rasterize_meshes/rasterize_meshes_cpu.o -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
      cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
      /home/Projects/github_projects/pytorch3d/pytorch3d/csrc/rasterize_meshes/rasterize_meshes_cpu.cpp: In function ‘std::tuple<at::Tensor, at::Tensor, at::Tensor, at::Tensor> RasterizeMeshesNaiveCpu(const at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, std::tuple<int, int>, float, int, bool, bool, bool)’:
      /home/Projects/github_projects/pytorch3d/pytorch3d/csrc/rasterize_meshes/rasterize_meshes_cpu.cpp:294:28: error: converting to ‘std::tuple<float, int, float, float, float, float>’ from initializer list would use explicit constructor ‘constexpr std::tuple< <template-parameter-1-1> >::tuple(_UElements&& ...) [with _UElements = {const float&, int&, const float&, const float&, const float&, const float&}; <template-parameter-2-2> = void; _Elements = {float, int, float, float, float, float}]’
                     q[idx_top_k] = {
                                  ^
      error: command '/home/local/gcc-5.3.0/bin/gcc' failed with exit status 1
      Building wheel for pytorch3d (setup.py) ... error
      ERROR: Failed building wheel for pytorch3d
    

    Does pytorch3d 0.4.0 really support torch 1.10? I see the requirement is torch < 1.7.1 at the pytorch3d 0.4.0 URL and torch < 1.9.1 at the pytorch3d main URL.

    My environment:

    • centos 7
    • gcc 5.3.0
    • cuda 10.2
    • cub 1.10
    • python 3.7 (conda environment)
    • torch1.10
    • pytorch3d 0.4.0
    opened by wikiwen 3
  • Which data was used for the pre-trained model


    Hi! The paper mentions the following:

    We release a subset of 16 subjects of this dataset and our model using only these subjects as a baseline to compare against

    Since multiface was released with only 13 identities, can you please confirm what was used for the released pre-trained model? (e.g. the 13 identities in multiface? Those plus 3 other identities? Or another set of 16 identities?)

    Thank you!

    opened by luizgh 0
Releases: pretrained_models_v1.0
Owner: Meta Research