Finetune SSL models for MOS prediction

Overview

Finetune SSL models for MOS prediction

This is code for our paper under review for ICASSP 2022:

"Generalization Ability of MOS Prediction Networks" Erica Cooper, Wen-Chin Huang, Tomoki Toda, Junichi Yamagishi https://arxiv.org/abs/2110.02635

Please cite this preprint if you use this code.

Dependencies:

  • Fairseq toolkit: https://github.com/pytorch/fairseq Make sure you can import fairseq in Python.
  • torch, numpy, scipy, torchaudio
  • I have exported my conda environment for this project to environment.yml
  • You also need to download a pretrained wav2vec2 model checkpoint. These can be obtained here: https://github.com/pytorch/fairseq/tree/main/examples/wav2vec Please choose wav2vec_small.pt, w2v_large_lv_fsh_swbd_cv.pt, or xlsr_53_56k.pt.
  • You also need to have a MOS dataset. Datasets for the MOS prediction challenge will be released once the challenge starts. TODO update with a link.

How to use

  • Modify the paths in mos_fairseq.py to point to your own data and SSL checkpoints.
  • Run python mos_fairseq.py to finetune an SSL model on the data.
  • Modify variables in predict.py to point to your favorite checkpoint.
  • run predict.py to run inference using that checkpoint.

Acknowledgments

This study is supported by JST CREST grants JP- MJCR18A6, JPMJCR20D3, and JPMJCR19A3, and by MEXT KAKENHI grants 21K11951 and 21K19808. Thanks to the organizers of the Blizzard Challenge and Voice Conversion Challenge, and to Zhenhua Ling, Zhihang Xie, and Zhizheng Wu for answering our questions about past challenges. Thanks also to the Fairseq team for making their code and models available.

License

BSD 3-Clause License

Copyright (c) 2021, Yamagishi Laboratory, National Institute of Informatics All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  • Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Owner
Yamagishi and Echizen Laboratories, National Institute of Informatics
Yamagishi and Echizen Laboratories, National Institute of Informatics, Japan
Yamagishi and Echizen Laboratories, National Institute of Informatics
Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model

Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model Baris Gecer 1, Binod Bhattarai 1

Baris Gecer 190 Dec 29, 2022
[Preprint] "Bag of Tricks for Training Deeper Graph Neural Networks A Comprehensive Benchmark Study" by Tianlong Chen*, Kaixiong Zhou*, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study Codes for [Preprint] Bag of Tricks for Training Deeper Graph

VITA 101 Dec 29, 2022
Code for the paper: Adversarial Machine Learning: Bayesian Perspectives

Code for the paper: Adversarial Machine Learning: Bayesian Perspectives This repository contains code for reproducing the experiments in the ** Advers

Roi Naveiro 2 Nov 11, 2022
3D-printable hand-strapped keyboard

Note: This repo has not been cleaned up and prepared for general consumption at all. This is just a dump of the project files. If there is any interes

Wojciech Baranowski 41 Dec 31, 2022
Texture mapping with variational auto-encoders

vae-textures This is an experiment with using variational autoencoders (VAEs) to perform mesh parameterization. This was also my first project using J

Alex Nichol 41 May 24, 2022
functorch is a prototype of JAX-like composable function transforms for PyTorch.

functorch is a prototype of JAX-like composable function transforms for PyTorch.

Facebook Research 1.2k Jan 09, 2023
Object Depth via Motion and Detection Dataset

ODMD Dataset ODMD is the first dataset for learning Object Depth via Motion and Detection. ODMD training data are configurable and extensible, with ea

Brent Griffin 172 Dec 21, 2022
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)

A simple API for working with University of California, Irvine (UCI) Machine Learning (ML) repository Table of Contents Introduction About Page of the

Tirthajyoti Sarkar 223 Dec 05, 2022
Software for Multimodalty 2D+3D Facial Expression Recognition (FER) UI

EmotionUI Software for Multimodalty 2D+3D Facial Expression Recognition (FER) UI. demo screenshot (with RealSense) required packages Python = 3.6 num

Yang Jiao 2 Dec 23, 2021
PyElecCL - Electron Monte Carlo Second Checks

PyElecCL Python program to perform second checks for electron Monte Carlo radiat

Reese Haywood 3 Feb 22, 2022
🎯 A comprehensive gradient-free optimization framework written in Python

Solid is a Python framework for gradient-free optimization. It contains basic versions of many of the most common optimization algorithms that do not

Devin Soni 565 Dec 26, 2022
CATE: Computation-aware Neural Architecture Encoding with Transformers

CATE: Computation-aware Neural Architecture Encoding with Transformers Code for paper: CATE: Computation-aware Neural Architecture Encoding with Trans

16 Dec 27, 2022
Code release for the ICML 2021 paper "PixelTransformer: Sample Conditioned Signal Generation".

PixelTransformer Code release for the ICML 2021 paper "PixelTransformer: Sample Conditioned Signal Generation". Project Page Installation Please insta

Shubham Tulsiani 24 Dec 17, 2022
A simple configurable bot for sending arXiv article alert by mail

arXiv-newsletter A simple configurable bot for sending arXiv article alert by mail. Prerequisites PyYAML=5.3.1 arxiv=1.4.0 Configuration All config

SXKDZ 21 Nov 09, 2022
OverFeat is a Convolutional Network-based image classifier and feature extractor.

OverFeat OverFeat is a Convolutional Network-based image classifier and feature extractor. OverFeat was trained on the ImageNet dataset and participat

593 Dec 08, 2022
Compositional and Parameter-Efficient Representations for Large Knowledge Graphs

NodePiece - Compositional and Parameter-Efficient Representations for Large Knowledge Graphs NodePiece is a "tokenizer" for reducing entity vocabulary

Michael Galkin 107 Jan 04, 2023
[内测中]前向式Python环境快捷封装工具,快速将Python打包为EXE并添加CUDA、NoAVX等支持。

QPT - Quick packaging tool 快捷封装工具 GitHub主页 | Gitee主页 QPT是一款可以“模拟”开发环境的多功能封装工具,最短只需一行命令即可将普通的Python脚本打包成EXE可执行程序,并选择性添加CUDA和NoAVX的支持,尽可能兼容更多的用户环境。 感觉还可

QPT Family 545 Dec 28, 2022
An alarm clock coded in Python 3 with Tkinter

Tkinter-Alarm-Clock An alarm clock coded in Python 3 with Tkinter. Run python3 Tkinter Alarm Clock.py in a terminal if you have Python 3. NOTE: This p

CodeMaster7000 1 Dec 25, 2021
ML-Ensemble – high performance ensemble learning

A Python library for high performance ensemble learning ML-Ensemble combines a Scikit-learn high-level API with a low-level computational graph framew

Sebastian Flennerhag 764 Dec 31, 2022
A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes.

OMNI A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes. Why? When I finished my Kubernetes cluster using a few Raspber

Matias Godoy 148 Dec 29, 2022