Self-Supervised Speech Pre-training and Representation Learning Toolkit.

Overview




What's New

  • Sep 2021: We are hosting a challenge at the AAAI workshop The 2nd Self-supervised Learning for Audio and Speech Processing! See the SUPERB official site for the challenge details and the SUPERB documentation in this toolkit!
  • Aug 2021: We now have a tutorial that introduces our toolkit; you can watch it on YouTube!
  • July 2021: We are now working on packaging s3prl and reorganizing the file structure in v0.3. Please consider using the stable v0.2.0 for now. We will test and release v0.3 before August.
  • June 2021: Support SUPERB: Speech processing Universal PERformance Benchmark, submitted to Interspeech 2021. Use the tag superb-interspeech2021 or v0.2.0.
  • June 2021: Support extracting multiple hidden states from the SSL pretrained models
  • Jan 2021: Readme updated with detailed instructions on how to use our latest version!
  • Dec 2020: We are migrating to a newer version for more general, flexible, and scalable code. See the introduction below for more information! The legacy version can be accessed via the tag v0.1.0.

Introduction and Usages

This is an open source toolkit called s3prl, which stands for Self-Supervised Speech Pre-training and Representation Learning. Self-supervised speech pre-trained models are called upstream in this toolkit, and are utilized in various downstream tasks.

The toolkit has three major usages:

Pretrain

  • Pretrain upstream models, including Mockingjay, Audio ALBERT and TERA.
  • Document: pretrain/README.md

Upstream

  • Easily load most of the existing upstream models with pretrained weights in a unified I/O interface.
  • Pretrained models are registered through torch.hub, which means you can use these models in your own project by one-line plug-and-play without depending on this toolkit's coding style.
  • Document: upstream/README.md
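
As a minimal sketch of the torch.hub route described above: the repository string "s3prl/s3prl" and the entry name wav2vec2_url both appear in user reports further down this page, while the helper names list_upstreams and load_upstream are hypothetical. The torch import is deferred so the helpers can be defined without PyTorch installed; calling them needs PyTorch and network access.

```python
def list_upstreams(repo="s3prl/s3prl"):
    """Return the entry-point names that the repo registers with torch.hub."""
    import torch  # deferred: only needed when the helper is actually called
    return torch.hub.list(repo)


def load_upstream(name, repo="s3prl/s3prl", **kwargs):
    """Load one upstream model by its torch.hub entry name."""
    import torch
    # Extra keyword arguments are forwarded to the entry point,
    # e.g. ckpt=<url> for the wav2vec2_url entry.
    return torch.hub.load(repo, name, **kwargs)


# Example call (not executed here because it downloads code and weights):
# upstream = load_upstream("wav2vec2_url", ckpt="<checkpoint url>")
```

Listing the registered names first is a cheap way to check whether a given entry exists before trying to load it.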

Downstream

Below is an intuitive illustration on how this toolkit may help you:



Feel free to use or modify our toolkit in your research. Here is a list of papers using our toolkit. Any question, bug report or improvement suggestion is welcome through opening up a new issue.

If you find this toolkit helpful to your research, please do consider citing our papers, thanks!

Installation

  1. Python >= 3.6
  2. Install sox on your OS
  3. Install s3prl
pip install -e ./
  4. Install the specific fairseq
pip install [email protected]+https://github.com//pytorch/[email protected]#egg=fairseq
  5. Some upstream models require special dependencies. If you encounter an error with a specific upstream model, look into the README.md under that upstream's folder, e.g. upstream/pase/README.md
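
Before installing, the first two prerequisites can be checked from Python. This is only a sketch using the standard library; the function name check_prerequisites is hypothetical and is not part of s3prl.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 6), tools=("sox",)):
    """Report whether the interpreter and required command-line tools are available."""
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool + "_found"] = shutil.which(tool) is not None
    return report


print(check_prerequisites())
```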

Development pattern for contributors

  1. Create a personal fork of the main S3PRL repository in GitHub.
  2. Make your changes in a named branch different from master, e.g. you create a branch new-awesome-feature.
  3. Contact us if you have any questions during development.
  4. Generate a pull request through the Web interface of GitHub.
  5. Please verify that your code is free of basic mistakes. We appreciate any contribution!

Reference Repositories

License

The majority of the S3PRL Toolkit is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: S3PRL is licensed under the MIT license.

Used by

List of papers that used our toolkit (Feel free to add your own paper by making a pull request)

Self-Supervised Pretraining

Explainability

Adversarial Attack

Voice Conversion

Benchmark and Evaluation

  • SUPERB: Speech processing Universal PERformance Benchmark (Yang et al., 2021)

    @misc{superb,
          title={SUPERB: Speech processing Universal PERformance Benchmark}, 
          author={Shu-wen Yang and Po-Han Chi and Yung-Sung Chuang and Cheng-I Jeff Lai and Kushal Lakhotia and Yist Y. Lin and Andy T. Liu and Jiatong Shi and Xuankai Chang and Guan-Ting Lin and Tzu-Hsien Huang and Wei-Cheng Tseng and Ko-tik Lee and Da-Rong Liu and Zili Huang and Shuyan Dong and Shang-Wen Li and Shinji Watanabe and Abdelrahman Mohamed and Hung-yi Lee},
          year={2021},
          eprint={2105.01051},
          archivePrefix={arXiv},
          primaryClass={cs.CL}
    }
    
  • Utilizing Self-supervised Representations for MOS Prediction (Tseng et al., 2021)

    @misc{ssr_mos,
        title={Utilizing Self-supervised Representations for MOS Prediction}, 
        author={Wei-Cheng Tseng and Chien-yu Huang and Wei-Tsung Kao and Yist Y. Lin and Hung-yi Lee},
        year={2021},
        eprint={2104.03017},
        archivePrefix={arXiv},
        primaryClass={eess.AS}
    }
    


Citation

If you find this toolkit useful, please consider citing the following papers.

  • If you use our pre-training scripts, or the downstream tasks considered in TERA and Mockingjay, please consider citing the following:
@misc{tera,
  title={TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech},
  author={Andy T. Liu and Shang-Wen Li and Hung-yi Lee},
  year={2020},
  eprint={2007.06028},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}
@article{mockingjay,
   title={Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders},
   ISBN={9781509066315},
   url={http://dx.doi.org/10.1109/ICASSP40776.2020.9054458},
   DOI={10.1109/icassp40776.2020.9054458},
   journal={ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
   publisher={IEEE},
   author={Liu, Andy T. and Yang, Shu-wen and Chi, Po-Han and Hsu, Po-chun and Lee, Hung-yi},
   year={2020},
   month={May}
}
  • If you use our organized upstream interface and features, or the SUPERB downstream benchmark, please consider citing the following:
@inproceedings{yang21c_interspeech,
  author={Shu-wen Yang and Po-Han Chi and Yung-Sung Chuang and Cheng-I Jeff Lai and Kushal Lakhotia and Yist Y. Lin and Andy T. Liu and Jiatong Shi and Xuankai Chang and Guan-Ting Lin and Tzu-Hsien Huang and Wei-Cheng Tseng and Ko-tik Lee and Da-Rong Liu and Zili Huang and Shuyan Dong and Shang-Wen Li and Shinji Watanabe and Abdelrahman Mohamed and Hung-yi Lee},
  title={{SUPERB: Speech Processing Universal PERformance Benchmark}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1194--1198},
  doi={10.21437/Interspeech.2021-1775}
}
Comments
  • module 'hub' has no attribute 'mockingjay_local'

    Hello. I am trying to run the Mockingjay downstream task on an HPC using this command:

        python run_downstream.py -m train -u mockingjay_local -k '<path to .ckpt>' -d phone_linear -n mockingjayDown

    I am getting the following error:

      File "run_downstream.py", line 225, in <module>
        main()
      File "run_downstream.py", line 220, in main
        runner = Runner(args, config)
      File "<path>/s3prl/downstream/runner.py", line 103, in __init__
        self.upstream = self._get_upstream()
      File "<path>/s3prl/downstream/runner.py", line 143, in _get_upstream
        Upstream = getattr(hub, self.args.upstream)
    AttributeError: module 'hub' has no attribute 'mockingjay_local'
    

    Please let me know how to resolve the issue or if I need to provide more details. Thanks!

    opened by MiPlayer123 20
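
The traceback above shows the upstream being resolved with getattr(hub, name), so a missing entry surfaces as a bare AttributeError. The sketch below illustrates one way to make such a lookup report what is actually registered. It is an illustration only: get_entry is a hypothetical helper, and fake_hub is a stand-in module with made-up entries, not the real hub module.

```python
import difflib
import types


def get_entry(hub_module, name):
    """Look up a hub entry by name, raising a more helpful error when it is missing."""
    try:
        return getattr(hub_module, name)
    except AttributeError:
        # Public attributes of the module are treated as the registered entries.
        available = sorted(n for n in dir(hub_module) if not n.startswith("_"))
        close = difflib.get_close_matches(name, available, n=3)
        hint = f" Did you mean: {', '.join(close)}?" if close else ""
        raise AttributeError(
            f"upstream '{name}' is not registered.{hint} "
            f"Available entries: {available}"
        ) from None


# Stand-in for the real `hub` module, with made-up entries for illustration only.
fake_hub = types.ModuleType("hub")
fake_hub.mockingjay = lambda *args, **kwargs: "mockingjay model"
fake_hub.tera_local = lambda ckpt, *args, **kwargs: f"tera from {ckpt}"

try:
    get_entry(fake_hub, "mockingjay_local")
except AttributeError as err:
    print(err)
```

Printing the available names narrows the problem down to either a misspelled entry or an entry that was never registered in the first place.
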
  • Speaker Diarization Scoring

    Add NIST scoring for standard diarization error rate (der)

    The results on three models (upstream + downstream):

    1. baseline(fbank) + rnn 7.03
    2. apc + rnn 7.20
    3. wav2vec2 + rnn 4.36
    opened by ftshijt 20
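
For reference, the diarization error rate is conventionally the sum of missed speech, false-alarm speech, and speaker-confusion time divided by the total reference speech time. The sketch below computes it from those four durations; the function name and the sample durations are made up for illustration and are not taken from the experiments above.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """DER = (missed speech + false alarm + speaker confusion) / total reference speech."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Made-up durations in seconds.
der = diarization_error_rate(missed=12.0, false_alarm=8.0, confusion=15.0, total_speech=500.0)
print(f"DER = {100 * der:.2f}%")
```
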
  • There are tasks that ESPNET does with S3PRL that fail

      File "/media/shiyanshi/E/espnet/espnet2/asr/frontend/s3prl.py", line 26, in __init__
        import s3prl.nn
    ModuleNotFoundError: No module named 's3prl.nn'
    Error: S3PRL is not properly installed. Please install S3PRL: cd ${MAIN_ROOT}/tools && make s3prl.done

    But S3PRL is successfully installed and can be imported in the terminal. How do I fix it?

    enhancement 
    opened by abcdbosh 18
  • Upstream request: wavLM

    I see that WavLM now tops all of the SUPERB tasks (10 tasks), so I would like to request adding this audio embedding as an upstream.

    Paper: https://arxiv.org/pdf/2110.13900.pdf Code/Model: https://github.com/microsoft/unilm/tree/master/wavlm

    Currently, only base and base+ models are available; the large version will be added soon.

    opened by bagustris 16
  • The model rewrite in config is not reflected

    Hi, thank you for a great repository!

    I'm running a downstream task in ER. I wanted to change the neural network CNNselfAttention to FCN, so I ran the following, but the network doesn't seem to have changed. It is reflected in the config*.yaml in /result/downstream/ExpName. But the training results are the same as the default (CNNSelfattention)

    The code I ran:

        python3 run_downstream.py -n ExpName -m train -u fbank -d emotion -c downstream/emotion/config.yaml -o "config.downstream_expert.modelrc.DeepModel.model_type='FCN'"

    Excuse me, how can I change this to FCN?

    opened by miyazakieiji 16
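
To illustrate what a dotted override like the one in this report is expected to do, here is a minimal sketch that applies a single "a.b.c=value" string to a nested dictionary. It is not s3prl's actual override parser: apply_override is a hypothetical helper, and the config contents below are assumed from the key path in the command.

```python
import ast


def apply_override(config, override):
    """Apply one 'a.b.c=value' override to a nested dict in place."""
    path, _, raw = override.partition("=")
    keys = path.strip().split(".")
    if keys[0] == "config":  # drop the leading namespace used on the command line
        keys = keys[1:]
    node = config
    for key in keys[:-1]:
        if key not in node:
            raise KeyError(f"'{key}' not found while resolving '{path}'")
        node = node[key]
    if keys[-1] not in node:
        # Refusing unknown keys catches typos that would otherwise be silently ignored.
        raise KeyError(f"'{keys[-1]}' is not an existing option in '{path}'")
    node[keys[-1]] = ast.literal_eval(raw.strip())  # parses 'FCN', 3, 1e-4, True, ...
    return config


# Assumed config layout, reconstructed from the override string only.
config = {"downstream_expert": {"modelrc": {"DeepModel": {"model_type": "CNNSelfAttention"}}}}
apply_override(config, "config.downstream_expert.modelrc.DeepModel.model_type='FCN'")
print(config["downstream_expert"]["modelrc"]["DeepModel"]["model_type"])
```

If the saved config shows the new value but training behaves as before, the code that builds the model may be reading a different key than the one overridden; this sketch cannot confirm whether that is the case here.
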
  • Why is such a large memory cost on gpu

    Hello! I was trying to run a "Hubert + PR" experiment on a single GPU. I noticed that the task uses about 40+ GB of GPU memory when training starts. After training for some time, it reported "cuda out of memory" and I had to stop the task. I encountered a similar situation when running the "Wavlm + ASR" experiment, which used about 30 GB of memory. Such a large memory cost didn't appear in other downstream tasks such as KS and IC. I ran all the experiments with the default config.yaml. So why do these tasks use so much memory? Is this normal?

    opened by TCL606 15
  • Error while loading finetuned wav2vec 2.0 large

    Hi, following the slides, I tried to load wav2vec2 with the code below and got this error:

    upstream = torch.hub.load("s3prl/s3prl",'wav2vec2_url',ckpt = 'https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_vox_new.pt')
    Using cache found in /home/sreyan/.cache/torch/hub/s3prl_s3prl_master
    Using cache found in /home/sreyan/.cache/torch/hub/s3prl_cache/1c76d6e88090f01736036b28dc995fef583f47f42662d55286332557f957609f for https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_vox_new.pt
    Traceback (most recent call last):
      File "", line 1, in
      File "/home/sreyan/.conda/envs/semeval/lib/python3.7/site-packages/torch/hub.py", line 370, in load
        model = _load_local(repo_or_dir, model, *args, **kwargs)
      File "/home/sreyan/.conda/envs/semeval/lib/python3.7/site-packages/torch/hub.py", line 399, in _load_local
        model = entry(*args, **kwargs)
      File "/home/sreyan/.cache/torch/hub/s3prl_s3prl_master/upstream/wav2vec2/hubconf.py", line 23, in wav2vec2_url
        return wav2vec2_local(_urls_to_filepaths(ckpt, refresh=refresh), *args, **kwargs)
      File "/home/sreyan/.cache/torch/hub/s3prl_s3prl_master/upstream/wav2vec2/hubconf.py", line 14, in wav2vec2_local
        return _UpstreamExpert(ckpt, *args, **kwargs)
      File "/home/sreyan/.cache/torch/hub/s3prl_s3prl_master/upstream/wav2vec2/expert.py", line 24, in __init__
        model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([ckpt])
      File "/home/sreyan/fairseq/fairseq/checkpoint_utils.py", line 339, in load_model_ensemble_and_task
        state = load_checkpoint_to_cpu(filename, arg_overrides)
      File "/home/sreyan/fairseq/fairseq/checkpoint_utils.py", line 273, in load_checkpoint_to_cpu
        state = _upgrade_state_dict(state)
      File "/home/sreyan/fairseq/fairseq/checkpoint_utils.py", line 550, in _upgrade_state_dict
        state["cfg"] = convert_namespace_to_omegaconf(state["args"])
      File "/home/sreyan/fairseq/fairseq/dataclass/utils.py", line 351, in convert_namespace_to_omegaconf
        with initialize(config_path=config_path):
    AttributeError: __enter__

    I would like to further fine-tune the fine-tuned wav2vec 2.0 model on a speech sentiment task. Any help would be highly appreciated.

    opened by Sreyan88 14
  • about distilhubert

    When I run "python run_pretrain.py -u distiller -g pretrain/distiller/config_model.yaml -n distilhubert", I get this error:

      File "/home/wangsiyuan/kaldi-wavlm/s3prl-test/s3prl/pretrain/distiller/pretrain_expert.py", line 278, in forward
        teacher_hiddens = torch.stack(teacher_hiddens, dim=1)  # B x N x T x D
    RuntimeError: stack expects each tensor to be equal size, but got [18, 302, 768] at entry 0 and [18, 301, 768] at entry 1

    Tests show that the teacher model has 12 blocks and that the output of the 12th block differs by one frame from the outputs of the other blocks.

    After padding, another error occurs: the loss computation reports that the student model's output differs by one frame from the teacher model's output.

    Another error: when I use multiple GPUs, I get "IndexError: Caught IndexError in replica 0 on device 0." I have tried torch 1.9.0 and 1.10.1+cu111, and neither fixes it.

    opened by c976237222 13
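
The stacking error above comes from layer outputs whose time dimensions differ by one frame. One possible workaround, not a confirmed fix for this report, is to trim every layer to the shortest time length before stacking. The sketch below shows the idea on nested lists shaped [batch][time][dim] so that it runs without PyTorch; with tensors the equivalent slice would be h[:, :min_len]. The helper names are hypothetical.

```python
def trim_to_shortest(layers):
    """Trim each layer output (nested lists shaped [batch][time][dim]) to the shortest time length."""
    min_len = min(len(utt) for layer in layers for utt in layer)
    return [[utt[:min_len] for utt in layer] for layer in layers]


def zeros(batch, time, dim):
    """Build a [batch][time][dim] nested list of zeros for the demo."""
    return [[[0.0] * dim for _ in range(time)] for _ in range(batch)]


# Two layers whose time lengths differ by one frame, as in the report.
layers = [zeros(2, 4, 3), zeros(2, 3, 3)]
trimmed = trim_to_shortest(layers)
print([len(layer[0]) for layer in trimmed])
```

Whether dropping the extra frame is acceptable for the distillation loss, and whether the student output needs the same treatment, is left open here.
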
  • Integrate Hugging Face Hub & add Docker image

    This PR implements two main features:

    Integration with the 🤗 Hub for downstream fine-tuning.

    The --hub flag allows users to pick any (suitable) upstream model from the PyTorch or 🤗 Hubs, while the --push_to_hf_hub flag pushes all the artifacts from fine-tuning to the 🤗 Hub for inference / evaluation.

    A fine-tuning run with these flags looks like:

    python run_downstream.py -n exp_dir -m train -u ${upstream_model} -d ${downstream_task} --hub huggingface --push_to_hf_hub True
    

    Upstream models on the 🤗 Hub require an expert.py interface to be defined and you can find an example here.

    Downstream models are automatically wrapped in a model.py file that defines the interface for inference and you can find an example here. By default we use the *best*.ckpt checkpoint for inference / evaluation and fall back to the final checkpoint if a "best" one is not produced during training.
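
The checkpoint fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: pick_checkpoint is a hypothetical helper, the "final" checkpoint is approximated by the most recently modified .ckpt file, and the file names in the demo are made up.

```python
import os
import tempfile
from pathlib import Path


def pick_checkpoint(exp_dir):
    """Return a '*best*.ckpt' file if one exists, otherwise the most recently modified .ckpt."""
    exp_dir = Path(exp_dir)
    best = sorted(exp_dir.glob("*best*.ckpt"))
    if best:
        return best[0]
    others = sorted(exp_dir.glob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    if not others:
        raise FileNotFoundError(f"no .ckpt files in {exp_dir}")
    return others[-1]


with tempfile.TemporaryDirectory() as tmp:
    for step in (100, 200):
        path = Path(tmp) / f"states-{step}.ckpt"
        path.touch()
        os.utime(path, (step, step))  # give the files distinct, ordered timestamps
    print(pick_checkpoint(tmp).name)  # no "best" checkpoint exists yet
    (Path(tmp) / "dev-best.ckpt").touch()
    print(pick_checkpoint(tmp).name)  # the "best" checkpoint now takes priority
```
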

    By storing all the artifacts, we can visualize the Tensorboard logs and reproduce training runs if needed from the args_*.yaml and config_*.yaml files.

    Update: the tensorboard logs are only visible for public repos and by default we create a private repo (in case participants don't want to share their fine-tuned models with everyone). The participant can view the logs by simply making their repo public if they wish

    A Docker image for downstream fine-tuning

    This builds on the above Hub integration and should be runnable on any infra that has the NVIDIA Container Toolkit installed. See the downstream README for more details on how to build the image / run it. Once this PR is merged, an interesting exercise will be to see if you can run the Docker container on your own infra 😃

    Miscellaneous

    We have also included some changes to:

    • The downstream README
    • The ASR and SD modules now include a template folder for the 🤗 Hub interfaces

    cc @leo19941227

    opened by lewtun 13
  • train downstream ASR using own upstream

    Hi, I want to use the pretrained model for the downstream ASR task. However, there is no config file in the s3prl/downstream/asr/feat/ directory. Is the ASR task properly configured? Thanks.

    opened by zyzpower 13
  • (WIP) a better version of enhancement and separation downstream

    Hi @leo19941227 , I am making the pull request for a better version of enhancement and separation downstream. In this pull request, I

    • Added two new configs that have a much smaller model size and better performance
    • Made some small changes to the code, including (1) modifying the loss function, supporting L1 loss, and computing loss in log domain (for smaller input scale and more stable training) (2) removing the original postprocess function. Originally, I found there are some issues when I am using librosa.istft, and I am using the postprocess function to remove the impulse at the end of the signal. Now, I have found a better way to deal with this issue.
    opened by HuangZiliAndy 12
  • Is there no vq_apc local in s3prl?

    Hi, I pre-trained the vq_apc model for comparison, but when I tried to extract the feature representation of vq_apc, it failed.

    upstream=getattr(hub, 'vq_apc_local')('result/pretrain/vq_apc/states-epoch-50.ckpt')

    Can you add vq_apc_local?

    opened by kaen2891 0
  • SID task loss function.

    The ASV and SID tasks are very similar, yet they use different loss functions: ASV uses AM-softmax, while SID uses a plain softmax loss.

    Why was this choice made? And is it acceptable to change the loss function?

    opened by raotnameh 1
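
For readers comparing the two losses mentioned above, the sketch below computes both for a single sample. AM-softmax operates on cosine similarities between the embedding and each class weight: it subtracts a margin from the target-class cosine and scales all cosines before the usual softmax cross-entropy. The function names are hypothetical, and the scale and margin defaults are illustrative values, not the settings used in this toolkit.

```python
import math


def softmax_loss(logits, target):
    """Cross-entropy of one sample: -log softmax(logits)[target]."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


def am_softmax_loss(cosines, target, scale=30.0, margin=0.35):
    """AM-softmax: subtract a margin from the target cosine, scale, then apply softmax loss."""
    logits = [scale * (c - margin if i == target else c) for i, c in enumerate(cosines)]
    return softmax_loss(logits, target)


cosines = [0.8, 0.3, -0.1]  # made-up cosine similarities for three classes
print(softmax_loss(cosines, target=0))
print(am_softmax_loss(cosines, target=0))
```

With margin 0 and scale 1 the two functions coincide, which makes the margin's effect easy to isolate: it penalizes the target class until its cosine exceeds the others by at least the margin.
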
  • Bump setuptools from 59.5.0 to 65.5.1 in /requirements

    Bumps setuptools from 59.5.0 to 65.5.1.

    Changelog

    Sourced from setuptools's changelog.

    v65.5.1

    Misc ^^^^

    • #3638: Drop a test dependency on the mock package, always use :external+python:py:mod:unittest.mock -- by :user:hroncok
    • #3659: Fixed REDoS vector in package_index.

    v65.5.0

    Changes ^^^^^^^

    • #3624: Fixed editable install for multi-module/no-package src-layout projects.
    • #3626: Minor refactorings to support distutils using stdlib logging module.

    Documentation changes ^^^^^^^^^^^^^^^^^^^^^

    • #3419: Updated the example version numbers to be compliant with PEP-440 on the "Specifying Your Project’s Version" page of the user guide.

    Misc ^^^^

    • #3569: Improved information about conflicting entries in the current working directory and editable install (in documentation and as an informational warning).
    • #3576: Updated version of validate_pyproject.

    v65.4.1

    Misc ^^^^

    v65.4.0

    Changes ^^^^^^^

    v65.3.0

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • unspecified upstream models

    Hello, there are several unspecified upstream models in the s3prl hub, such as passt_base, ssast_frame_base, wav2vec2_base_s2st_en_librilight, and wav2vec2_conformer_large_s2st_en_librilight, among others. Can you provide an explanation for these models? Is there a place that lists the details of all the upstream models?

    opened by marziye-A 0
  • ContentVec support

    opened by vectominist 0
Releases(v0.3.4)
Owner
s3prl
The Self-Supervised Speech Pre-training and Representation Learning Toolkit Development Team