STFT_Transformer

Code for the STFT Transformer used in the BirdCLEF 2021 competition.

The STFT Transformer is a new way to use Transformers on audio data, similar to Vision Transformers. It was developed for the BirdCLEF 2021 competition hosted on Kaggle. The PDF document gives more context; it has been submitted to the BirdCLEF 2021 workshop.

The code is provided as is; it has not been rewritten. Because competitions are done in a hurry, the code may not meet usual open source standards.

The code assumes this directory structure:

<base_dir>/code

<base_dir>/input

<base_dir>/input/freefield1010

<base_dir>/checkpoints

<base_dir>/data
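
The layout above can be created or checked with a few lines of Python. This is only a minimal sketch: the helper names `make_layout` and `missing_dirs` are hypothetical and are not part of the repository, and the list of sub-directories is copied from the structure listed above.

```python
from pathlib import Path

# Sub-directories listed in this README, relative to <base_dir>.
SUBDIRS = ("code", "input", "input/freefield1010", "checkpoints", "data")


def make_layout(base_dir):
    """Create the expected sub-directories under base_dir and return their paths."""
    base = Path(base_dir)
    created = []
    for name in SUBDIRS:
        path = base / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created


def missing_dirs(base_dir):
    """Return the expected sub-directories that do not exist yet."""
    base = Path(base_dir)
    return [name for name in SUBDIRS if not (base / name).is_dir()]
```

Calling `missing_dirs(base_dir)` before running any script is a cheap way to catch a misplaced download early.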

The code has to be run from the code directory. The competition data has to be downloaded into the input directory, and the freefield1010 data into the freefield1010 directory. Run data_final.py first: it reads the audio files from input and stores the relevant parts in the data directory as NumPy files.

Then stft_transformer_final.py can be run to train a model for one fold. During the competition I ran 5 folds by editing the FOLD global variable in the script (I know, this is substandard).
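
For readers who would rather not edit the script for each fold, one possible alternative is to read the fold from the command line. The sketch below is not how the competition code works: `parse_fold`, the `--fold` flag, and the 0-based fold numbering are all assumptions made for illustration.

```python
import argparse

N_FOLDS = 5  # the README mentions 5 folds; 0-based numbering is an assumption


def parse_fold(argv=None):
    """Return the fold index given on the command line (hypothetical --fold flag)."""
    parser = argparse.ArgumentParser(description="Train the model for one fold.")
    parser.add_argument(
        "--fold",
        type=int,
        choices=range(N_FOLDS),  # reject values outside 0..N_FOLDS-1
        default=0,
        metavar="FOLD",
        help="index of the fold to train, from 0 to N_FOLDS - 1",
    )
    return parser.parse_args(argv).fold


# In a training script, the hand-edited global could then become:
# FOLD = parse_fold()
```

With something like this in place, the five runs could be launched from a shell loop instead of five manual edits.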

Once all 5 models are trained, one can upload the weights to a Kaggle dataset and use the submission notebook I used. This should give a score worth the 15th rank in the competition. Achieving this rank with a single model is significant, as all top teams used an ensemble of models.

Owner

Jean-François Puget
