kaldi-asr/kaldi is the official location of the Kaldi project.

Last update: Jan 05, 2023

Overview

Kaldi Speech Recognition Toolkit

To build the toolkit: see ./INSTALL. These instructions apply to UNIX systems, including various flavors of Linux, Darwin, and Cygwin; they have not been tested on more "exotic" varieties of UNIX. For Windows installation instructions (excluding Cygwin), see windows/INSTALL.

To run the example system builds, see egs/README.txt.

If you encounter problems (and you probably will), please do not hesitate to contact the developers (see below). Besides specific questions, please tell us which aspects of the project you feel could be improved or find confusing, and which missing features you would most like to see.

Kaldi information channels

For HOT news about Kaldi see the project site.

Documentation of Kaldi:

Info about the project, description of techniques, tutorial for C++ coding.
Doxygen reference of the C++ code.

Kaldi forums and mailing lists:

We have two different lists:

User list: kaldi-help
Developer list: kaldi-developers

To sign up for either mailing list, go to http://kaldi-asr.org/forums.html.

Development pattern for contributors

Create a personal fork of the main Kaldi repository in GitHub.
Make your changes in a named branch other than master, e.g. a branch called my-awesome-feature.
Generate a pull request through the Web interface of GitHub.
As a general rule, please follow the Google C++ Style Guide. There are a few exceptions in Kaldi. You can use Google's cpplint.py to verify that your code is free of basic mistakes.
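
The branch-and-push part of this pattern can be sketched on the command line. This is only an illustration, not an official Kaldi script: a local bare repository stands in for your personal GitHub fork so that the commands run offline, and the file name my_change.cc, the commit messages, and the contributor identity are all placeholders.

```shell
#!/bin/sh
# Illustrative sketch of the fork-and-branch workflow (assumptions: a local
# bare repository plays the role of your GitHub fork; my_change.cc and the
# identity below are placeholders).
set -e

workdir=$(mktemp -d)
git init --quiet --bare "$workdir/fork.git"   # stand-in for your fork
git init --quiet "$workdir/kaldi"             # stand-in for your local clone
cd "$workdir/kaldi"
git symbolic-ref HEAD refs/heads/master       # make sure we start on master
git remote add origin "$workdir/fork.git"
git config user.name "Example Contributor"    # placeholder identity
git config user.email "contributor@example.com"

# Give the stand-in repository an initial commit on master.
git commit --quiet --allow-empty -m "Initial commit"
git push --quiet origin master

# Make your changes in a named branch, not in master.
git checkout --quiet -b my-awesome-feature
echo "// placeholder change" > my_change.cc
git add my_change.cc
git commit --quiet -m "Add my awesome feature"

# Publish the branch to the fork; the pull request itself is then opened
# from GitHub's web interface.
git push --quiet origin my-awesome-feature
```

With a real fork you would skip the two `git init` lines and the `git remote add` line and instead clone your fork from GitHub; everything from `git checkout -b` onward stays the same.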

Platform specific notes

PowerPC 64-bit little-endian (ppc64le)

Kaldi is expected to work out of the box in RHEL >= 7 and Ubuntu >= 16.04 with OpenBLAS, ATLAS, or CUDA.
CUDA drivers for ppc64le can be found at https://developer.nvidia.com/cuda-downloads.
An IBM Redbook is available as a guide to installing and configuring CUDA.

Android

Kaldi supports cross-compiling for Android using the Android NDK, clang++ and OpenBLAS.
See this blog post for details.
