Persian Kaldi profile for Rhasspy built from open speech data

Persian Kaldi Profile

A Rhasspy profile for Persian (fa).

Installation

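Get started by installing Vosk together with librosa and NumPy, which transcribe.py also requires:

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
pip3 install --upgrade wheel setuptools

# Install Vosk and the audio libraries used by transcribe.py
pip3 install vosk librosa numpy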

Next, download the model and extract it:

wget 'https://github.com/rhasspy/fa_kaldi-rhasspy/releases/download/v1.0/vosk-model-small-fa-rhasspy-0.15.zip'
unzip vosk-model-small-fa-rhasspy-0.15.zip

Finally, run the transcribe.py Python program with the model and an audio file:

python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 welcome.wav

{"result": [{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, {"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], "text": "خوش آمدید"}

For each audio file given to transcribe.py, a line of JSON will be printed in the output with the transcription details.
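
As a rough sketch of how this output could be consumed, the snippet below parses one such JSON line using only the Python standard library. The function name parse_transcription is hypothetical and not part of this repository; the sample line is copied from the output shown above, and the field names (result, text, word, start, end, conf) are taken from it.

```python
import json

# Sample line copied from the transcribe.py output shown in this README.
line = (
    '{"result": [{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, '
    '{"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], '
    '"text": "خوش آمدید"}'
)


def parse_transcription(json_line):
    """Hypothetical helper: return (text, words) for one output line.

    words is a list of (word, start, end, conf) tuples; it is empty
    if the line has no "result" key.
    """
    data = json.loads(json_line)
    words = [
        (w["word"], w["start"], w["end"], w["conf"])
        for w in data.get("result", [])
    ]
    return data.get("text", ""), words


text, words = parse_transcription(line)
print(text)
for word, start, end, conf in words:
    # Print the time span of each word in seconds along with its confidence.
    print(f"{start:.2f}-{end:.2f}s {word} (conf={conf:.2f})")
```

When several audio files are transcribed, the same function could be applied to each printed line in turn, for example by reading the program's standard output line by line.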

Comments
  •  PySoundFile failed. Trying audioread instead.

    I tried to run this command: python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 MyFile.mp3

    and got this warning:

    /your/path/.venv/lib/python3.9/site-packages/librosa/util/decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
      return f(*args, **kwargs)  
    

    Thank you so much

    opened by GameO7er 1
  • ModuleNotFoundError: No module named 'librosa'

    I got this error after following the instructions in Readme.md line by line. I am reporting it in case it helps others run the script successfully.

    Traceback (most recent call last):
      File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
        import librosa
    ModuleNotFoundError: No module named 'librosa'
    

    Thank you so much.

    opened by GameO7er 1
  • ModuleNotFoundError: No module named 'numpy'

    I got this error after following the instructions in Readme.md line by line. I am reporting it in case it helps others run the script successfully.

    Traceback (most recent call last):
      File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
        import librosa
    ModuleNotFoundError: No module named 'numpy'
    

    Thank you so much.

    opened by GameO7er 1
  • Error using recipes

    Hello, thanks for your great work and for sharing this useful repo. I tried to use your recipes to train on Persian data. In run.sh, an error occurred while adapting lm.arpa and creating G.fst:

    creating G.fst...
    arpa2fst -
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
    FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
    ERROR: FstHeader::Read: Bad FST header: standard input
    

    The full run.sh output is:

    Runtime configuration is: nJobs 12, nDecodeJobs 12. If this is not what you want, edit cmd.sh
    Starting at stage 0, train_stage -10
    
    Prepare phoneme data for Kaldi
    
    utils/prepare_lang.sh data/local/dict <unk> data/local/lang data/lang
    Checking data/local/dict/silence_phones.txt ...
    --> reading data/local/dict/silence_phones.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/silence_phones.txt is OK
    
    Checking data/local/dict/optional_silence.txt ...
    --> reading data/local/dict/optional_silence.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/optional_silence.txt is OK
    
    Checking data/local/dict/nonsilence_phones.txt ...
    --> reading data/local/dict/nonsilence_phones.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/nonsilence_phones.txt is OK
    
    Checking disjoint: silence_phones.txt, nonsilence_phones.txt
    --> disjoint property is OK.
    
    Checking data/local/dict/lexicon.txt
    --> reading data/local/dict/lexicon.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/lexicon.txt is OK
    
    Checking data/local/dict/extra_questions.txt ...
    --> reading data/local/dict/extra_questions.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/extra_questions.txt is OK
    --> SUCCESS [validating dictionary directory data/local/dict]
    
    **Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
    fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
    prepare_lang.sh: validating output directory
    utils/validate_lang.pl data/lang
    Checking existence of separator file
    separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
    Checking data/lang/phones.txt ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/lang/phones.txt is OK
    
    Checking words.txt: #0 ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/lang/words.txt is OK
    
    Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
    --> silence.txt and nonsilence.txt are disjoint
    --> silence.txt and disambig.txt are disjoint
    --> disambig.txt and nonsilence.txt are disjoint
    --> disjoint property is OK
    
    Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
    --> found no unexplainable phones in phones.txt
    
    Checking data/lang/phones/context_indep.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 15 entry/entries in data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.{txt, int, csl} are OK
    
    Checking data/lang/phones/nonsilence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 116 entry/entries in data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.{txt, int, csl} are OK
    
    Checking data/lang/phones/silence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 15 entry/entries in data/lang/phones/silence.txt
    --> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
    --> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
    --> data/lang/phones/silence.{txt, int, csl} are OK
    
    Checking data/lang/phones/optional_silence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 1 entry/entries in data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.{txt, int, csl} are OK
    
    Checking data/lang/phones/disambig.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 14 entry/entries in data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.{txt, int, csl} are OK
    
    Checking data/lang/phones/roots.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 32 entry/entries in data/lang/phones/roots.txt
    --> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
    --> data/lang/phones/roots.{txt, int} are OK
    
    Checking data/lang/phones/sets.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 32 entry/entries in data/lang/phones/sets.txt
    --> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
    --> data/lang/phones/sets.{txt, int} are OK
    
    Checking data/lang/phones/extra_questions.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 11 entry/entries in data/lang/phones/extra_questions.txt
    --> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
    --> data/lang/phones/extra_questions.{txt, int} are OK
    
    Checking data/lang/phones/word_boundary.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 131 entry/entries in data/lang/phones/word_boundary.txt
    --> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
    --> data/lang/phones/word_boundary.{txt, int} are OK
    
    Checking optional_silence.txt ...
    --> reading data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.txt is OK
    
    Checking disambiguation symbols: #0 and #1
    --> data/lang/phones/disambig.txt has "#0" and "#1"
    --> data/lang/phones/disambig.txt is OK
    
    Checking topo ...
    
    Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
    --> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
    --> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
    --> data/lang/phones/word_boundary.txt is OK
    
    Checking word-level disambiguation symbols...
    --> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
    Checking word_boundary.int and disambig.int
    --> generating a 35 word/subword sequence
    --> resulting phone sequence from L.fst corresponds to the word sequence
    --> L.fst is OK
    --> generating a 45 word/subword sequence
    --> resulting phone sequence from L_disambig.fst corresponds to the word sequence
    --> L_disambig.fst is OK
    
    Checking data/lang/oov.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 1 entry/entries in data/lang/oov.txt
    --> data/lang/oov.int corresponds to data/lang/oov.txt
    --> data/lang/oov.{txt, int} are OK
    
    --> data/lang/L.fst is olabel sorted
    --> data/lang/L_disambig.fst is olabel sorted
    --> SUCCESS [validating lang directory data/lang]
    
    adapt our LM for kaldi...
    
    
    creating G.fst...
    arpa2fst -
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
    FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
    ERROR: FstHeader::Read: Bad FST header: standard input
    
    make mfcc
    
    fix_data_dir.sh: kept all 12394 utterances.
    fix_data_dir.sh: old files are kept in data/train/.backup
    mkdir: cannot create directory 'data/train/wav.scp': File exists
    steps/make_mfcc.sh --cmd utils/run.pl --nj 12 data/train exp/make_mfcc_chain/train mfcc_chain
    utils/validate_data_dir.sh: Successfully validated data-directory data/train
    steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
    

    Can you please help me fix this issue? Thanks.

    opened by MahdiEsrafili 0
Owner
Rhasspy
Offline voice assistant