A Python script to navigate and extract the FSD50K dataset

Last update: Nov 23, 2021

Overview

FSD50K navigator

This is a script I use to navigate the FSD50K sound dataset: https://zenodo.org/record/4060432#.YYYJ9HtBxH4

To use it, download the dataset files from the link above. The categories are listed in the ground truth file, which is available from the same link.

The repository contains several scripts:

find_audio.py - this is the script you want
convert.py - this exists only so that I can convert data for a Wio Terminal project: https://wiki.seeedstudio.com/Wio-Terminal-TinyML-EI-3/
sound_category.py - shows the categories of a file

Usage

Installation

pipenv install
Start the shell with pipenv shell, then use the scripts as described below.

find_audio.py

Usage

Basic use - python find_audio.py categories
Show all labels of each file - python find_audio.py categories -s
Exclude files with certain categories - python find_audio.py categories -x xcategories
Search multiple categories - python find_audio.py cat1,cat2,etc
Search multiple categories and exclude others - python find_audio.py cat1,cat2,etc -x xcat1,xcat2
Help - python find_audio.py -h
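
The include/exclude behaviour described above can be sketched roughly as follows. This is not the actual code of find_audio.py: it assumes the ground truth CSV has `fname` and `labels` columns, with the labels stored as a comma-separated string, and it assumes a file matches when it carries at least one of the requested categories and none of the excluded ones. The helper names and the sample rows are made up for illustration.

```python
import csv
import io


def parse_categories(arg):
    """Split a comma-separated CLI argument such as 'cat1,cat2' into a set."""
    return {name.strip() for name in arg.split(",") if name.strip()}


def find_audio(rows, include, exclude=frozenset()):
    """Yield (fname, labels) for rows with any included and no excluded label."""
    for row in rows:
        labels = set(row["labels"].split(","))
        if labels & include and not labels & exclude:
            yield row["fname"], sorted(labels)


# Tiny in-memory stand-in for the ground truth CSV (made-up rows, assumed columns).
SAMPLE = io.StringIO(
    'fname,labels\n'
    '1001,"Bark,Dog,Animal"\n'
    '1002,"Meow,Cat,Animal"\n'
    '1003,"Guitar,Music"\n'
)

rows = list(csv.DictReader(SAMPLE))

# Roughly what "find_audio.py Animal -x Cat" might select under these assumptions.
matches = list(find_audio(rows, parse_categories("Animal"), parse_categories("Cat")))
print(matches)  # [('1001', ['Animal', 'Bark', 'Dog'])]
```

To run the same filter against the real dataset, replace `SAMPLE` with an open handle on the downloaded ground truth file, after checking that its column names match the assumption.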

convert.py

Usage is similar to find_audio.py, with the additional options -p, -i and -o.

Basic use - python convert.py categories -p new_category_name -i source_audio_directory -o output_directory
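
The README does not say what convert.py does with the selected files, so the following is only one plausible reading of the `-p`, `-i` and `-o` options: copy each selected clip from the source directory into the output directory, renamed with the new category name as a prefix. The function name, the `<label>.<fname>.wav` naming scheme, and the sample file names are all assumptions made for this sketch.

```python
import shutil
import tempfile
from pathlib import Path


def copy_with_label(fnames, label, source_dir, output_dir):
    """Copy <fname>.wav from source_dir to output_dir as <label>.<fname>.wav."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for fname in fnames:
        src = source_dir / f"{fname}.wav"
        if not src.is_file():
            continue  # skip clips that are not present in this source directory
        dst = output_dir / f"{label}.{fname}.wav"
        shutil.copy2(src, dst)
        copied.append(dst.name)
    return copied


# Demonstration with throwaway directories and an empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "src"
    src_dir.mkdir()
    (src_dir / "1001.wav").write_bytes(b"")
    result = copy_with_label(["1001", "9999"], "dog", src_dir, Path(tmp) / "out")
    print(result)  # ['dog.1001.wav']
```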

sound_category.py

Basic use - python sound_category.py audio_name
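
A lookup of this kind can be sketched in a few lines. As before, this is an illustration rather than the script's real implementation: it assumes the ground truth CSV has `fname` and `labels` columns, and the function name and sample rows are hypothetical.

```python
import csv
import io
from pathlib import Path


def labels_for(audio_name, rows):
    """Return the labels for a name such as '1001' or 'clips/1001.wav', else None."""
    key = Path(audio_name).stem  # drop any directory and the file extension
    for row in rows:
        if row["fname"] == key:
            return row["labels"].split(",")
    return None


# Made-up rows standing in for the ground truth CSV (assumed columns).
SAMPLE = io.StringIO(
    'fname,labels\n'
    '1001,"Bark,Dog,Animal"\n'
    '1003,"Guitar,Music"\n'
)
rows = list(csv.DictReader(SAMPLE))

print(labels_for("clips/1001.wav", rows))  # ['Bark', 'Dog', 'Animal']
print(labels_for("4242", rows))            # None
```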

Owner

sweemeng