Implementing a simplified copy of the Shazam application from scratch using MinHashing and LSH.

Last update: Nov 17, 2022

Building Shazam from scratch

In this repository we implement a simplified copy of the Shazam application, able to tell you the name of a song after listening to a short sample.

Overview

Converting the songs from mp3 to wav with Librosa and extracting the peaks
MinHashing with permutations on the shingle matrix
Locality-sensitive hashing to divide the songs into buckets
Shazam!
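
The MinHashing and LSH steps above can be sketched roughly as follows. This is a minimal illustration, not the code in function.py: the function names, the toy shingle sets, the number of permutations and the number of bands are all assumptions made for the example, and each song is assumed to have at least one shingle.

```python
import random
from collections import defaultdict


def minhash_signatures(song_shingles, n_shingles, n_perm=20, seed=0):
    """Build one MinHash signature per song using explicit row permutations.

    song_shingles maps a song name to the set of shingle ids (rows of the
    shingle matrix) present in that song. Hypothetical helper.
    """
    rng = random.Random(seed)
    signatures = {name: [] for name in song_shingles}
    for _ in range(n_perm):
        # perm[row] is the new position of a shingle row after shuffling.
        perm = list(range(n_shingles))
        rng.shuffle(perm)
        for name, shingles in song_shingles.items():
            # Signature entry: smallest permuted position among present rows.
            signatures[name].append(min(perm[s] for s in shingles))
    return signatures


def lsh_buckets(signatures, bands=5):
    """Split each signature into bands and group songs that share a band."""
    buckets = defaultdict(set)
    for name, sig in signatures.items():
        rows = len(sig) // bands
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(name)
    return buckets


def candidates(query_sig, buckets, bands=5):
    """Return the songs that share at least one band with the query."""
    rows = len(query_sig) // bands
    found = set()
    for b in range(bands):
        key = (b, tuple(query_sig[b * rows:(b + 1) * rows]))
        found |= buckets.get(key, set())
    return found


# Toy data: two songs with disjoint shingle sets out of 10 possible shingles.
songs = {"song_a": {0, 1, 2, 3}, "song_b": {6, 7, 8, 9}}
sigs = minhash_signatures(songs, n_shingles=10)
buckets = lsh_buckets(sigs)

# The query must reuse the same seed so that it sees the same permutations.
query = minhash_signatures({"sample": {0, 1, 2, 3}}, n_shingles=10)["sample"]
matches = candidates(query, buckets)
print(matches)
```

Because the sample has exactly the shingles of song_a, it falls into the same buckets as song_a, while song_b, which shares no shingle with it, is never a candidate. With real recordings the sample only partly overlaps a song, so more bands with fewer rows each make a match more likely at the cost of more false candidates.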

Contents

pickle is a folder that contains the song peaks, the shingles array and the shingle matrix in pickle format.
ShazamLSH.ipynb is the main notebook; it contains only the explanation of the steps and some comments.
function.py contains all the implemented functions needed to execute the notebook.
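
As a rough illustration of how precomputed objects such as the peaks can be stored and reloaded in pickle format, the sketch below writes a toy dictionary to disk and reads it back. The file name peaks.pkl, the dictionary layout and the use of a temporary folder are assumptions for the example, not the repository's actual files.

```python
import pickle
import tempfile
from pathlib import Path

# Toy stand-in for the extracted peaks: song name -> list of peak positions.
peaks = {"song_a": [12, 57, 103], "song_b": [8, 44, 91]}

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "peaks.pkl"  # hypothetical file name

    # Save the object once so the slow extraction step need not be repeated.
    with open(path, "wb") as f:
        pickle.dump(peaks, f)

    # Reload it later, for example at the top of the notebook.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == peaks)
```

Pickle files should only be loaded when they come from a trusted source, since unpickling can execute arbitrary code.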

Resources

This is the dataset we used and processed:

https://www.kaggle.com/dhrumil140396/mp3s32k

We also share some useful links that can help to understand the process behind MinHashing and LSH for recognising songs:

Owner

Arturo Ghinassi