Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning

Overview

Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning

My MSc Computer Science research project, titled 'Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning'. The findings and thesis conducted from this research can be found here.

Audio side-channel attacks are increasingly becoming a security concern regarding ‘keystroke snooping’, in which an attack can utilise the emanation of a keystroke to predict a specific key (or contextual passage of keys) being pressed. This can potentially be used to gather a users’ private data if keystroke audio is able to be discretely captured.

In this project, Python code has been created to analayse the acoustic emanation and geometric features of a keystroke signal, and this information is used to provide enough information to accurately classify keystroke emanations. A combination of MFCC and TDoA features are shown to provide superior classification results when compared to other input features.

A novel attack is presented which utilises cross-prediction techniques on a stereo array of microphones to increase keystroke recognition accuracy. Cross-predictions increase singular character recovery of keystrokes by 7% when using a supervised Random Forest machine learning model. A Random Forest classifier is able to achieve up to 89% inter-dataset single-character recovery from a 40-key classification problem.

User experiments are also conducted to show the model in real-world scenarios. In the experiments, up to 85% keystroke recovery from contextual arguments were achieved from a 26-key classification problem using a Random Forest classifier. Keystroke recovery can increase by as much as 15% when utilsiing cross-prediction methods on contextual sentences. Contextual arguments were best predicted when using a user-created database of keystroke emanations.

It is shown in this research that different users emit distinct sonic fingerprints when typing on the same keyboard. Provided that a database of labelled keystrokes can be collected from a user, a supervised attack remains feasible in real-world scenarios.

The Fuzzy Labs guide to the universe of open source MLOps

Open Source MLOps This is the Fuzzy Labs guide to the universe of free and open source MLOps tools. Contents What is MLOps, anyway? Data version contr

Fuzzy Labs 352 Dec 29, 2022
A collection of machine learning examples and tutorials.

machine_learning_examples A collection of machine learning examples and tutorials.

LazyProgrammer.me 7.1k Jan 01, 2023
Programming assignments and quizzes from all courses within the Machine Learning Engineering for Production (MLOps) specialization offered by deeplearning.ai

Machine Learning Engineering for Production (MLOps) Specialization on Coursera (offered by deeplearning.ai) Programming assignments from all courses i

Aman Chadha 173 Jan 05, 2023
Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining

**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.** S

Sebastian Raschka 4k Dec 30, 2022
Free MLOps course from DataTalks.Club

MLOps Zoomcamp Our MLOps Zoomcamp course Sign up here: https://airtable.com/shrCb8y6eTbPKwSTL (it's not automated, you will not receive an email immed

DataTalksClub 4.6k Dec 31, 2022
Given the names and grades for each student in a class N of students, store them in a nested list and print the name(s) of any student(s) having the second lowest grade.

Hackerank-Nested-List Given the names and grades for each student in a class N of students, store them in a nested list and print the name(s) of any s

Sangeeth Mathew John 2 Dec 14, 2021
Built on python (Mathematical straight fit line coordinates error predictor machine learning foundational model)

Sum-Square_Error-Business-Analytical-Tool- Built on python (Mathematical straight fit line coordinates error predictor machine learning foundational m

om Podey 1 Dec 03, 2021
Machine learning that just works, for effortless production applications

Machine learning that just works, for effortless production applications

Elisha Yadgaran 16 Sep 02, 2022
Distributed deep learning on Hadoop and Spark clusters.

Note: we're lovingly marking this project as Archived since we're no longer supporting it. You are welcome to read the code and fork your own version

Yahoo 1.3k Dec 28, 2022
Open-Source CI/CD platform for ML teams. Deliver ML products, better & faster. ⚡️🧑‍🔧

Deliver ML products, better & faster Giskard is an Open-Source CI/CD platform for ML teams. Inspect ML models visually from your Python notebook 📗 Re

Giskard 335 Jan 04, 2023
Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

Little Ball of Fur is a graph sampling extension library for Python. Please look at the Documentation, relevant Paper, Promo video and External Resour

Benedek Rozemberczki 619 Dec 14, 2022
Python package for causal inference using Bayesian structural time-series models.

Python Causal Impact Causal inference using Bayesian structural time-series models. This package aims at defining a python equivalent of the R CausalI

Thomas Cassou 219 Dec 11, 2022
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile

matrixprofile-ts matrixprofile-ts is a Python 2 and 3 library for evaluating time series data using the Matrix Profile algorithms developed by the Keo

Target 696 Dec 26, 2022
High performance Python GLMs with all the features!

High performance Python GLMs with all the features!

QuantCo 200 Dec 14, 2022
ML-powered Loan-Marketer Customer Filtering Engine

In Loan-Marketing business employees are required to call the user's to buy loans of several fields and in several magnitudes. If employees are calling everybody in the network it is also very length

Sagnik Roy 13 Jul 02, 2022
moDel Agnostic Language for Exploration and eXplanation

moDel Agnostic Language for Exploration and eXplanation Overview Unverified black box model is the path to the failure. Opaqueness leads to distrust.

Model Oriented 1.2k Jan 04, 2023
Predict the output which should give a fair idea about the chances of admission for a student for a particular university

Predict the output which should give a fair idea about the chances of admission for a student for a particular university.

ArvindSandhu 1 Jan 11, 2022
Book Item Based Collaborative Filtering

Book-Item-Based-Collaborative-Filtering Collaborative filtering methods are used

Şebnem 3 Jan 06, 2022
Tangram makes it easy for programmers to train, deploy, and monitor machine learning models.

Tangram Website | Discord Tangram makes it easy for programmers to train, deploy, and monitor machine learning models. Run tangram train to train a mo

Tangram 1.4k Jan 05, 2023
Continuously evaluated, functional, incremental, time-series forecasting

timemachines Autonomous, univariate, k-step ahead time-series forecasting functions assigned Elo ratings You can: Use some of the functionality of a s

Peter Cotton 343 Jan 04, 2023