FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data

Last update: Sep 06, 2022

Overview

Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.

Authors

Andrew Wang, University of Cambridge, Cambridge, UK
Pierre Houdouin, CentraleSupélec, Paris, France

Installation

pip install -i https://test.pypi.org/simple/ femda

Get started

>>> from sklearn.datasets import load_iris
>>> from femda import FEMDA
>>> X, y = load_iris(return_X_y=True)
>>> clf = FEMDA()
>>> clf.fit(X, y)
FEMDA()
>>> clf.score(X, y)
0.9666666666666667

Using a specific dataset...

>>> import femda.experiments.preprocessing as pre
>>> X_train, y_train, X_test, y_test = pre.statlog(r"root\datasets\\")
>>> FEMDA().fit(X_train, y_train).score(X_test, y_test)
...

Using a sklearn.pipeline.Pipeline...

>>> from sklearn.datasets import load_digits
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.decomposition import PCA
>>> X, y = load_digits(return_X_y=True)
>>> pipe = make_pipeline(PCA(n_components=5), FEMDA()).fit(X, y)
>>> pipe.predict(X)
...

Run all experiments presented in the paper

>>> from femda.experiments import run_experiments
>>> run_experiments()
...

See for more.

Abstract

Linear and Quadratic Discriminant Analysis are well-known classical methods, but they degrade heavily when class distributions are non-Gaussian and are not robust to contaminated datasets. In this paper, we present a new discriminant-analysis-style classification algorithm that directly models noise and diverse distribution shapes, and can therefore deal with a wide range of datasets.

Each data point is modelled by its own arbitrary Elliptically Symmetrical (ES) distribution and its own arbitrary scale parameter, which directly models very heterogeneous, non-i.i.d. datasets. We show that maximum-likelihood parameter estimation and classification are simple and fast under this model.
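To make the per-point scale idea concrete, here is a minimal sketch of a classification rule that follows from it. It is not code from the femda package. It assumes that each point's density shape is the same whichever class is tested, so that maximising the likelihood over that point's free scale leaves a class score of 0.5*log|Sigma_k| + (m/2)*log(d_k^2) to minimise, where d_k^2 is the squared Mahalanobis distance to class k and m is the dimension. The function names `fit_class_params` and `predict_scale_free` are hypothetical, and the plain sample mean and covariance stand in for whatever estimators the paper uses.

```python
import numpy as np

def fit_class_params(X, y):
    """Per-class sample mean and sample covariance.
    Assumption: plain plug-in estimates, used here only for illustration."""
    params = {}
    for label in np.unique(y):
        Xk = X[y == label]
        params[label] = (Xk.mean(axis=0), np.cov(Xk, rowvar=False))
    return params

def predict_scale_free(X, params, eps=1e-12):
    """Assign each row to the class with the smallest
    0.5*log|Sigma_k| + (m/2)*log(d_k^2)."""
    labels = list(params)
    m = X.shape[1]
    scores = np.empty((X.shape[0], len(labels)))
    for j, label in enumerate(labels):
        mu, sigma = params[label]
        diff = X - mu
        # Squared Mahalanobis distance of every row to this class.
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff)
        _, logdet = np.linalg.slogdet(sigma)
        # eps guards against log(0) when a point sits exactly on the mean.
        scores[:, j] = 0.5 * logdet + 0.5 * m * np.log(d2 + eps)
    return np.asarray(labels)[scores.argmin(axis=1)]

# Toy data: two Gaussian blobs in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.repeat([0, 1], 50)

params = fit_class_params(X, y)
class_means = np.vstack([params[0][0], params[1][0]])
pred_at_means = predict_scale_free(class_means, params)
pred_all = predict_scale_free(X, params)
```

Because the distance enters only through its logarithm, a single point with a very large scale changes the score far less than it would under the quadratic term used by QDA, which is one intuition for the robustness claimed above.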

We highlight the flexibility of the model to a wide range of Elliptically Symmetrical distribution shapes and varying levels of contamination in synthetic datasets. Then, we show that our algorithm outperforms other robust methods on contaminated datasets from Computer Vision and NLP.
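As an illustration of what a heterogeneous, contaminated synthetic dataset can look like, the sketch below draws Gaussian samples, multiplies each one by its own random scale (a scale mixture of Gaussians, which is elliptically symmetrical), and then overwrites a chosen fraction of rows with uniform noise. This is an assumed setup for illustration only, not the protocol used in the paper: the function name `sample_heterogeneous`, the gamma-distributed scales and the uniform noise range are all hypothetical choices.

```python
import numpy as np

def sample_heterogeneous(n, mean, cov, contamination=0.1, seed=0):
    """Draw n points around `mean` with scatter `cov`, each with its own
    random scale, then replace a fraction of them with uniform noise."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    m = mean.size
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))

    z = rng.standard_normal((n, m))
    # One scale per sample (hypothetical choice of distribution).
    tau = rng.gamma(shape=1.0, scale=1.0, size=n)
    X = mean + np.sqrt(tau)[:, None] * (z @ chol.T)

    # Contaminate a fixed number of rows with noise unrelated to the class.
    n_bad = int(round(contamination * n))
    bad = rng.choice(n, size=n_bad, replace=False)
    X[bad] = rng.uniform(-10, 10, size=(n_bad, m))

    is_contaminated = np.zeros(n, dtype=bool)
    is_contaminated[bad] = True
    return X, is_contaminated

X_syn, mask = sample_heterogeneous(200, mean=[0.0, 0.0, 0.0], cov=np.eye(3),
                                   contamination=0.1)
```

Sweeping `contamination` from 0 upward and scoring a classifier on the uncontaminated rows is one simple way to compare how quickly different methods degrade.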
