The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

Last update: Jan 03, 2023

Overview

This repository contains the software implementation of most algorithms used or developed in my research. The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory.

Additionally, contributions at the algorithm level are available in the package mlresearch.

Installation

Python 3.8 or 3.9 is required to run this project. Because of the computational limits of the free tiers of CI/CD platforms, compatibility with earlier Python versions cannot currently be ensured.
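
As a quick sanity check before installing, the version requirement above can be tested from Python. This is only a minimal sketch: the helper name `is_supported_python` is hypothetical and is not part of ML-Research.

```python
import sys

# Versions named in the requirement above (3.8 and 3.9).
SUPPORTED_VERSIONS = {(3, 8), (3, 9)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is one of the listed versions.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    status = "listed as supported" if is_supported_python() else "not listed as supported"
    print(f"Python {current} is {status}")
```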

ML-Research requires:

numpy (>= 1.14.6)
pandas (>= 1.3.5)
sklearn (>= 1.0.0)
imblearn (>= 0.8.0)
rich (>= 10.16.1)
matplotlib (>= 2.2.3)
seaborn (>= 0.9.0)
rlearn (>= 0.2.1)
pytorch (>= 1.10.1)
torchvision (>= 0.11.2)
pytorch_lightning (>= 1.5.8)
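
To see whether an existing environment already satisfies some of these minimums, a check along the following lines may help. It is a sketch under stated assumptions: the dictionary uses PyPI distribution names (`scikit-learn`, `imbalanced-learn`) where the list above uses import-style names, only a few entries are copied over, and the helper names are hypothetical rather than part of ML-Research.

```python
from importlib import metadata

# A few minimums copied from the list above. The keys are PyPI distribution
# names (an assumption; the list above uses names such as "sklearn").
MINIMUM_VERSIONS = {
    "numpy": "1.14.6",
    "pandas": "1.3.5",
    "scikit-learn": "1.0.0",
    "imbalanced-learn": "0.8.0",
}


def version_tuple(version):
    """Turn '1.3.5' into (1, 3, 5), ignoring suffixes such as 'rc1' or '+cpu'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad '2.0' to (2, 0, 0)
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings on their numeric major.minor.patch parts."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Map each distribution name to 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

The comparison is deliberately simple and ignores pre-release ordering; pip resolves these constraints itself during installation.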

User Installation

If you already have a working installation of numpy and scipy, the easiest way to install ML-Research is using pip:

pip install -U ml-research

The documentation includes more detailed installation instructions.
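
After installing, one way to confirm that the package can be found is to look it up by its import name. The sketch below assumes the import name is `mlresearch`, as mentioned in the Overview; the helper `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name="mlresearch"):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed():
        print("mlresearch is available")
    else:
        print("mlresearch was not found; try: pip install -U ml-research")
```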

Installing from source

The following commands should allow you to set up the development version of the project with minimal effort:

# Clone the project.
git clone https://github.com/joaopfonseca/ml-research.git
cd ml-research

# Create and activate an environment 
make environment 
conda activate mlresearch # Adapt this line accordingly if you're not running conda

# Install project requirements and the research package
pip install ".[tests,docs]"

Citing ML-Research

If you use ML-Research in a scientific publication, we would appreciate citations to the following paper:

@article{Fonseca2021,
  doi = {10.3390/RS13132619},
  url = {https://doi.org/10.3390/RS13132619},
  keywords = {SMOTE,active learning,artificial data generation,land use/land cover classification,oversampling},
  year = {2021},
  month = {jul},
  publisher = {Multidisciplinary Digital Publishing Institute},
  volume = {13},
  pages = {2619},
  author = {Fonseca, Joao and Douzas, Georgios and Bacao, Fernando},
  title = {{Increasing the Effectiveness of Active Learning: Introducing Artificial Data Generation in Active Learning for Land Use/Land Cover Classification}},
  journal = {Remote Sensing}
}


Comments

Consider modifying default BYOL hyper-parameters for smaller batch sizes

Applicable to both BYOL and SimSiam: Some hyperparameters might need to be added. Some are hard-coded to the default values.

Taken from the BYOL paper:

opened by joaopfonseca 1

Remove computer vision models, augmentations and datasets

They will be removed in the next release because:

I'm not going to use these methods anytime soon and I don't have the time to test them properly.

They are out of scope for the library, which is meant for machine learning techniques focused on tabular data. In the future, it may be worth developing a separate library for computer vision, for example.

Making PyTorch a dependency for a small part of the library isn't particularly efficient.

wontfix

opened by joaopfonseca 0

Host all raw data from datasets submodule elsewhere

With Python 3.11, downloading some datasets fails with an SSL error ("unsafe legacy renegotiation disabled"). It happens when the server doesn't support RFC 5746 secure renegotiation and the client uses OpenSSL 3, which enforces that standard by default.

Hosting the raw data elsewhere should fix this issue.
bug

opened by joaopfonseca 0
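
Until the data is hosted elsewhere, a commonly cited client-side workaround for the SSL error in the issue above is to allow legacy server connections on the SSL context. The sketch below is not taken from ML-Research: `legacy_ssl_context` and `download` are hypothetical helpers, and enabling this option weakens the protection that OpenSSL 3 applies by default, so it should only be used with servers you trust.

```python
import ssl
import urllib.request

# Numeric value of OpenSSL's SSL_OP_LEGACY_SERVER_CONNECT flag. The ssl module
# only exposes it by name on newer Python versions, hence the fallback.
LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def legacy_ssl_context():
    """Default context (certificates still verified) that also accepts
    servers without RFC 5746 secure renegotiation support."""
    context = ssl.create_default_context()
    context.options |= LEGACY_SERVER_CONNECT
    return context


def download(url, timeout=30):
    """Fetch the raw bytes at `url` using the relaxed context."""
    with urllib.request.urlopen(
        url, context=legacy_ssl_context(), timeout=timeout
    ) as response:
        return response.read()
```

Certificate verification stays enabled; only the renegotiation requirement is relaxed.
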
Review and add examples to documentation

The readthedocs page is getting a bit outdated:

[x] Add support for Python 3.10

[ ] Add support for Python 3.11

[ ] Check for missing, deleted or renamed functions and objects

[ ] Review content as a whole

[ ] Add examples to documentation

[ ] Add dependency groups to documentation

[ ] README contains dependencies that will no longer be used

documentation
opened by joaopfonseca 0

Releases (v0.4a2)

v0.4a2 (Jan 2, 2023)

NOTE: This pre-release contains implementations of algorithms for self-supervised learning (BYOL and SimSiam). It also contains objects to download image data from PyTorch and general definitions for image augmentations. They will be removed in the next release because:

I'm not going to use these methods anytime soon and I don't have the time to test them properly.

They are out of scope for the library, which is meant for machine learning techniques focused on tabular data. In the future, it may be worth developing a separate library for computer vision, for example.

Making PyTorch a dependency for a small part of the library isn't particularly efficient.

Full Changelog: https://github.com/joaopfonseca/ml-research/compare/v0.4a1...v0.4a2

v0.4a1 (Apr 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research/compare/0.1.0...v0.4a1

v0.3.4 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/compare/v0.3.3...v0.3.4

v0.3.3 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/compare/v0.3.2...v0.3.3

v0.3.2 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/compare/v0.3.1...v0.3.2

v0.3.1 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/compare/v0.3.0...v0.3.1

v0.3.0 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/compare/v0.2.1...v0.3.0

v0.2.1 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/compare/v0.2.0...v0.2.1

v0.2.0 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/compare/0.1.0...v0.2.0

0.1.0 (Feb 14, 2022)

Full Changelog: https://github.com/joaopfonseca/ml-research-backup/commits/0.1.0

Owner

João Fonseca

PhD student | Researcher | Invited lecturer @ NOVA Information Management School
