BanditPAM: Almost Linear-Time k-Medoids Clustering

Overview

This repo contains a high-performance implementation of BanditPAM from BanditPAM: Almost Linear-Time k-Medoids Clustering. The code can be called directly from Python or C++.

If you use this software, please cite:

Mo Tiwari, Martin Jinye Zhang, James Mayclin, Sebastian Thrun, Chris Piech, Ilan Shomorony. "BanditPAM: Almost Linear Time k-medoids Clustering via Multi-Armed Bandits" Advances in Neural Information Processing Systems (NeurIPS) 2020.

@inproceedings{BanditPAM,
  title={BanditPAM: Almost Linear Time k-medoids Clustering via Multi-Armed Bandits},
  author={Tiwari, Mo and Zhang, Martin J and Mayclin, James and Thrun, Sebastian and Piech, Chris and Shomorony, Ilan},
  booktitle={Advances in Neural Information Processing Systems},
  pages={368--374},
  year={2020}
}

Requirements

TL;DR run pip3 install banditpam and jump to the examples.

If you run into any issues, please see the documentation below and file a GitHub issue if you still have trouble.

Python Quickstart

Install the repo and its dependencies:

This can be done either through PyPI (recommended)

/BanditPAM/: pip install -r requirements.txt
/BanditPAM/: pip install banditpam

OR through the source code via

/BanditPAM/: git submodule update --init --recursive
/BanditPAM/: cd headers/carma
/BanditPAM/: mkdir build && cd build && cmake .. && make && sudo make install
/BanditPAM/: cd ../../..
/BanditPAM/: pip install -r requirements.txt
/BanditPAM/: sudo pip install .
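
After either installation path, a quick sanity check confirms the package imports and runs (a minimal sketch; the fit signature follows the examples below, and the logfile name is arbitrary):

import numpy as np
from banditpam import KMedoids

X = np.random.randn(100, 2)                           # tiny random dataset
kmed = KMedoids(n_medoids=2, algorithm="BanditPAM")
kmed.fit(X, "L2", "sanity_log")                       # writes results to sanity_log
print(kmed.medoids)                                   # indices of the chosen medoids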

Example 1: Synthetic data from a Gaussian Mixture Model

from banditpam import KMedoids
import numpy as np
import matplotlib.pyplot as plt

# Generate data from a Gaussian Mixture Model with the given means:
np.random.seed(0)
n_per_cluster = 40
means = np.array([[0,0], [-5,5], [5,5]])
X = np.vstack([np.random.randn(n_per_cluster, 2) + mu for mu in means])

# Fit the data with BanditPAM:
kmed = KMedoids(n_medoids = 3, algorithm = "BanditPAM")
# Writes results to gmm_log
kmed.fit(X, 'L2', "gmm_log")

# Visualize the data and the medoids:
for p_idx, point in enumerate(X):
    if p_idx in map(int, kmed.medoids):
        plt.scatter(X[p_idx, 0], X[p_idx, 1], color='red', s = 40)
    else:
        plt.scatter(X[p_idx, 0], X[p_idx, 1], color='blue', s = 10)

plt.show()

[Figure: the three synthetic clusters, with the computed medoids highlighted in red]
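
The plot above only needs the medoid indices, but full cluster assignments can also be recovered with plain NumPy (a sketch; it assumes only kmed.medoids from the example above):

# Assign each point to its closest medoid under the same L2 distance.
medoid_indices = np.array(sorted(map(int, kmed.medoids)))
dists = np.linalg.norm(X[:, None, :] - X[medoid_indices][None, :, :], axis=-1)
labels = dists.argmin(axis=1)          # cluster label of each point
print(labels[:10])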

Example 2: MNIST and its medoids visualized via t-SNE

# Start in the repository root directory, i.e. '/BanditPAM/'.
from banditpam import KMedoids
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Load the 1000-point subset of MNIST and calculate its t-SNE embeddings for visualization:
X = pd.read_csv('data/MNIST-1k.csv', sep=' ', header=None).to_numpy()
X_tsne = TSNE(n_components = 2).fit_transform(X)

# Fit the data with BanditPAM:
kmed = KMedoids(n_medoids = 10, algorithm = "BanditPAM")
kmed.fit(X, 'L2', "mnist_log")

# Visualize the data and the medoids via t-SNE:
for p_idx, point in enumerate(X):
    if p_idx in map(int, kmed.medoids):
        plt.scatter(X_tsne[p_idx, 0], X_tsne[p_idx, 1], color='red', s = 40)
    else:
        plt.scatter(X_tsne[p_idx, 0], X_tsne[p_idx, 1], color='blue', s = 5)

plt.show()

The corresponding logfile for this run, mnist_log, will contain the run's results and additional statistics in a format that can easily be parsed as JSON.
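
For example, assuming the logfile contents are valid JSON (as the description above suggests), they can be loaded with the standard library; the exact keys depend on the run:

import json

with open("mnist_log") as f:
    stats = json.load(f)

print(stats.keys())   # the run's results and additional statistics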

Documentation

Documentation for BanditPAM can be found here:

  • Doxygen docs: BanditPAM/docs/html/index.html

Building the C++ executable from source

Please note that it is NOT necessary to build the C++ executable from source to use the Python code above. However, if you would like to use the C++ executable directly, follow the instructions below.

Option 1: Building with Docker

We highly recommend building using Docker. One can download and install Docker by following instructions at the Docker install page. Once you have Docker installed and the Docker Daemon is running, run the following commands:

/BanditPAM$ chmod +x env_setup.sh
/BanditPAM$ ./env_setup.sh
/BanditPAM$ ./run_docker.sh

which will start a Docker instance with the necessary dependencies. Then:

/BanditPAM$ mkdir build && cd build
/BanditPAM/build$ cmake .. && make

This will create an executable named BanditPAM in BanditPAM/build/src.

Option 2: Installing Requirements and Building Directly

Building this repository requires a few external dependencies.

If installing these requirements from source, one can generally use the following procedure to install each one from the library's root folder (CARMA is used as an example here):

/BanditPAM$ cd headers/carma
/BanditPAM/headers/carma$ mkdir build && cd build
/BanditPAM/headers/carma/build$ cmake .. && make && sudo make install

Further installation information for MacOS, Linux, and Windows is available in the docs folder. Ensure all the requirements above are installed and then run:

/BanditPAM$ mkdir build && cd build
/BanditPAM/build$ cmake .. && make

This will create an executable named BanditPAM in BanditPAM/build/src.

C++ Usage

Once the executable has been built, it can be invoked with:

/BanditPAM/build/src/BanditPAM -f [path/to/input.csv] -k [number of clusters] -v [verbosity level]
  • -f is mandatory and specifies the path to the dataset
  • -k is mandatory and specifies the number of clusters with which to fit the data
  • -v is optional and specifies the verbosity level.

For example, if you ran ./env_setup.sh and downloaded the MNIST dataset, you could run:

/BanditPAM/build/src/BanditPAM -f ../data/MNIST-1k.csv -k 10 -v 1

The expected output in the command line will be:

Medoids: 694,168,306,714,324,959,527,251,800,737

A file called KMedoidsLogfile containing detailed logs from the run will also be written.

Implementing a custom distance metric

One of the advantages of k-medoids is that it works with arbitrary distance metrics; in fact, your "metric" need not even be a real metric -- it can be negative, asymmetric, and/or not satisfy the triangle inequality or homogeneity. Any pairwise dissimilarity function works with k-medoids!

This also allows for clustering of "exotic" objects like trees, graphs, natural language, and more -- settings where running k-means wouldn't even make sense. We talk about one such setting in the original paper.

The package currently supports a number of distance metrics, including all Lp losses and cosine distance.
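
For example, the loss is selected via the string passed to fit; the names below are the ones used elsewhere in this document (a minimal sketch, not an exhaustive list):

from banditpam import KMedoids
import numpy as np

X = np.random.randn(300, 5)

for loss in ["L2", "manhattan", "cos", "inf", "L15"]:
    kmed = KMedoids(n_medoids=3, algorithm="BanditPAM")
    kmed.fit(X, loss, f"{loss}_log")
    print(loss, kmed.medoids)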

If you're willing to write a little C++, you only need to add a few lines to kmedoids_algorithm.cpp and kmedoids_algorithm.hpp to implement your distance metric / pairwise dissimilarity!

Then, be sure to re-install the repository with a pip install . (note the trailing .).

The maintainers of this repository are working on permitting arbitrary dissimilarity metrics that users write in Python, as well; see #4.

Testing

To run the full suite of tests, run in the root directory:

/BanditPAM$ python -m unittest discover -s tests

Alternatively, to run a smaller set of tests, run python tests/test_commit.py from the main repo folder, or run python tests/test_push.py for a longer, more intensive set of tests.

Reproducing Figures from the Paper

Note that some figures in the original paper were generated using the Python code at https://github.com/motiwari/BanditPAM-python. That code is not pretty, nor is it maintained. It only exists for reference and for reproducibility of the plots.

Credits

Mo Tiwari wrote the original Python implementation of BanditPAM and many features of the C++ implementation. Mo now maintains the C++ implementation.

James Mayclin developed the initial C++ implementation of BanditPAM.

The original BanditPAM paper was published by Mo Tiwari, Martin Jinye Zhang, James Mayclin, Sebastian Thrun, Chris Piech, and Ilan Shomorony.

We would like to thank Jerry Quinn, David Durst, Geet Sethi, and Max Horton for helpful guidance regarding the C++ implementation.

Comments
  • Error during installation of BanditPAM - Unsupported compiler -- at least C++11 support is needed!

    I ran the command pip3 install banditpam, but ran into the following error:

    File "/private/var/folders/k2/_w3zmb555fj_k1x7mtg_q6400000gn/T/pip-install-is81nqkv/banditpam_04dfedfe2ee2481a932e35a91a31d0fe/setup.py", line 239, in build_extensions opts.append(cpp_flag(self.compiler)) File "/private/var/folders/k2/_w3zmb555fj_k1x7mtg_q6400000gn/T/pip-install-is81nqkv/banditpam_04dfedfe2ee2481a932e35a91a31d0fe/setup.py", line 88, in cpp_flag raise RuntimeError("Unsupported compiler -- at least C++11 support is needed!") RuntimeError: Unsupported compiler -- at least C++11 support is needed!

    Any thoughts on how to proceed? Tried following some of the suggestions from here but with no luck.

    opened by sterlingalic 10
  • Issue with loading data from numpy array?

    Hi,

    As part of my thesis, I have been implementing the algorithms from this repo from scratch, and I found a discrepancy between the results my code produced and the reference (this repo's Python wrapper).

    I debugged the issue for a bit and found that the data matrix doesn't match the passed numpy array. This can be seen by printing data.col(0) after the transpose on line 262. We would expect it to return the same point as X[0] (where X is the numpy array), but instead we get something else. Weirdly enough, the y-coordinate matches, but the x-coordinate is different (and does not appear anywhere in the input data).

    To reproduce, insert the following after line 262:

    printf("[C++   ] :: X[0] = [%f %f]\n", data.col(0)[0], data.col(0)[1]);
    

    then recompile the BanditPAM dependency and run the following Python script:

    import numpy as np
    from math import dist
    from BanditPAM import KMedoids
    
    np.random.seed(0)
    
    means = np.array([[0,0], [-5,5], [5,5]])
    X = np.vstack([np.random.randn(2**7, 2) + µ for µ in means])
    
    kmed = KMedoids(n_medoids=3, algorithm="naive", verbosity=0)
    kmed.fit(X, "L2", 3, "")
    
    print(f"[Python] :: X[0] = {X[0]}")
    

    This gives the following output:

    [C++   ] :: X[0] = [0.000000 0.400157]
    [Python] :: X[0] = [1.76405235 0.40015721]
    

    We would expect them to be equal, but they are not. I have not dug deep enough to figure out why this is, though I suspect it is an issue with the C++/Python interfacing libraries. Also possible is that my dependencies are somehow messed up, though that seems unlikely as the issue persists in the BanditPAM version from PyPI, as confirmed by the resulting medoids being identical.

    opened by DarioSucic 6
  • Naive is not PAM

    https://github.com/ThrunGroup/BanditPAM/blob/3567dd2d49aadc4744710fb74069b1dda5a93730/src/kmedoids_ucb.cpp#L420-L434

    The current code is O(N²k²) whereas the original PAM is only O(N²k) by computing the change in loss instead of recomputing the entire loss every time. The change can be computed efficiently when caching the distance to the nearest as well as second nearest medoid. Only for the 'winning' solution the second nearest needs to be updated at the end.
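
    For reference, a minimal NumPy sketch (not this repository's code) of the O(N)-per-swap loss delta described above, using cached distances to the nearest and second-nearest medoids:

    import numpy as np

    def swap_delta(D, nearest_dist, second_dist, nearest_medoid, m, c):
        """Change in total loss when medoid point m is swapped out for candidate point c.
        D: (N, N) dissimilarities; nearest_dist / second_dist: distance of each point to its
        nearest / second-nearest medoid; nearest_medoid: point index of each point's nearest medoid."""
        d_c = D[:, c]                          # distance of every point to the candidate
        loses_medoid = nearest_medoid == m     # points currently assigned to the removed medoid
        delta = np.where(
            loses_medoid,
            np.minimum(d_c, second_dist) - nearest_dist,  # reassign to candidate or second-nearest
            np.minimum(d_c - nearest_dist, 0.0),          # other points switch only if they improve
        )
        return float(delta.sum())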

    bug 
    opened by kno10 5
  • Cannot install BanditPAM in Paperspace Gradient

    Hi!

    I cannot seem to install BanditPAM in JupyterLab on a Paperspace Gradient compute instance. I have attached a .txt file with this issue since the error message is way too large.

    Thanks in advance for looking into it. Also thanks for the amazing project you are curating! banditpam_error.txt

    opened by tanweer-mahdi 4
  • remove redundant medoids number k in Python fit method. Fixes #48

    The argument k in the Python fit method is redundant since we have already specified the number of medoids when instantiating the KMedoids class. We therefore remove this argument from the fitPython() method in the KMedsWrapper class. The Python examples in the repo were re-tested with the updated code and the results are as follows:

    [Attached images: BanditPAM_GMM_new_code, BanditPAM_MNIST_new_code]

    The standard case ./BanditPAM -f ../../data/MNIST-1k.csv -k 10 -v 1 was also tested with different loss functions (manhattan, cos, inf, L15), and all of them give the expected results.

    opened by mailology 4
  • cosine similarity vs cosine distance

    Wanted to say thanks for the repo - this is a giant leap forward in scaling k-medoids.

    I noticed that one of the distance metrics available is cosine similarity - not cosine distance which is 1 - cosine similarity. My intuition tells me dist(me, me) should be zero, and not one.

    If using cosine similarity is intentional, I could put in a PR for cos_dist.
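
    For illustration (not this repository's code), the distinction between the two quantities:

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = a.copy()                                               # identical vectors

    cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))  # 1.0 for identical vectors
    cos_dist = 1.0 - cos_sim                                   # 0.0, i.e. dist(me, me) == 0
    print(cos_sim, cos_dist)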

    opened by tazitoo 3
  • pip installation on Google colab

    pip install on Google Colab does not work.

    !pip install banditpam
    Collecting banditpam
      Using cached banditpam-1.0.2.tar.gz (195 kB)
    WARNING: Discarding https://files.pythonhosted.org/packages/65/2f/e37b64df0af49afb507e5f9470665381909e00b78c88c73d1e380f83df8c/banditpam-1.0.2.tar.gz#sha256=4cb13dd99c05d3ab797224beba60cda46dee4d96c251ea0f1ad31491cc12339d (from https://pypi.org/simple/banditpam/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      Using cached banditpam-1.0.0.tar.gz (194 kB)
    WARNING: Discarding https://files.pythonhosted.org/packages/c5/9a/5c21e1ea5a8d1d034c437247e62b81183f10f8cb99c99802dcd097bdd063/banditpam-1.0.0.tar.gz#sha256=8dc3a3e7d2c53e73c5393b95c40a7b912f427dcc34d8b1c03634b7403d626f72 (from https://pypi.org/simple/banditpam/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
    ERROR: Could not find a version that satisfies the requirement banditpam (from versions: 1.0.0, 1.0.2)
    ERROR: No matching distribution found for banditpam
    
    opened by kno10 3
  • Emitting log file causes Python kernel crashing

    When testing the Python code on the MNIST data with the PAM algorithm, setting verbosity = 1 causes an issue with the kernel. In particular, the following code crashes the kernel.

    X = pd.read_csv('data/MNIST-1k.csv', sep=' ', header=None).to_numpy()
    X_tsne = TSNE(n_components = 2).fit_transform(X)
    
    kmed = KMedoids(n_medoids = 10, algorithm = "naive", verbosity = 1)
    kmed.fit(X, 'L2', 10, "naive_v1_mnist_log")
    

    The above code runs properly if verbosity = 1 is removed. If we change the algorithm to "BanditPAM", verbosity = 1 does not cause any issue and the log file is generated properly.

    opened by mailology 3
  • Much Slower than Scikit-Learn kMedoids on Large Dimensions

    Hello!

    We have a problem where we need to select a subset of thousand items from hundred items using k-Medoids clustering of embeddings (512 dim).

    We were using the k-Medoids implementation in Scikit-Learn. We recently tried BanditPAM, expecting it to be much faster, but that wasn't the case.

    We wanted to ask here to make sure before we eliminate it from our options. We are looking forward to your suggestions on what we may be doing wrong.

    Have a nice day.

    Sincerely, Kamer

    opened by kayuksel 2
  • Allow choosing the random seed (from python)

    The algorithm is introduced as randomized, but it appears to return the same results when run multiple times. As far as I can tell, this is because the random generator is not seeded; at least, I could not find an invocation of arma::set_seed_random. I would prefer a parameter that allows the (Python) user to set the seed in a reproducible way, i.e., an option in the function call that is then used to seed the RNG; if not set, it could default to seeding with the current time.

    P.S. Sorry for spamming you with so many issue tickets, but my impression is that this may suit your workflow and may help you keep track of such small TODOs.

    opened by kno10 2
  • error: ‘class arma::Mat<float>’ has no member named ‘n_alloc’

    Hello everyone

    I wanted to install BanditPAM on Windows first, but I was not able to succeed, so I tried it on Windows Subsystem for Linux. I do not have the LLVM problem anymore that I had on Windows but now I have the following error:

    error: ‘class arma::Mat<float>’ has no member named ‘n_alloc’

    I tried installing BanditPAM using pip install banditpam as well as cloning the repository and building the carma headers from source and installing BanditPAM locally. Does anyone know how to fix this? The error is in file headers/carma/include/carma_bits/numpytoarma.h on line 133. I guess it could be a problem with CARMA.

    Thank you in advance!

    Edit: I was able to solve it by changing the line 133 in the mentioned file as follows:

    Before: arma::access::rw(dest.n_alloc) = nelem

    After: arma::access::rw(dest.n_elem) = nelem

    I am not 100% sure this is correct, but it seems to be the only attribute of the matrix dest that makes sense (see the documentation). I will make a pull request in the CARMA repository. In the meantime, I hope this issue is useful for other people who encounter the same problem.

    opened by rolshoven 2
  • Easily compare the effects of cache and permutation with flags

    This pull request adds a new feature that allows the user to easily turn the use of the cache and the permutation on and off. To compare the effects of these options, the user can run scripts/experiment.py with different configurations.

    To install the package and run the default experiments in one go, please run the following command.

    /BanditPAM/: bash scripts/reproduce_results.sh
    

    If you want to manually experiment with different conditions, please run the following command after installing the requirements and package.

    /BanditPAM/: python scripts/experiment.py [options]
    

    If you don't pass any options, the script will run experiments with n_medoids=[5, 10] and n_data = [10000, 30000].

    Options

    -k, --n_medoids  int/string  default: [5, 10]
    -n, --n_data     int/string  default: [10000, 30000]
    

    Example : Run experiments with k=3 and n_data = [1000, 3000]
    (Make sure to put a list in double quotes)

    $ python scripts/experiment.py -k 3 -n "[1000, 3000]"
    
    Cache (X) Perm (X)            Cache (O) Perm (X)            Cache (O) Perm (O)            
    
    [mnist: 1000 | k: 3]
    0.535 (0.041)                 0.149 (0.003)                 0.146 (0.00565)               
    
    [mnist: 3000 | k: 3]
    1.71 (0.107)                  0.738 (0.0476)                0.743 (0.0518)     
    
    opened by lukeleeai 1
  • `useCacheP=True` & `usePerm=False` runs slower than BanditPAM with no caching when the dataset is large

    Dataset: 30k MNIST

    ---Cache: True Perm: True---
    1 / 3 : 89.09640216827393 seconds
    2 / 3 : 111.98075819015503 seconds
    3 / 3 : 99.45851635932922 seconds
    mean: 100.17855890591939 std: 9.356362668819777

    ---Cache: True Perm: False---
    1 / 3 : 165.01068472862244 seconds
    2 / 3 : 200.78851699829102 seconds
    3 / 3 : 178.79811787605286 seconds
    mean: 181.53243986765543 std: 14.73365100813906

    ---Cache: False Perm: False---
    592.7162899971008 seconds

    Dataset: 70k MNIST

    ---Cache: True Perm: True--- (CACHE: 5000)
    1 / 3 : 428.34665966033936 seconds
    2 / 3 : 384.3007571697235 seconds
    3 / 3 : 445.8992736339569 seconds
    mean: 419.5155634880066 std: 25.911200954443764

    ---Cache: True Perm: False--- (CACHE: 5000)
    3346.914297580719 seconds

    ---Cache: False Perm: False---
    1 / 3 : 1375.8598430156708 seconds
    2 / 3 : 1595.562647819519 seconds
    1 / 3 : 1595.562647819519 seconds
    2 / 3 : 1530.3325538635254 seconds
    3 / 3 : 1296.2060058116913 seconds
    mean: 1474.0337358315785 std: 128.53214244537276


    opened by lukeleeai 0
Releases (v3.0.4)
  • v3.0.4(Apr 22, 2022)

    BanditPAM v3.0.4 contains a few hotfixes:

    Organization and Functionality:

    • Fixes the computation of cosine distance (Fixes #182)
    • Removes the ability to call OpenMP functions omp_get_max_threads and omp_set_num_threads, which should resolve the remaining issues on M1 Macs (Fixes #167)

    Tests: No changes.

    Style: No changes.

    Documentation: No changes.

    Full Changelog: https://github.com/ThrunGroup/BanditPAM/compare/v3.0.3...v3.0.4

  • v3.0.3(Feb 7, 2022)

    This release contains BanditPAM v3.0.3. The update will be largely invisible to users, but enables building the Linux and Mac (including Apple Silicon/M1) wheels that are uploaded to PyPI.

    Organization and Functionality:

    • Building wheels automatically for Linux, Intel Mac, and M1 Mac and uploading them to PyPI via Github actions

    Tests:

    • None, other than verifying the changes in Organization and Functionality work via Github Actions

    Style:

    • Including newlines between steps of Github Actions

    Documentation:

    None

    Full Changelog: https://github.com/ThrunGroup/BanditPAM/compare/v3.0.2...v3.0.3

  • v3.0.2(Jan 18, 2022)

    BanditPAM v3.0.2 contains several bugfixes:

    Organization and Functionality:

    • We now allow the user to set a seed for reproducible results (must be called with banditpam.set_num_threads(1) for deterministic reproducibility) (Fixes #176)
    • We have added the KMedoids.average_loss attribute to contain the final average clustering loss after fitting (Fixes #174); a usage sketch follows this list
    • We throw an std::invalid_argument error properly when specifying an invalid loss function (Fixes #173, Fixes #141)
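
    A minimal usage sketch of banditpam.set_num_threads and KMedoids.average_loss (only the names shown in these notes are assumed; the fit signature follows the README examples):

    import numpy as np
    import banditpam
    from banditpam import KMedoids

    banditpam.set_num_threads(1)    # single-threaded, as noted for deterministic reproducibility

    X = np.random.randn(500, 10)
    kmed = KMedoids(n_medoids=5, algorithm="BanditPAM")
    kmed.fit(X, "L2", "v302_log")
    print(kmed.average_loss)        # final average clustering loss after fitting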

    Tests:

    • We now also test PAM in tests/test_smaller.py

    Style:

    • We change PAM and FastPAM1 to use this->*lossFn instead of KMedoids::cachedLoss to avoid resetting the cache for them; they do not benefit much from a cache anyway
    • Nits

    Documentation:

    • Created documentation for new functions

    Full Changelog: https://github.com/ThrunGroup/BanditPAM/compare/v3.0.1...v3.0.2

  • v3.0.1(Jan 8, 2022)

    BanditPAM v3.0.1 contains a hotfix to ensure it can be installed on Paperspace Gradient and Google Colab.

    For Paperspace Gradient:

    • Allows users to install banditpam==3.0.1 on Paperspace Gradient instances by installing the necessary dependencies and armadillo 10.8 automatically in setup.py
    • Builds a recent (>=10.8) armadillo from source

    For Google Colab:

    • Installs the necessary Ubuntu dependencies
    • Fixes a missing space that was conjoining the repo name with the local installation path
    • Replaces the MANIFEST.in so the headers are properly included in the source distribution

    Full Changelog: https://github.com/ThrunGroup/BanditPAM/compare/v3.0.0...v3.0.1

  • v3.0.0(Dec 28, 2021)

    BanditPAM v3.0.0 contains several changes:

    Organization and Functionality:

    • doubles are changed to floats throughout

    Tests:

    • Python3.10 has been added to the list of python versions to check
    • We now verify the package can be built on MacOS
    • We separate the different tests into different files

    Style:

    • We use the appropriate armadillo types throughout for floats

    Documentation:

    • We have updated the documentation throughout
    • We updated the favicon on readthedocs
    • We have updated the installation guides throughout

    Full Changelog: https://github.com/ThrunGroup/BanditPAM/compare/v2.0.0...v3.0.0

  • v2.0.0(Dec 28, 2021)

    BanditPAM v2.0.0 PR. Contains many changes:

    Organization and Functionality:

    • Everything has been migrated to the namespace km for better encapsulation (Fixes #135)
    • We now allow for <100 datapoints by setting the batchSize to min(dataset_size, 100) (Fixes #158)
    • We now return ints for the medoid indices instead of floats, including a list of a single int if k=1 (Fixes #152)
    • We reformatted the functions in each .cpp file to appear in the same order they appear in the corresponding .hpp
    • The code has been refactored for better organization
    • The code in setup.py is now encapsulated (Fixes #131)
    • The check for LLVM clang is now back in setup.py (Fixes #79)
    • Attempting to set the build or swap confidences when not using the BanditPAM algorithm results in an error

    Tests:

    • The code's accuracy is now automatically checked by running test cases via Github actions (Fixes #52)
    • An error is now thrown if an empty dataset is passed
    • We now use FastPAM1 instead of PAM for the tests, which significantly speeds them up
    • We added additional functionality to the tests to error quickly on failures
    • The code is now tested on Python3.9 via Github actions (Fixes #46)

    Style:

    • We have changed variable names to camelCase for C++ variables (Fixes #140)
    • The code is now automatically checked for style compliance via Github actions
    • The python code now contains typehints
    • const qualifiers have been added where possible

    Documentation:

    • We now publicly host the documentation on ReadTheDocs, via an integration with Sphinx (Fixes #165, Fixes #124)
    • We have updated the README with links to the ReadTheDocs and SAIL blog post
    • We have deleted duplicate docstrings in the .cpp files, moved all docstrings to the .hpp files, updated the docstrings, and added the necessary @throws and @returns (Fixes #127)

    Full Changelog: https://github.com/ThrunGroup/BanditPAM/compare/v1.0.5...v2.0.0

  • v1.0.5(Dec 21, 2021)

    BanditPAM v1.0.5 Release Notes:

    • Removes all logging and the verbosity flag, which were unnecessary
    • Enables users to install the package via pip on Google Colab (by using prebuilt armadillo libraries and copying them over to the correct places)
    • Bumps the version to v1.0.5
    • Cleans up some nits
  • v1.0.2(Dec 18, 2021)
