GraPE is a Rust/Python library for high-performance Graph Processing and Embedding.

Related tags

Deep Learninggrape
Overview

images/GRAPE.jpg

GraPE

Pypi project Pypi total project downloads

GraPE (Graph Processing and Embedding) is a fast graph processing and embedding library, designed to scale with big graphs and to run on both off-the-shelf laptop and desktop computers and High Performance Computing clusters of workstations.

The library is written in Rust and Python programming languages, and has been developed by AnacletoLAB (Dept.of Computer Science of the University of Milan), in collaboration with the RobinsonLab (Jackson Laboratory for Genomic Medicine) and the BPOP (Lawrence Berkeley National Laboratory).

GraPE is composed of two main modules: Ensmallen (ENabler of SMALL runtimE and memory Needs) and Embiggen (EMBeddInG GENerator), that run synergistically using parallel computation and efficient data structures.

Ensmallen efficiently executes graph processing operations including large-scale first and second-order random walks, while Embiggen leverages the large amount of sampled random walks generated by Ensmallen by computing effective node and edge embeddings. Beside being helpful for unsupervised exploratory analysis of graphs, the computed embeddings can be used for trainining any of the flexible neural models for edge and node label prediction, provided by Embiggen itself.

The following figure shows the main relationships between Ensmallen and Embiggen modules:

images/link_prediction_model.png

Installation of GraPE

For most computers you can just download it using pip:

pip install grape

Since Ensmallen is written in Rust, on PyPi we distribute pre-compiled packages for Windows, Linux, MacOs for the Python version 3.6, 3.7, 3.8, 3.9 for x86_64 cpus.

For the Linux binaires we follow the Python's ManyLinux2010 (PEP 571) standard which requires libc version >= 2.12, this version was releasted in 03/08/2010 so any Linux System in the last ten years should be compatible. To check your current libc version you can run ldd --version.

We also assume that the cpu has the following features: sse, sse2, ssse3, sse4_1, sse4_2, avx, avx2, bmi1, bmi2, popcnt. If these features are not present, you cannot use the PyPi pre-compiled binaries and you have to manually compile Ensmallen (Guide) . On Linux you can check if your CPU supports these features by running cat /proc/cpuinfo and ensuring that all these features are presents under the flags section. While these features are not strictly required, they significanly speed-up the executions and should be supported by any x86_64 CPU newer than Intel's Haswell architecture (2013).

If the your CPU doesn't support them you will get, on import, a ValueError exception with the following message:

This library was compiled assuming that SIMD instruction commonly available in CPU hardware since 2013 are present
on the machine where this library is intended to run.
On the current machine, the flags <MISSING_FLAGS> are not available.
You could still compile Ensmallen on this machine and have a version of the library that can execute here, but the
library has been extensively designed to use SIMD instructions, so you would have a version slower than the one
provided on Pypi.

These requirements were chosen to provide a good tradeoff between compatability and performance. If your system is not compatible, you can manually compile Ensmallen for any Os, libc version, and CPU architecture (such as Arm, AArch64, RiscV, Mips) which are supported by Rust and LLVM. Manually compiling Ensmallen might require more than half an hour and around 10Gb of RAM, if you encounter any error during the installation and/or compilation feel free to open an Issue here on Github and we will help troubleshoot it.

Main functionalities of the library

  • Robust graph loading and automatic graph retrieval:

    • More than 13000 graphs directly available from the library for benchmarking
    • Support for multiple graph formats
    • Automatic human readable reports of format errors
    • Automatic human readable reports of the main graph characteristics
  • Random walks:

    • Exact and approximated first and second order random walks
    • Massive generation of sampled random walks for graph embedding
    • Automatic dispatching of 8 optimized random walk algorithms depending on the parameters of the random walk and the type (weighted/unweighted) of the graph
  • Node embedding models:

    • SkipGram
    • CBOW
    • GloVe
  • Edge and node prediction models:

    • Perceptron
    • Multi-Layer Perceptron
    • Deep Neural Networks
  • Preprocessing for node embedding and edge prediction:

    • Lazy generation of skip-grams from random walks
    • Lazy generation of balanced batches for edge prediction
    • GloVe co-occurence matrix computation
  • Graph processing operations:

    • Optimized filtering by node, edge and components characteristics
    • Optimized algebraic set operations on graphs
    • Automatic generation of reports summarizing graph features in natural language
  • Graph algorithms:

    • Breadth and Depth-first search
    • Dijkstra, Tarjan's strongly connected component
    • Efficient Diameter computation, spanning arborescence and connected components
    • Approximated vertex cover, triads counting, transitivity, clustering coefficient and triangles counting
    • Betweenness and stress centrality, Closeness and harmonic centrality
  • Graph visualization tools: visualization of node and edge properties

Tutorials

You can find tutorials covering various aspects of the GraPE library here. All tutorials are as self-contained as possible and can be immediately executed on COLAB.

If you want to get quickly started, after having installed GraPE from Pypi as described above, you can try running the following example using the SkipGram embedding model on the Cora-graph:

from ensmallen.datasets.linqs import Cora
from ensmallen.datasets.linqs.parse_linqs import get_words_data
from embiggen.pipelines import compute_node_embedding
from embiggen.visualizations import GraphVisualization
import matplotlib.pyplot as plt

# Dowload, load up the graph and its node features
graph, node_features = get_words_data(Cora())

# Compute a SkipGram node embedding, using a second-order random walk sampling
node_embedding, training_history = compute_node_embedding(
    graph,
    node_embedding_method_name="SkipGram",
    # Let's increase the probability of explore the local neighbourhood
    return_weight=2.0,
    explore_weight=0.1
)

# Visualize the obtained node embeddings
visualizer = GraphVisualization(graph, node_embedding_method_name="SkipGram")
visualizer.fit_transform_nodes(node_embedding)

visualizer.plot_node_types()
plt.show()

You can see a tutorial detailing the above script here, and you can run it on COLAB from here.

Documentation

On line documentation

The on line documentation of the library is available here. Since Ensmallen is written in Rust, and PyO3 (the crate we use for the Python bindings), doesn't support typing, the documentation is obtained generating an empty skeleton package. This allows to have a proper documentation but you won't be able to see the source-code in it.

Using the automatic method suggestions utility

To aid working with the library, Grape provides an integrated recommender system meant to help you either to find a method or, if a method has been renamed for any reason, find its new name.

As an example, after having loaded the STRING Homo Sapiens graph, the function for computing the connected components can be retrieved by simply typing components as follows:

from ensmallen.datasets.string import HomoSapiens

graph = HomoSapiens()
graph.components

The code above will raise the following error, and will suggest methods with a similar or related name:

AttributeError                            Traceback (most recent call last)
<ipython-input-3-52fac30ac7f6> in <module>()
----> 2 graph.components

AttributeError: The method 'components' does not exists, did you mean one of the following?
* 'remove_components'
* 'connected_components'
* 'strongly_connected_components'
* 'get_connected_components_number'
* 'get_total_edge_weights'
* 'get_mininum_edge_weight'
* 'get_maximum_edge_weight'
* 'get_unchecked_maximum_node_degree'
* 'get_unchecked_minimum_node_degree'
* 'get_weighted_maximum_node_degree'

In our example the method we need for computing the graph components would be connected_components.

Now the easiest way to get the method documentation is to use Python's help as follows:

help(graph.connected_components)

And the above will return you:

connected_components(verbose) method of builtins.Graph instance
Compute the connected components building in parallel a spanning tree using [bader's algorithm](https://www.sciencedirect.com/science/article/abs/pii/S0743731505000882).

**This works only for undirected graphs.**

The returned quadruple contains:
- Vector of the connected component for each node.
- Number of connected components.
- Minimum connected component size.
- Maximum connected component size.

Parameters
----------
verbose: Optional[bool]
    Whether to show a loading bar or not.


Raises
-------
ValueError
    If the given graph is directed.
ValueError
    If the system configuration does not allow for the creation of the thread pool.

You can try to run the code described above on COLAB.

Cite GraPE

Please cite the following paper if it was useful for your research:

@misc{cappelletti2021grape,
  title={GraPE: fast and scalable Graph Processing and Embedding},
  author={Luca Cappelletti and Tommaso Fontana and Elena Casiraghi and Vida Ravanmehr and Tiffany J. Callahan and Marcin P. Joachimiak and Christopher J. Mungall and Peter N. Robinson and Justin Reese and Giorgio Valentini},
  year={2021},
  eprint={2110.06196},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

If you believe that any example may be of help, do feel free to open a GitHub issue describing what we are missing in this tutorial.

Comments
  • TransE error:

    TransE error: "ValueError: One of the provided node embedding computed with the TransE method contains NaN values."

    When generating embeddings for KG-Microbe (KGX edge file from KG-Hub) using TransE, the following error was observed:

    ValueError Traceback (most recent call last) in ----> 1 embedding = model.fit_transform(kg)

    ~/Library/Python/3.7/lib/python/site-packages/cache_decorator/cache.py in wrapped(*args, **kwargs) 595 if not cache_enabled: 596 self.logger.info("The cache is disabled") --> 597 result = function(*args, **kwargs) 598 self._check_return_type_compatability(result, self.cache_path) 599 return result

    ~/Library/Python/3.7/lib/python/site-packages/embiggen/utils/abstract_models/abstract_embedding_model.py in fit_transform(self, graph, return_dataframe, verbose) 164 graph=graph, 165 return_dataframe=return_dataframe, --> 166 verbose=verbose 167 ) 168

    ~/Library/Python/3.7/lib/python/site-packages/embiggen/embedders/ensmallen_embedders/transe.py in _fit_transform(self, graph, return_dataframe, verbose) 112 embedding_method_name=self.model_name(), 113 node_embeddings= node_embedding, --> 114 edge_type_embeddings= edge_type_embedding, 115 ) 116

    ~/Library/Python/3.7/lib/python/site-packages/embiggen/utils/abstract_models/embedding_result.py in init(self, embedding_method_name, node_embeddings, edge_embeddings, node_type_embeddings, edge_type_embeddings) 76 if np.isnan(numpy_embedding).any(): 77 raise ValueError( ---> 78 f"One of the provided {embedding_list_name} " 79 f"computed with the {embedding_method_name} method " 80 "contains NaN values."

    ValueError: One of the provided node embedding computed with the TransE method contains NaN values.

    I am attaching a jupyter notebook to reproduce the problem. load_graph_and.ipynb.zip

    The input edge file is here: https://kg-hub.berkeleybop.io/kg-microbe/current/kg-microbe.tar.gz

    opened by realmarcin 7
  • Need documentation on how to use a knowledge graph in grape

    Need documentation on how to use a knowledge graph in grape

    Hello, I have another question on how to import my data in grape. I think it is more a clarification on my method to import my KG.

    kg = Graph.from_csv(directed=True,
                           edge_path="sample_mabkg.tsv",
                           sources_column_number= 0,
                           edge_list_edge_types_column_number=1,edge_list_separator="|",
                           destinations_column_number=2, name="mAbKG", verbose=True, edge_list_header=True)
    

    but i saw that it exists node_path and other properties like in edge_path, so i don't know if i did in the good way my read from_csv. Can you please give me some explanation knowning i have a KG (with edge and node typed). Below is an example of my data.

    Thank you for your answer

    Gaoussou

     node source|edge|node destination
    _:B4dff5e7d17225b25b13ad12737e49779|imgt:isDecidedBy|imgt:EC
    pubmed:2843774|dc:title|Selective killing of HIV-infected cells by recombinant human CD4-Pseudomonas exotoxin hybrid protein.
    imgt:Product_8e9250cf-276a-3282-954f-3791316ac5a6|rdf:type|obo:NCIT_C51980
    imgt:Segment_212_1|obo:BFO_0000050|imgt:Construct_212
    imgt:IgG4-kappa_1001|rdfs:label|IgG4-kappa_1001
    imgt:V-D-GENE|owl:sameAs|obo:SO_0000510
    imgt:Segment_536_1|rdf:type|imgt:Segment
    imgt:LRR13|rdf:type|imgt:RepeatLabel
    imgt:StudyProduct_c2bc9b3a-a15e-376f-bda5-f87089b3f54b|imgt:application_type|Therapeutic
    imgt:StudyProduct_54a14ca8-f916-338b-af18-d079beb598a4|imgt:development_technology|  Dyax human antibody phage display library 
    

    sample_mabkg.txt

    opened by gsanou 6
  • embiggen package error under Windoze

    embiggen package error under Windoze

    The joy on installation on Windoze...

    Collecting embiggen>=0.11.9
      Downloading embiggen-0.11.38.tar.gz (154 kB)
         ---------------------------------------- 154.2/154.2 kB ? eta 0:00:00
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error
    
      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [10 lines of output]
          Traceback (most recent call last):
            File "<string>", line 2, in <module>
            File "<pip-setuptools-caller>", line 34, in <module>
            File "C:\cygwin64\tmp\pip-install-37lyy1_b\embiggen_3ec9ca91df6044b1b2470bb84cb6184d\setup.py", line 54, in <module>
              long_description=readme(),
            File "C:\cygwin64\tmp\pip-install-37lyy1_b\embiggen_3ec9ca91df6044b1b2470bb84cb6184d\setup.py", line 12, in readme
              return f.read()
            File "C:\Users\richa\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
              return codecs.charmap_decode(input,self.errors,decoding_table)[0]
          UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 2: character maps to <undefined>
          [end of output]
    
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed
    
    × Encountered error while generating package metadata.
    ╰─> See above for output.
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.
    
    
    opened by RichardBruskiewich 6
  • Bipartite Graph predict proba with undirected graph

    Bipartite Graph predict proba with undirected graph

    Hi. I noticed the performance metrics are not identical when using predict_proba_bipartite_graph_from_edge_node_types, when I swap the source and destination nodes. The graph used as input is an undirected graph, which I would expect would yield similar predictions for the same edge type regardless of which is source and destination nodes. Is this behavior intentional?

    Below are the version of the software I am running currently: grape==0.1.17 embiggen==0.11.27 ensmallen==0.8.14

    opened by arpelletier 6
  • ImportError: libgfortran-ed201abd.so.3.0.0: cannot open shared object file: No such file or directory

    ImportError: libgfortran-ed201abd.so.3.0.0: cannot open shared object file: No such file or directory

    In a fresh notebook, attempting to import grape yields an ImportError about a missing libgfortran-ed201abd.so.3.0.0.

    >>> !pip install grape -U
    >>> import grape
    /usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.8) or chardet (3.0.4) doesn't match a supported version!
      warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
    Output exceeds the [size limit](command:workbench.action.openSettings?[). Open the full output data [in a text editor](command:workbench.action.openLargeOutput?276a3afe-1b97-4f33-82e6-6df2db01934a)
    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    /home/harry/kg-bioportal/data/merged/KG-Bioportal analysis.ipynb Cell 2' in <cell line: 1>()
    ----> [1](vscode-notebook-cell://wsl%2Bubuntu-20.04/home/harry/kg-bioportal/data/merged/KG-Bioportal%20analysis.ipynb#ch0000001vscode-remote?line=0) import grape
    
    File ~/.local/lib/python3.8/site-packages/grape/__init__.py:9, in <module>
          1 """GraPE main module.
          2 
          3 For now, this is a simple wrapper of GraPE main two sub-modules that for
       (...)
          6 These packages are mimed here by the two sub-directories, ensmallen and embiggen.
          7 """
    ----> 9 from embiggen import *
         10 from ensmallen import Graph
         13 def import_all(module_locals):
    
    File ~/.local/lib/python3.8/site-packages/embiggen/__init__.py:2, in <module>
          1 """Module with models for graph machine learning and visualization."""
    ----> 2 from embiggen.visualizations import GraphVisualizer
          3 from embiggen.utils import (
          4     EmbeddingResult,
          5     get_models_dataframe,
       (...)
          9     get_available_models_for_node_embedding,
         10 )
    ...
        691     'spherical_kn',
        692 ]
        694 from scipy._lib._testutils import PytestTester
    
    ImportError: libgfortran-ed201abd.so.3.0.0: cannot open shared object file: No such file or directory
    

    I've seen that this may be related to libraries packaged with numpy, as seen in the following: https://github.com/ContinuumIO/anaconda-issues/issues/445 https://github.com/numpy/numpy/issues/14348

    This may be environment-specific, of course.

    opened by caufieldjh 6
  • `Illegal instruction (core dumped)` on importing grape

    `Illegal instruction (core dumped)` on importing grape

    In another issue that may have something to do with our aging build server: When we import grape in this environment (see info below), we get only Illegal instruction (core dumped).

    cpuinfo output:

    processor       : 23
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 44
    model name      : Intel(R) Xeon(R) CPU           X5675  @ 3.07GHz
    stepping        : 2
    microcode       : 0x1f
    cpu MHz         : 1599.987
    cache size      : 12288 KB
    physical id     : 1
    siblings        : 12
    core id         : 10
    cpu cores       : 6
    apicid          : 53
    initial apicid  : 53
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 11
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d
    bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
    bogomips        : 6133.21
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 40 bits physical, 48 bits virtual
    power management:
    
    opened by caufieldjh 4
  • Link to the two sub packages

    Link to the two sub packages

    Hi, first of all, thanks for making such an amazing graph embedding resource!

    I'm wondering whether you can add some descriptions in the README clarifying that this repo is a thin wrapper of the two core packages embiggen and ensmallen and add links accordingly. I was a bit confused for a few minutes trying to find the source code and only came to realize it wraps the two libraries after looking at __init__.py.

    opened by RemyLau 4
  • pip install grape failure on support_luca>=1.0.2

    pip install grape failure on support_luca>=1.0.2

    I am attempting to install grape using pip on Ubuntu 20.04.4 LTS with python 3.8.3.

    Most of the build/install appears to work just fine until I hit this error, providing a little additional context. I have also tried to install ensmallen directly with pip install ensmallen and I get the same error. Any advice you have would be appreciated.

    Requirement already satisfied: idna<3,>=2.5 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (2.10)
    Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (1.25.9)
    Requirement already satisfied: certifi>=2017.4.17 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (2020.6.20)
    Requirement already satisfied: chardet<4,>=3.0.2 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (3.0.4)
    Collecting typing-extensions>=3.7.4.3
      Using cached typing_extensions-4.3.0-py3-none-any.whl (25 kB)
    ERROR: Could not find a version that satisfies the requirement support_luca>=1.0.2 (from dict_hash>=1.1.25->cache_decorator>=2.1.11->ensmallen>=0.8.21->grape) (from versions: none)
    ERROR: No matching distribution found for support_luca>=1.0.2 (from dict_hash>=1.1.25->cache_decorator>=2.1.11->ensmallen>=0.8.21->grape)
    
    opened by amc-corey-cox 4
  • Graph visualization error

    Graph visualization error

    Hello. I am trying the Using CBOW to embed Cora python notebook (linked) and after replacing "CBOWEnsmallen" with "DeepWalkCBOWEnsmallen", the first order embedding runs successfully but fails at the graph visualization. I get the following error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /tmp/ipykernel_3453/3695275499.py in <module>
    ----> 1 GraphVisualizer(
          2     graph,
          3     node_embedding_method_name="CBOW - First order"
          4 ).fit_and_plot_all(first_embedding)
    
    ~/anaconda3/lib/python3.9/site-packages/embiggen/visualizations/graph_visualizer.py in fit_and_plot_all(self, node_embedding, number_of_columns, show_letters, include_distribution_plots, **node_embedding_kwargs)
       4236         distribution_plot_methods_to_call = []
       4237 
    -> 4238         if not self._graph.has_constant_non_zero_node_degrees():
       4239             node_scatter_plot_methods_to_call.append(
       4240                 self.plot_node_degrees,
    
    AttributeError: The method 'has_constant_non_zero_node_degrees' does not exists, did you mean one of the following?
    * 'has_constant_edge_weights'
    * 'get_non_zero_subgraph_node_degrees'
    * 'has_nodes'
    * 'has_edges'
    * 'has_selfloops'
    * 'has_node_ontologies'
    * 'has_node_oddities'
    * 'get_node_degrees'
    * 'has_node_name'
    * 'has_node_types'
    

    Looks like the issue has to do with embiggen dependencies in the graph visualization. Below are the package versions I am using: embiggen==0.11.13 ensmallen==0.8.7 grape==0.1.9

    As well, I was not able to successfully run the second-order embeddings

    model = DeepWalkCBOWEnsmallen(
        return_weight=2.0,
        explore_weight=0.1
    )
    second_embedding = model.fit_transform(graph).get_node_embedding_from_index(0)
    

    The above code gives the below error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /tmp/ipykernel_3453/3112314827.py in <module>
    ----> 1 model = DeepWalkCBOWEnsmallen(
          2     return_weight=2.0,
          3     explore_weight=0.1
          4 )
          5 second_embedding = model.fit_transform(graph).get_node_embedding_from_index(0)
    
    TypeError: __init__() got an unexpected keyword argument 'return_weight'```
    opened by arpelletier 4
  • Embedding model names not recognized; alternate suggestions are unexpected

    Embedding model names not recognized; alternate suggestions are unexpected

    As of grape 0.1.9, node embedding model names have changed, such that a call to embiggen's AbstractModel.get_task_data(model_name, task_name) with one of the frequently used model names like CBOW or SkipGram throws a ValueError.

    I see from grape.get_available_models_for_node_embedding() that these now have more specific names like Node2Vec CBOW. No problem with being specific, but we'd still like to be able to specify CBOW, SkipGram, or GloVe in config definitions without having to verify the exact model names embiggen is expecting first. Could we use the short names as aliases to a default model, like CBOW will be understood as Node2Vec CBOW, etc?

    The name convention also appears to confuse the alternative suggests provided in the ValueError text, so we get suggestions like this:

    ValueError: The provided model name `CBOW` is not available. Did you mean BoxE?
    
    opened by caufieldjh 4
  • ValueError when trying to use external embedder like in pykeen and karateClub

    ValueError when trying to use external embedder like in pykeen and karateClub

    Hello, Thanks you for your amaeing work, i'm a phD student working on the embeddings of biomedical data particularly in immunogenetics, and currently i'm comparing tools to embed data. I found your works very interesting. I got some issues when i try to use external model from pykeen and karateclub. i got this message :
    ValueError: We have found an useless method in the class StubClass, implementing method HolE from library PyKEEN and task Node Embedding. It does not make sense to implement the `requires_positive_edge_weights` method when the `can_use_edge_weights` always returns False, as it is already handled in the root abstract model class.

    Also for the vizualisation, when i did ```  from grape import GraphVisualizer visualizer = GraphVisualizer(kg.remove_disconnected_nodes()) visualizer.fit_and_plot_all(embedding)

    I got this warning without no visualisation:  FutureWarning: The parameter `square_distances` has not effect and will be removed in version 1.3.
    Thank you in advance for your answer
    Gaoussou
    opened by gsanou 3
  • Use case regarding Customer Analytics or Community detection?

    Use case regarding Customer Analytics or Community detection?

    Thanks for that repo. It seems that you have integrated several tools / libraries / approaches under Grape's hood. Do you intend to create a tutorial for a customer analytics recommendation?

    Thanks in advance.

    opened by stkarlos 2
  • Parallelized Embedding

    Parallelized Embedding

    Hey, I'm trying to process a directed graph, the scales are about 5 million nodes and 100 million edges. I've managed to load the graph from a csv file, i get a very nice Graph object (within 5 minutes). I'm now trying to embedd the graph with grape.embedders.Node2VecSkipGramEnsmallen, but it doesn't seem to succeed, I've let it run for over 10 hours. In order to make it faster, i did enable the Graph's vector_source, vector_cumulative_node_degree and vector_reciprocal_sqrt_degrees. Reading your paper, it seems that the embedding process could be parallelized, but i can't find the way to do that. I'd appreciate if you could describe what part/s of the embedding process are parallelized? and how can i make it run in parallel? Thank you, Bruria.

    opened by bruriah1999 2
  • Getting figure to be inline

    Getting figure to be inline

    matplotlib plots figures inline by default or if we write

    %matplotlib inline
    

    Some of the figures produced by GRAPE get put into "subwindows" in the Jupyter notebook, and one needs to scroll up and down to see the entire figure. GRAPE does not seem to be responsive to the inline magic command above either.

    For instance, in order for a certain figure to really appear online, I need to make it much smaller

    visualizer = GraphVisualizer(sli_graph, automatically_display_on_notebooks=False)
    fig, ax, cap = visualizer.plot_node_degree_distribution()
    fig.set_figheight(3)
    fig.set_figwidth(3)
    

    even though the notebook could comfortably show (5,5) or even (8,8)

    opened by pnrobinson 2
  • Saving classifier models

    Saving classifier models

    Could support for saving classifier models please be added? This came up while meeting with @LucaCappelletti94 recently but it's become relevant again in the course of updating neat-ml to use grape classifiers.

    Training classifiers isn't a major time commitment, but on our neat runs we've separated the process of training+testing vs. applying classifiers, so being unable to save or at least pickle the classifier object means we need to redo training for each model.

    opened by caufieldjh 4
  • Methods for generating node embeddings from word embeddings

    Methods for generating node embeddings from word embeddings

    While updating NEAT to use the most recent grape release, @justaddcoffee and @hrshdhgd and I took a look at what we're using to generate node embeddings based on pretrained word embeddings like BERT etc. : https://github.com/Knowledge-Graph-Hub/NEAT/blob/main/neat/graph_embedding/graph_embedding.py

    We know we can run something like get_okapi_tfidf_weighted_textual_embedding() on a graph, but is there a more "on demand" way to run this in grape now for an arbitrary graph?

    opened by caufieldjh 10
Releases(0.0.6.dev1)
Owner
AnacletoLab
Computational Biology and Bioinformatics Lab - Dept. of Computer Science - UNIMI
AnacletoLab
Code for our CVPR 2021 paper "MetaCam+DSCE"

Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification (CVPR'21) Introduction Code for our CVPR 2021

FlyingRoastDuck 59 Oct 31, 2022
ViDT: An Efficient and Effective Fully Transformer-based Object Detector

ViDT: An Efficient and Effective Fully Transformer-based Object Detector by Hwanjun Song1, Deqing Sun2, Sanghyuk Chun1, Varun Jampani2, Dongyoon Han1,

NAVER AI 262 Dec 27, 2022
CONetV2: Efficient Auto-Channel Size Optimization for CNNs

CONetV2: Efficient Auto-Channel Size Optimization for CNNs Exciting News! CONetV2: Efficient Auto-Channel Size Optimization for CNNs has been accepted

Mahdi S. Hosseini 3 Dec 13, 2021
Implementation for "Manga Filling Style Conversion with Screentone Variational Autoencoder" (SIGGRAPH ASIA 2020 issue)

Manga Filling with ScreenVAE SIGGRAPH ASIA 2020 | Project Website | BibTex This repository is for ScreenVAE introduced in the following paper "Manga F

30 Dec 24, 2022
PyTorch implementation of our ICCV 2021 paper, Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents.

PyTorch implementation of our ICCV 2021 paper, Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents.

Saim Wani 4 May 08, 2022
[ICCV'2021] "SSH: A Self-Supervised Framework for Image Harmonization", Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

SSH: A Self-Supervised Framework for Image Harmonization (ICCV 2021) code for SSH Representative Examples Main Pipeline RealHM DataSet Google Drive Pr

VITA 86 Dec 02, 2022
Danfeng Hong, Lianru Gao, Jing Yao, Bing Zhang, Antonio Plaza, Jocelyn Chanussot. Graph Convolutional Networks for Hyperspectral Image Classification, IEEE TGRS, 2021.

Graph Convolutional Networks for Hyperspectral Image Classification Danfeng Hong, Lianru Gao, Jing Yao, Bing Zhang, Antonio Plaza, Jocelyn Chanussot T

Danfeng Hong 154 Dec 13, 2022
Multi-tool reverse engineering collaboration solution.

CollaRE v0.3 Intorduction CollareRE is a tool for collaborative reverse engineering that aims to allow teams that do need to use more then one tool du

105 Nov 27, 2022
TransGAN: Two Transformers Can Make One Strong GAN

[Preprint] "TransGAN: Two Transformers Can Make One Strong GAN", Yifan Jiang, Shiyu Chang, Zhangyang Wang

VITA 1.5k Jan 07, 2023
Large-scale Hyperspectral Image Clustering Using Contrastive Learning, CIKM 21 Workshop

Spectral-spatial contrastive clustering (SSCC) Yaoming Cai, Yan Liu, Zijia Zhang, Zhihua Cai, and Xiaobo Liu, Large-scale Hyperspectral Image Clusteri

Yaoming Cai 4 Nov 02, 2022
The backbone CSPDarkNet of YOLOX.

YOLOX-Backbone The backbone CSPDarkNet of YOLOX. In this project, you can enjoy: CSPDarkNet-S CSPDarkNet-M CSPDarkNet-L CSPDarkNet-X CSPDarkNet-Tiny C

Jianhua Yang 9 Aug 22, 2022
Spatial color quantization in Rust

rscolorq Rust port of Derrick Coetzee's scolorq, based on the 1998 paper "On spatial quantization of color images" by Jan Puzicha, Markus Held, Jens K

Collyn O'Kane 37 Dec 22, 2022
NVIDIA Deep Learning Examples for Tensor Cores

NVIDIA Deep Learning Examples for Tensor Cores Introduction This repository provides State-of-the-Art Deep Learning examples that are easy to train an

NVIDIA Corporation 10k Dec 31, 2022
Zero-shot Synthesis with Group-Supervised Learning (ICLR 2021 paper)

GSL - Zero-shot Synthesis with Group-Supervised Learning Figure: Zero-shot synthesis performance of our method with different dataset (iLab-20M, RaFD,

Andy_Ge 62 Dec 21, 2022
This repository contains code, network definitions and pre-trained models for working on remote sensing images using deep learning

Deep learning for Earth Observation This repository contains code, network definitions and pre-trained models for working on remote sensing images usi

Nicolas Audebert 447 Jan 05, 2023
Trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI

Introduction This script trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI. In order to run this

Momin Haider 0 Jan 02, 2022
[ECCV 2020] XingGAN for Person Image Generation

Contents XingGAN or CrossingGAN Installation Dataset Preparation Generating Images Using Pretrained Model Train and Test New Models Evaluation Acknowl

Hao Tang 218 Oct 29, 2022
Yolov5 + Deep Sort with PyTorch

딥소트 수정중 Yolov5 + Deep Sort with PyTorch Introduction This repository contains a two-stage-tracker. The detections generated by YOLOv5, a family of obj

1 Nov 26, 2021
Optimizers-visualized - Visualization of different optimizers on local minimas and saddle points.

Optimizers Visualized Visualization of how different optimizers handle mathematical functions for optimization. Contents Installation Usage Functions

Gautam J 1 Jan 01, 2022
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.

Kento Nishi 22 Jul 07, 2022