A Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library

Overview

gdsclient

NOTE: This is a work in progress and many GDS features are known to be missing or not working properly.

This repo hosts the sources for gdsclient, a Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library. gdsclient enables users to write pure Python code to project graphs, run algorithms, and define and use machine learning pipelines in GDS. The API is designed to mimic the GDS Cypher procedure API, but in Python code. It hides the underlying calls to the Neo4j Python Driver to offer a simpler surface.

Please leave any feedback as issues on this repository. Happy coding!

Installation

To build and install gdsclient from this repository, simply run the following command:

pip3 install .

Documentation

A minimal example of using gdsclient to connect to a Neo4j database and run GDS algorithms:

from neo4j import GraphDatabase
from gdsclient import Neo4jQueryRunner, GraphDataScience

# Set up driver and gds module
URI = "bolt://localhost:7687" # Override according to your setup
driver = GraphDatabase.driver(URI) # You might also have auth set up in your db
runner = Neo4jQueryRunner(driver)
gds = GraphDataScience(runner)

# Project your graph
graph = gds.graph.create("graph", "*", "*")

# Run the PageRank algorithm with custom configuration
gds.pageRank.write(graph, tolerance=0.5, writeProperty="pagerank")

For extensive documentation of all operations supported by GDS, please refer to the GDS Manual.
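
To make the "mimics the Cypher procedure API" idea concrete, here is a minimal sketch of how a Python-style call could be rendered as a parameterized Cypher CALL statement. The helper to_cypher_call is hypothetical and exists only for illustration; it is not part of gdsclient and does not claim to reflect how the library is implemented.

```python
from typing import Any, Dict, Tuple


def to_cypher_call(
    procedure: str, graph_name: str, **config: Any
) -> Tuple[str, Dict[str, Any]]:
    """Render a Python-style call as a parameterized Cypher CALL statement.

    Hypothetical helper for illustration only; not part of gdsclient.
    """
    # The procedure name goes into the query text; all values are passed
    # separately as Cypher parameters.
    query = f"CALL {procedure}($graph_name, $config)"
    params = {"graph_name": graph_name, "config": dict(config)}
    return query, params


# Mirrors the PageRank call from the example above.
query, params = to_cypher_call(
    "gds.pageRank.write", "graph", tolerance=0.5, writeProperty="pagerank"
)
print(query)
print(params)
```

The keyword arguments of the Python call end up as the procedure's configuration map, which is why the Python code reads so closely to the Cypher it stands for.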

Extensive end-to-end examples, in ready-to-run Jupyter notebooks, can be found in the examples directory.

Acknowledgements

This work has been inspired by the great work done in the following libraries:

Contributing

The gdsclient project does not yet have contribution guidelines.

License

See LICENSE file. All content is copyright © Neo4j Sweden AB.

Comments
  • streamNodeProperty() doesn't work with gds.run_cypher() as I guess

    graphdatascience 1.3

    I tried a query like this:

    query = f'''
       call gds.graph.streamNodeProperty(
          'xxx',
          'xxxx',
          ['xxxxx']
       )
    yield nodeId as id, propertyValue as degree
    return id, degree limit 100
    '''
    result = gds.run_cypher(query)
    

    => KeyError: 'graph_name'

    I figured out to make it work like this:

    query = f'''
       ...
    '''
    params = {
       'graph_name': 'xxx',
       'properties': 'xxxx',
       'entities': ['xxxxx'],
       'config': ''
    }
    result = gds.run_cypher(query, params)
    

    => No error, but it returned all rows (not limited to 100), with the columns named nodeId and propertyValue (not renamed to id and degree).

    Other Cypher queries work with gds.run_cypher(query) as expected.
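
For reference, a minimal sketch of the same query written with Cypher parameters rather than inline literals. It does not diagnose the KeyError reported above; it only shows one way to build the query text and the params dict together. The function name and the parameter names ($graph_name, $node_property, $node_labels, $limit) are assumptions chosen for this example.

```python
from typing import Any, Dict, List, Tuple


def stream_node_property_query(
    graph_name: str, node_property: str, node_labels: List[str], limit: int
) -> Tuple[str, Dict[str, Any]]:
    """Build the query text and parameters (hypothetical helper)."""
    # A plain string, not an f-string: values travel as Cypher parameters.
    query = """
    CALL gds.graph.streamNodeProperty($graph_name, $node_property, $node_labels)
    YIELD nodeId AS id, propertyValue AS degree
    RETURN id, degree LIMIT $limit
    """
    params = {
        "graph_name": graph_name,
        "node_property": node_property,
        "node_labels": node_labels,
        "limit": limit,
    }
    return query, params


query, params = stream_node_property_query("xxx", "xxxx", ["xxxxx"], 100)
print(query)
print(params)
```

The resulting pair can then be passed on as gds.run_cypher(query, params), in the same way as in the second snippet above.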

    opened by MOSSupport 27
  • Add IMDB data and loaders

    Co-authored-by: Adam Schill Collberg

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [x] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [x] Your contribution is covered by tests
    REVIEW OK - MERGE ON HOLD 
    opened by brs96 10
  • Skip defaults tests when targeting AuraDS

    Since the "neo4j" user lacks auth there.

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [ ] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [ ] Your contribution is covered by tests
    opened by adamnsch 5
  • gds.degree.stream not respecting orientation?

    Describe the bug: I'm trying to reproduce a result similar to the one I would get in GDS.

    To Reproduce

    Cypher code:

    call gds.graph.project('xdc-test-search', ['Event','Search'], { HAS_SEARCH:{orientation:'REVERSE'}})
    call gds.degree.stream('xdc-test-search')
    YIELD nodeId, score
    return gds.util.asNode(nodeId).search_name_1 as SearchTerm, score As NumberOfSearches
    Order by NumberOfSearches Descending, SearchTerm Limit 10;
    

    The above returns the Search nodes.

    Python code:

    node_projection = ["Event","Search"]
    relationship_projection = {"HAS_SEARCH": {"orientation": "REVERSE"}}
    G, _ = gds.graph.project("xdc-test-search", node_projection, relationship_projection)
    degree_stream = gds.degree.stream(G)
    

    degree.stream above is returning the node ids of the Event nodes (other side of direction), even if I were to use G = gds.graph.get("xdc-test-search") and use the projection created with cypher.

    I'm sure it's something I'm doing on my end :)

    graphdatascience library version: 1.3
    GDS plugin version: 2.1.7
    Python version: 3.9.12
    Neo4j version: 4.4.10
    Operating system: macOS 12.6

    The image below is what I would expect from the Python approach. [image]
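
As background for this report, here is a small pure-Python sketch of how relationship orientation changes which nodes receive a degree score. It assumes that degree counts outgoing relationships in the projected graph, and that HAS_SEARCH points from Event to Search in the stored data; neither the function nor the sample data comes from the issue.

```python
from collections import Counter
from typing import Dict, List, Tuple


def out_degree(
    relationships: List[Tuple[str, str]], orientation: str = "NATURAL"
) -> Dict[str, int]:
    """Count outgoing relationships per node under a given orientation."""
    counts: Counter = Counter()
    for source, target in relationships:
        if orientation == "NATURAL":
            counts[source] += 1  # relationship kept as stored
        elif orientation == "REVERSE":
            counts[target] += 1  # relationship flipped during projection
        elif orientation == "UNDIRECTED":
            counts[source] += 1  # relationship present in both directions
            counts[target] += 1
        else:
            raise ValueError(f"unknown orientation: {orientation}")
    return dict(counts)


# Illustrative data: (Event)-[:HAS_SEARCH]->(Search)
rels = [("event1", "searchA"), ("event2", "searchA"), ("event3", "searchB")]
natural = out_degree(rels, "NATURAL")
reverse = out_degree(rels, "REVERSE")
print(natural)
print(reverse)
```

Under these assumptions, NATURAL gives scores to the Event nodes and REVERSE gives them to the Search nodes, so comparing the two outputs is a quick way to check which orientation a projection actually ended up with.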

    opened by bSharpCyclist 4
  • Support multiple dataframe input to CE construct

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [x] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [x] Your contribution is covered by tests
    opened by adamnsch 3
  • Expose undirected rel types in graph.construct and load_cora

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [x] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [x] Your contribution is covered by tests
    opened by FlorentinD 3
  • Test graph.construct with nan properties

    • Enable arrow test against AuraDS
    • Test NaN and null handling in graph.construct

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [x] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [x] Your contribution is covered by tests
    opened by FlorentinD 3
  • Handle gds.alpha.graph.nodeLabel.write

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [ ] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [ ] Your contribution is covered by tests
    opened by vnickolov 3
  • Relationship Orientation during graph construction with dataframes

    Hi, I'd like to understand if it is possible to construct a graph from data frames without orientation so that I can have an undirected relationship. I see from the schema here https://neo4j.com/docs/graph-data-science/current/graph-project-apache-arrow/#arrow-send-relationships that source and destination are mandatory but there is no special column for orientation.
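
One possible workaround, sketched here without any claim that it is the recommended approach: emulate undirected relationships by adding a mirrored copy of every relationship row before the data is loaded. The column names sourceNodeId and targetNodeId and the helper mirror_relationships are assumptions made for this example.

```python
from typing import Dict, List

Row = Dict[str, object]


def mirror_relationships(
    rows: List[Row],
    source_col: str = "sourceNodeId",
    target_col: str = "targetNodeId",
) -> List[Row]:
    """Return the rows plus a reversed copy of each one (hypothetical helper)."""
    mirrored: List[Row] = []
    for row in rows:
        mirrored.append(dict(row))
        # A self-loop is its own mirror image, so it is not duplicated.
        if row[source_col] != row[target_col]:
            reverse = dict(row)
            reverse[source_col] = row[target_col]
            reverse[target_col] = row[source_col]
            mirrored.append(reverse)
    return mirrored


rows = [
    {"sourceNodeId": 0, "targetNodeId": 1},
    {"sourceNodeId": 1, "targetNodeId": 2},
]
undirected_rows = mirror_relationships(rows)
print(undirected_rows)
```

The resulting list of dicts can be turned into a data frame with pandas.DataFrame(undirected_rows). Note that if the input already contains both directions of a relationship, this sketch would produce duplicates.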

    opened by valerio-piccioni 3
  • Add mapping for `gds.alpha.triangles`

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [x] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [x] Your contribution is covered by tests
    opened by FlorentinD 3
  • Set target database within initial constructor call

    Is your feature request related to a problem? Please describe.
    I would like to set the database I plan to use upon initializing my gds object, as in GraphDataScience(URI, auth=creds, database='my-db'), rather than having to call gds.set_database("my-db"). This would purely be a simple convenience to save a line of code.

    Describe the solution you would like
    Allow the database to be set within the constructor call.

    opened by seankrobinson 3
  • Throw better warning if construct is called multiple times

    This improves the UX for beginners who accidentally call, for instance, gds.graph.load_cora twice without cleanup.

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [x] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [x] Your contribution is covered by tests
    opened by FlorentinD 2
  • WIP: Hashgnn example notebook

    Based on #224

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [ ] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [ ] Your contribution is covered by tests
    opened by adamnsch 1
  • Add convenience methods for loading OGB graphs

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [ ] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [ ] Your contribution is covered by tests
    opened by adamnsch 1
  • WIP: Update query runner interface

    Thank you for your contribution to the Graph Data Science Client project.

    Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

    Make sure:

    • [ ] You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
    • [ ] Your contribution is covered by tests
    opened by adamnsch 1
Releases(1.5)
  • 1.5(Nov 2, 2022)

    We are happy to announce the release of graphdatascience, the GDS Python client, version 1.5! It is published to PyPI!

    Changes:

    • Fixed a bug where the client could not connect to the server when the default db was set in the GraphDataScience constructor.
    • For GDS admin users, gds.graph.get is now able to resolve graph names into Graph objects of other users' graph projections.
    • Add support for gds.alpha.triangles.
    • Support calling gds.alpha.userLog to access hints and warnings for the recently run operations.
    • Add support for gds.alpha.backup and gds.alpha.restore.
    • Add support for gds.alpha.config.defaults.set and gds.alpha.config.defaults.list.

    The release can be pip installed with pip install graphdatascience==1.5.

  • 1.4(Sep 30, 2022)

    We are happy to announce the release of graphdatascience, the GDS Python client, version 1.4! It is published to PyPI!

    Highlights:

    • The DataFrame returned by gds.beta.graph.relationships.stream now has a convenience method called by_rel_type.
    • Added a new optional string parameter database to GraphDataScience.run_cypher for overriding which database to target.
    • Added new method gds.graph.load_cora to load the CORA dataset into GDS.
    • Added a new optional string parameter database to the GraphDataScience constructor for specifying the targeted database.
    • Fix resolving Node regression pipelines created via gds.alpha.pipeline.nodeRegression.create.
    • Fix resolving Node regression models created via gds.alpha.pipeline.nodeRegression.train.
    • Fix an issue where run_cypher did not execute Cypher correctly in some edge cases.

    A full list of changes can be found in the changelog.

    The release can be pip installed with pip install graphdatascience==1.4.

  • 1.3(Aug 23, 2022)

    We are happy to announce the release of graphdatascience, the GDS Python client, version 1.3! It is published to PyPI!

    Highlights:

    • New versioning scheme using only two numbers
    • Add MLP training method addMLP to link prediction and node classification pipelines.
    • Add support for new graph catalog API endpoints in GDS >= 2.2.0.
    • Add support for random walk with restarts sampling procedure.
    • Add support for graph property endpoints in GDS >= 2.2.0.
    • Add Arrow Flight specific parameters to the GraphDataScience constructor:
      • arrow_tls_root_certs
      • arrow_disable_server_verification
    • Add support for new stream graph relationships endpoint.
    • Dropped support for Python 3.6.

    A full list of changes can be found in the changelog.

    The release can be pip installed with pip install graphdatascience==1.3.

  • 1.3.0a1(Aug 11, 2022)

    The first alpha release of version 1.3.0 of graphdatascience, the GDS Python client, has been published to PyPI!

    Highlights:

    • Add MLP training method addMLP to link prediction and node classification pipelines.
    • Add support for new graph catalog API endpoints in GDS >= 2.2.0.
    • Add support for random walk with restarts sampling procedure.
    • Add support for graph property endpoints in GDS >= 2.2.0.
    • Add Arrow Flight specific parameters to the GraphDataScience constructor:
      • arrow_tls_root_certs
      • arrow_disable_server_verification
    • Add support for new stream graph relationships endpoint.
    • Dropped support for Python 3.6.

    A full list of changes can be found in the changelog.

    The release can be pip installed with pip install graphdatascience==1.3.0a1.

  • 1.2.0(Jul 5, 2022)

    Version 1.2.0 of graphdatascience, the GDS Python client, has been published to PyPI!

    Included are bug fixes for:

    • The separate_property_columns=True option of gds.graph.streamNodeProperties did not handle list node properties correctly.
    • An irrelevant warning was shown when creating a GraphDataScience object targeting an AuraDS instance with GDS server version >= 2.1.0.
    • Calling gds.alpha.graph.construct targeting an AuraDS instance would raise an exception.

    The release can be pip installed with pip install graphdatascience==1.2.0.

  • 1.1.0(Jun 9, 2022)

    Version 1.1.0 of graphdatascience, the GDS Python client, has been published to PyPI!

    Highlights:

    • Support for GDS library version 2.1
    • Additional and improved convenience functionality on the Graph object
    • Supporting GDS Apache Arrow capabilities for graph catalog stream procedures
    • New method gds.alpha.graph.construct for loading a graph directly into GDS from client side pandas DataFrames
      • Greatly sped up by Apache Arrow if enabled

    A full list of changes can be found in the changelog.

    The release can be pip installed with pip install graphdatascience==1.1.0.

  • 1.1.0rc1(Jun 2, 2022)

    The first release candidate of version 1.1.0 of graphdatascience, the GDS Python client, has been published to PyPI!

    Highlights:

    • Added support for auto tuning for machine learning pipelines.
    • Added support for providing ranges as length two tuples to addLogisticRegression and addRandomForest.
    • Added support for new GDS library 2.1 signature of gds.graph.removeNodeProperties.
    • Added support for new function gds.close which calls .close() on a GraphDataScience object's underlying Neo4j driver.
    • Added new method gds.alpha.graph.construct to construct a GDS graph from pandas DataFrames. When running against a GDS library with its Apache Arrow server enabled it will be a lot faster.
    • Added support for new nodeRegression pipelines.
    • New convenience methods on the Graph object.

    A full list of changes can be found in the changelog.

    The release can be pip installed with pip install graphdatascience==1.1.0rc1.

  • 1.1.0a2(May 19, 2022)

    The second alpha release of version 1.1.0 of graphdatascience, the GDS Python client, has been published to PyPI!

    Highlights:

    • Added support for new configureAutoTuning method on NC and LP pipelines.
    • Added support for providing ranges as length two tuples to addLogisticRegression and addRandomForest.
    • Added new method auto_tuning_config to NC and LP pipelines for querying a pipeline's auto-tuning config.
    • Added support for new GDS library 2.1 signature of gds.graph.removeNodeProperties.
    • Added support for new function gds.close which calls .close() on a GraphDataScience object's underlying Neo4j driver.
    • Added new method gds.alpha.graph.construct to construct a GDS graph from pandas DataFrames, which works if the GDS Flight server is enabled.
    • Added new function gds.database which can be used to see which database is currently being targeted.
    • Added support for new nodeRegression pipelines.

    The release can be pip installed with pip install graphdatascience==1.1.0a2.

  • 1.1.0a1(May 6, 2022)

    The alpha release of version 1.1.0 of graphdatascience, the GDS Python client, has been published to PyPI!

    Highlights:

    • Added support for new configureAutoTuning method on NC and LP pipelines.
    • Added support for providing ranges as length two tuples to addLogisticRegression and addRandomForest.
    • Added support for new function gds.close which calls .close() on a GraphDataScience object's underlying Neo4j driver.
    • Added new method gds.alpha.graph.construct to construct a GDS graph from pandas DataFrames, which works if the GDS Flight server is enabled.
    • Added new function gds.database which can be used to see which database is currently being targeted.
    • The functions gds.graph.streamNodeProperty and gds.graph.streamRelationshipProperty can leverage the Arrow Flight server of GDS to improve throughput.

    The release can be pip installed with pip install graphdatascience==1.1.0a1.

  • 1.0.0(Mar 24, 2022)

    The first official major release, 1.0.0, of graphdatascience, the GDS Python client, has been published to PyPI!

    Highlights:

    • Replaced all dict return types with pandas Series.
    • Replaced all list[dict,...] return types with pandas DataFrame.
    • Replaced NC and LP training pipelines method configureParams by new methods addLogisticRegression and addRandomForest.
    • All procedures of the GDS Pipeline catalog are supported.
    • The NC and LP training pipelines support estimating train via a train_estimate method.
    • All ML models support estimating predict via predict_[mode]_estimate methods.
    • Removed support for GDS 1.x graph.create syntax.

    Read more in the changelog.

    The release can be pip installed with pip install graphdatascience==1.0.0.

  • 0.1.0(Feb 25, 2022)

    A new release 0.1.0 of graphdatascience, the GDS Python client, has been published at PyPI!

    Highlights:

    • When connecting to AuraDS, a specific user-agent will be set indicating that the graphdatascience client is used.
    • The methods of NCTrainingPipeline and LPTrainingPipeline for building the pipelines now return metadata from the underlying Cypher procedures called.
    • Methods creating Graph objects now additionally return the metadata from the underlying Cypher procedures called.
    • Methods creating Model objects now additionally return the metadata from the underlying Cypher procedures called.

    Read more in the changelog.

    The release can be pip installed with pip install graphdatascience==0.1.0.

  • 0.0.9(Feb 3, 2022)

  • 0.0.8(Jan 24, 2022)

    A new release 0.0.8 of graphdatascience, which is the new and final name of the GDS Python client, formerly called gdsclient, has been published at PyPI!

    Highlights:

    • new library name!
    • new source repository (this repo)
    • support for all utility functions
    • support for all Similarity functions
    • simplified interface to construct GDS reference object (hidden driver)
    • simplified interface to run Cypher queries (hidden query runner)

    The release can be pip installed with pip install graphdatascience==0.0.8.

Owner
Neo4j