Library extending Jupyter notebooks to integrate with Apache TinkerPop, openCypher, and RDF SPARQL.

Overview

Graph Notebook: easily query and visualize graphs

The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the Apache TinkerPop, openCypher or the RDF SPARQL graph models. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including knowledge graphs and identity graphs.

A colorful graph picture

Visualizing Gremlin queries:

Gremlin query and graph

Visualizing openCypher queries:

openCypher query and graph

Visualizing SPARQL queries:

SPARQL query and graph

Instructions for connecting to the following graph databases:

Endpoint        | Graph model           | Query language
----------------|-----------------------|------------------
Gremlin Server  | property graph        | Gremlin
Blazegraph      | RDF                   | SPARQL
Amazon Neptune  | property graph or RDF | Gremlin or SPARQL

We encourage others to contribute configurations they find useful. There is an additional-databases folder where more information can be found.

Features

Notebook cell 'magic' extensions in the IPython 3 kernel

%%sparql - Executes a SPARQL query against your configured database endpoint.

%%gremlin - Executes a Gremlin query against your database using web sockets. The results are similar to those a Gremlin console would return.

%%opencypher or %%oc - Executes an openCypher query against your database.

%%graph_notebook_config - Sets the executing notebook's database configuration to the JSON payload provided in the cell body.

%%graph_notebook_vis_options - Sets the executing notebook's vis.js options to the JSON payload provided in the cell body.

%%neptune_ml - Set of commands to integrate with NeptuneML functionality. Documentation

TIP 👉 There is syntax highlighting for %%sparql, %%gremlin and %%oc cells to help you structure your queries more easily.

Notebook line 'magic' extensions in the IPython 3 kernel

%gremlin_status - Obtain the status of Gremlin queries. Documentation

%sparql_status - Obtain the status of SPARQL queries. Documentation

%opencypher_status or %oc_status - Obtain the status of openCypher queries.

%load - Generate a form to submit a bulk loader job. Documentation

%load_ids - Get ids of bulk load jobs. Documentation

%load_status - Get the status of a provided load_id. Documentation

%neptune_ml - Set of commands to integrate with NeptuneML functionality. You can find a set of tutorial notebooks here. Documentation

%status - Check the Health Status of the configured host endpoint. Documentation

%seed - Provides a form to add data to your graph without the use of a bulk loader. Supports both RDF and Property Graph data models.

%stream_viewer - Interactively explore the Neptune CDC stream (if enabled)

%graph_notebook_config - Returns a JSON payload that contains connection information for your host.

%graph_notebook_host - Set the host endpoint to send queries to.

%graph_notebook_version - Print the version of the graph-notebook package

%graph_notebook_vis_options - Print the Vis.js options being used for rendered graphs

TIP 👉 You can list all the magics installed in the Python 3 kernel using the %lsmagic command.

TIP 👉 Many of the magic commands support a --help option in order to provide additional information.

Example notebooks

This project includes many example Jupyter notebooks, and we recommend exploring them. The sample notebooks explain all of the commands and features supported by graph-notebook in detail, with examples. You can find them here. Many new features have been added as the project has evolved. If you are already familiar with graph-notebook and want a quick summary of recent additions, a good place to start is the Air-Routes notebooks in the 02-Visualization folder.

Keeping track of new features

It is recommended to check the ChangeLog.md file periodically to keep up to date as new features are added.

Prerequisites

You will need:

  • Python 3.6.13-3.9.7
  • RDFLib 5.0.0
  • A graph database that provides one or more of:
    • A SPARQL 1.1 endpoint
    • An Apache TinkerPop Gremlin Server compatible endpoint
    • An endpoint compatible with openCypher
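
As a quick sanity check before installing, the interpreter range listed above can be tested from Python itself. This is only an illustrative sketch: `is_supported_python` is a hypothetical helper, not part of graph-notebook, and the bounds are simply the ones given in the list above.

```python
import sys

# Bounds copied from the prerequisites list above (treated as inclusive).
MIN_VERSION = (3, 6, 13)
MAX_VERSION = (3, 9, 7)


def is_supported_python(version=None):
    """Return True if `version` (major, minor, micro) is within the listed range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = tuple(sys.version_info[:3])
    return MIN_VERSION <= tuple(version) <= MAX_VERSION


if __name__ == "__main__":
    print(sys.version.split()[0], "supported:", is_supported_python())
```

Running the file prints the current interpreter version and whether it falls inside the listed range.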

Installation

# pin specific versions of required dependencies
pip install rdflib==5.0.0

# install the package
pip install graph-notebook

# install and enable the visualization widget
jupyter nbextension install --py --sys-prefix graph_notebook.widgets
jupyter nbextension enable  --py --sys-prefix graph_notebook.widgets

# copy static html resources
python -m graph_notebook.static_resources.install
python -m graph_notebook.nbextensions.install

# copy premade starter notebooks
python -m graph_notebook.notebooks.install --destination ~/notebook/destination/dir  

# start jupyter
python -m graph_notebook.start_notebook --notebooks-dir ~/notebook/destination/dir
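
If one of the commands above fails with "No module named 'graph_notebook'", a likely cause is that pip installed the package into a different environment than the one running Jupyter. The following sketch checks whether the module can be found by the current interpreter. `is_importable` is a hypothetical helper name; it relies only on the standard library.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "graph_notebook" is the top-level module name used in the commands above.
    if is_importable("graph_notebook"):
        print("graph_notebook is importable in this environment")
    else:
        print("graph_notebook not found; check which environment pip installed into")
```

Run it with the same `python` executable that you use for the `python -m graph_notebook...` commands.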

Connecting to a graph database

Gremlin Server

In a new cell in the Jupyter notebook, change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. Optionally, modify traversal_source if your graph traversal source name differs from the default value. For a local Gremlin server (HTTP or WebSockets), you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 8182,
  "ssl": false,
  "gremlin": {
    "traversal_source": "g"
  }
}
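
The cell body must be valid JSON, so a missing comma or a Python-style `False` will be rejected. One way to avoid hand-editing mistakes is to build the payload as a Python dictionary and serialize it. This is a minimal sketch, and `render_config_cell` is a hypothetical helper, not a graph-notebook function; the field names are the ones shown in the example above.

```python
import json


def render_config_cell(host, port, ssl, traversal_source="g"):
    """Build the text of a %%graph_notebook_config cell (hypothetical helper)."""
    config = {
        "host": host,
        "port": port,
        "ssl": ssl,
        "gremlin": {"traversal_source": traversal_source},
    }
    # json.dumps emits valid JSON: Python False becomes false, keys are
    # double-quoted, and commas are placed correctly.
    return "%%graph_notebook_config\n" + json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_config_cell("localhost", 8182, False))
```

The printed text can be pasted into a new notebook cell as-is.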

To set up a new local Gremlin Server for use with the graph notebook, check out additional-databases/gremlin server.

Blazegraph

Change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. For a local Blazegraph database, you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 9999,
  "ssl": false,
  "sparql": {
    "path": "sparql"
  }
}

You can also use Blazegraph namespaces by specifying the path that graph-notebook should use when querying your SPARQL endpoint, as shown below:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 9999,
  "ssl": false,
  "sparql": {
    "path": "blazegraph/namespace/foo/sparql"
  }
}

This will result in the URL localhost:9999/blazegraph/namespace/foo/sparql being used when executing any %%sparql magic commands.
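
To make the relationship between the configuration fields and the resulting URL concrete, here is a small sketch that composes an endpoint URL from `host`, `port`, `ssl` and `sparql.path`. It is illustrative only: `build_sparql_url` is a hypothetical function, the choice of `http` versus `https` from the `ssl` flag is an assumption, and the library's internal logic may differ.

```python
def build_sparql_url(config):
    """Compose an endpoint URL from a graph-notebook style config dict.

    Hypothetical helper that mirrors the URL described above; it is not
    the library's own code path.
    """
    # Assumption: ssl=true maps to https, ssl=false maps to http.
    scheme = "https" if config.get("ssl") else "http"
    # Strip stray slashes so "sparql" and "/sparql/" give the same result.
    path = config.get("sparql", {}).get("path", "").strip("/")
    return f"{scheme}://{config['host']}:{config['port']}/{path}"


if __name__ == "__main__":
    example = {
        "host": "localhost",
        "port": 9999,
        "ssl": False,
        "sparql": {"path": "blazegraph/namespace/foo/sparql"},
    }
    print(build_sparql_url(example))
```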

To set up a new local Blazegraph database for use with the graph notebook, check out the Quick Start from Blazegraph.

Amazon Neptune

Change the configuration using %%graph_notebook_config and modify the defaults as they apply to your Neptune cluster:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "DEFAULT",
  "load_from_s3_arn": "",
  "ssl": true,
  "aws_region": "your-neptune-region"
}

To set up a new Amazon Neptune cluster, check out the Amazon Web Services documentation.

When connecting the graph notebook to Neptune, make sure your network is set up to communicate with the VPC that Neptune runs in. If it is not, you can follow this guide.

Authentication (Amazon Neptune)

If you are running a SigV4 authenticated endpoint, ensure that your configuration has auth_mode set to IAM:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "IAM",
  "load_from_s3_arn": "",
  "ssl": true,
  "aws_region": "your-neptune-region"
}
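
A configuration like the one above can be sanity-checked before it is pasted into a notebook. The sketch below is a hypothetical pre-flight check, not part of graph-notebook. It assumes that `DEFAULT` and `IAM` (the two values shown in this README) are the accepted `auth_mode` values, that `DEFAULT` applies when the field is omitted, and that IAM mode needs `aws_region` because SigV4 signatures are region-scoped.

```python
VALID_AUTH_MODES = {"DEFAULT", "IAM"}  # the two values shown in this README


def check_neptune_config(config):
    """Return a list of problems found in a config dict (empty if none).

    Hypothetical helper for illustration; the library does its own validation.
    """
    problems = []
    for key in ("host", "port", "ssl"):
        if key not in config:
            problems.append(f"missing field: {key}")
    # Assumption: auth_mode falls back to DEFAULT when omitted.
    auth_mode = config.get("auth_mode", "DEFAULT")
    if auth_mode not in VALID_AUTH_MODES:
        problems.append(f"unknown auth_mode: {auth_mode}")
    if auth_mode == "IAM" and not config.get("aws_region"):
        problems.append("auth_mode IAM needs aws_region for request signing")
    return problems


if __name__ == "__main__":
    config = {
        "host": "your-neptune-endpoint",
        "port": 8182,
        "auth_mode": "IAM",
        "load_from_s3_arn": "",
        "ssl": True,
        "aws_region": "your-neptune-region",
    }
    print(check_neptune_config(config) or "no problems found")
```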

Additionally, you should have the following Amazon Web Services credentials available in a location accessible to Boto3:

  • Access Key ID
  • Secret Access Key
  • Default Region
  • Session Token (OPTIONAL. Use if you are using temporary credentials)

These variables must follow a specific naming convention, as listed in the Boto3 documentation.

A list of all locations checked for Amazon Web Services credentials can also be found here.
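
When the credentials are supplied through environment variables, the names that Boto3 looks for are `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION` and, for temporary credentials, `AWS_SESSION_TOKEN`. The sketch below reports which of these are unset. `missing_credentials` is a hypothetical helper. An empty environment does not necessarily mean that authentication will fail, because Boto3 also checks other locations such as the shared credentials file.

```python
import os

# Environment variable names documented by Boto3.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")
OPTIONAL_VARS = ("AWS_SESSION_TOKEN",)  # only needed for temporary credentials


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Not set in the environment:", ", ".join(missing))
    else:
        print("All required variables are set")
    if not os.environ.get(OPTIONAL_VARS[0]):
        print("No session token set (fine unless you use temporary credentials)")
```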

Contributing Guidelines

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Comments
  • [BUG] Neptune_ML widget error in 2.0.9

    Describe the bug Starting in version 2.0.9, the neptune_ml widget fails on the JSON values passed in, with the following error:

    {'error': JSONDecodeError('Expecting value: line 1 column 1 (char 0)',)}
    

    To Reproduce Steps to reproduce the behavior:

    1. Run through the 01-Introduction-to-Node-Classification-Gremlin notebook
    2. When you get to the export step the error occurs

    Additional context This is not a problem in version 2.0.7

    bug 
    opened by bechbd 22
  • Cannot install: No module named 'graph_notebook'

    Describe the bug I cannot install your graph notebook. I receive the error No module named 'graph_notebook' when following your installation steps.

    To Reproduce Steps to reproduce the behavior:

    1. Create virtual environment with venv: python -m venv .env
    2. Activate the virtual environment (e.g., source .env/bin/activate).
    3. Upgrade pip to 20.3.1: pip install -U pip
    4. Install requirements: pip install -r requirements.txt
    5. Per instructions, install and enable the visualization widget: jupyter nbextension install --py --sys-prefix graph_notebook.widgets. At this step I receive the error: ModuleNotFoundError: No module named 'graph_notebook'

    I have tried running the jupyter nbextension install --py --sys-prefix graph_notebook.widgets command from the src directory and I received the same error.

    Desktop (please complete the following information):

    • OS: OS X 11 (Big Sur)
    • Browser Safari
    • terminal: running zsh
    bug 
    opened by wdduncan 13
  • Documentation for specifying sparql paths on Blazegraph

    Does PR https://github.com/aws/graph-notebook/pull/49 fix issues #39 and #45 ? If so, can you please post documentation? I've tried:

    %%graph_notebook_config
    {
      "host": "http://kg-hub-rdf.berkeleybop.io",
      "port": 80,
      "auth_mode": "DEFAULT",
      "iam_credentials_provider_type": "ROLE",
      "load_from_s3_arn": "",
      "ssl": false,
      "aws_region": "us-east-1"
      "sparql": {
           "blazegraph/sparql"
       }
    }
    

    and

    %%graph_notebook_config
    {
      "host": "http://kg-hub-rdf.berkeleybop.io",
      "port": 80,
      "auth_mode": "DEFAULT",
      "iam_credentials_provider_type": "ROLE",
      "load_from_s3_arn": "",
      "ssl": false,
      "aws_region": "us-east-1"
      "sparql_path": "blazegraph/sparql"
    }
    

    But I receive syntax errors.

    opened by wdduncan 12
  • [BUG] Some gremlin queries not generating graphs in Air-Routes-Gremlin.ipynb

    Describe the bug Several of the cells in the Air-Routes-Gremlin.ipynb do not generate results in the graph tab.

    To Reproduce Steps to reproduce the behavior:

    1. Go to Air-Routes-Gremlin.ipynb
    2. Scroll down to the text "The next query also produces a result that is fun to explore using the Graph tab"
    3. Run the "my_node_labels" cell
    4. Run the gremlin query cell
    5. Only the Console and Query Metadata tabs appear.

    Expected behavior A graph tab with interesting results.

    Screenshots image

    Desktop (please complete the following information):

    • OS: Ubuntu 20.04
    • Browser: Chrome
    • Version: 97.0.4692.99

    Additional context Latest version of graph-notebook (3.1.1). The backend is gremlin-server, set up using the instructions here and seeded with %seed in the notebook.

    bug 
    opened by holleyism 10
  • [BUG] SPARQL load error due to lack of escaping apostrophe '

    Describe the bug This occurs in a Jupyter notebook on the SageMaker notebook instance launched from the Neptune console UI. In the notebook /Neptune/02-Visualization/Air-Routes-SPARQL.ipynb, section "Let's load some RDF Data", after choosing SPARQL and Airport from the drop-down list and clicking the Submit button, I got the following error:

    Loading data set airports with language sparql
    1/3:	0_nodes.txt
    {
      "requestId": "0cbc40eb-51e1-eb7a-2b7f-d3544414d259",
      "code": "MalformedQueryException",
      "detailedMessage": "Malformed query: Lexical error at line 261, column 38.  Encountered: \" \" (32), after : \"Hare\""
    }
    

    I suspect that processing "O'Hare Airport" causes the error, because the data loader can't handle the unescaped apostrophe ('). The same error also occurred in the EPL-SPARQL notebook when loading the Football data set, where "St. Mary's Park" and "St. James Park" caused the same error message. After modifying the query to use triple quotes ("""St. Mary's Park"""), the load worked.

    bug 
    opened by xiaokunx 8
  • [BUG] Graph visualization does not support multivalue properties

    Graph visualization does not support multivalue properties

    Steps to reproduce the behavior:

    1. Set up a graph where vertices have multivalue properties (i.e. have set cardinality)
    2. query g.V().outE().inV().path().by(elementMap()). This works but you cannot see all values of the multivalue properties.
    3. If you change the query to use valueMap g.V().outE().inV().path().by(valueMap()), the visualization does not render properly. Some vertices are drawn but they do not represent the graph.

    Expected behavior Graph is visualized correctly even when valueMap() is used and multivalue properties can be viewed in the visualization "Details" box

    Screenshots This is how the graph looks when using elementMap() WORKS-using-elementMap

    This is how the graph looks when I use valueMap() BROKEN-using-valueMap

    Desktop (please complete the following information):

    • OS: macOS 12.6
    • Browser: Chrome 105.0.5195.125
    • Version: graph-notebook 3.6.0
    bug 
    opened by FsecureSamiTikka 6
  • Identity Graph ETL notebook

    Issue #, if available: N/A

    Description of changes:

    • Add a new Identity Graph sample notebook demonstrating how to set up an AWS Glue based ETL pipeline.
    • Added utility library to setup the ETL pipeline
    • Added AWS Glue ETL scripts

    By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

    opened by abhishekpradeepmishra 6
  • [BUG] Cannot issue Gremlin queries

    Describe the bug Using the steps described in the setup results in an error:

    {
    'error': GremlinServerError
      (
        '597: No signature of method: org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph.addV() is applicable for argument types: (String) values: [person]
        Possible solutions: any(), any(groovy.lang.Closure), wait(), open(), tx(), find()'
      )
    }
    

    To Reproduce Steps to reproduce the behavior:

    1. Download the Gremlin Server from https://tinkerpop.apache.org/ and unzip it. The remaining steps in this section assume you have made your working directory the place where you performed the unzip.
    2. In conf/tinkergraph-empty.properties, change the ID manager from LONG to ANY to enable IDs that include text strings.
      gremlin.tinkergraph.vertexIdManager=ANY
      
    3. Optionally add another line doing the same for edge IDs.
      gremlin.tinkergraph.edgeIdManager=ANY
      
      
    4. To enable HTTP as well as Web Socket connections to the Gremlin Server, edit the file /conf/gremlin-server.yaml and change
      channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
      

      to

       channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
      

      This will allow you to access the Gremlin Server from Jupyter using commands like curl as well as using the %%gremlin cell magic. This step is optional if you do not need HTTP connectivity to the server.

    5. Start the Gremlin server bin/gremlin-server.sh start

    Connecting to a local Gremlin Server from Jupyter:

    1. In the Jupyter Notebook disable SSL using %%graph_notebook_config and change the host to localhost. Keep the other defaults even though they are not used for configuring the Gremlin Server.
    %%graph_notebook_config
    {
      "host": "localhost",
      "port": 8182,
      "ssl": false,
      "gremlin": {
        "traversal_source": "g",
        "username": "",
        "password": "",
        "message_serializer": "graphsonv3"
      }
    }
    

    If the Gremlin Server you wish to connect to is remote, replacing localhost with the IP address or DNS of the remote server should work. This assumes you have access to that server from your local machine.

    2. Validate connection.
    %status
    
    3. Issue query (from here)
    %%gremlin
    
    g.addV('person').property('name', 'dan')
     .addV('person').property('name', 'mike')
     .addV('person').property('name', 'saikiran')
    

    Expected behavior The vertices should be added with no error messages.

    Desktop (please complete the following information):

    • Gremlin Server version: 3.6.1
    • JupyterLab version: 3.5.1
    • OS: Windows 10 / Debian Bullseye
    • Browser: Firefox, Chrome
    bug 
    opened by whitehorsesoft 5
  • [BUG] Cell magic not found

    Cell magics such as %%graph_notebook_config and %%status are not found.

    To Reproduce Steps to reproduce the behavior:

    1. Run docker image jupyter/minimal-notebook.
    2. Open in browser
    3. Verify normal notebook functionality works without graph-notebook.
    4. Install graph-notebook using the commands found here:
    # pin specific versions of required dependencies
    pip install rdflib==5.0.0
    
    # install the package
    pip install graph-notebook
    
    5. Attempt to change configuration according to directions here:
    %%graph_notebook_config
    {
      "host": "localhost",
      "port": 8182,
      "ssl": false,
      "gremlin": {
        "traversal_source": "g",
        "username": "",
        "password": "",
        "message_serializer": "graphsonv3"
      }
    }
    
    6. Error occurs: "UsageError: Cell magic %%graph_notebook_config not found."
    7. Verify error happens even after restarting kernel and restarting docker container.

    Expected behavior The cell magic should work as expected.

    Desktop (please complete the following information):

    • OS: Win10/WSL/Ubuntu
    • Browser Chrome
    • Version latest
    bug 
    opened by whitehorsesoft 5
  • Support for virtuoso sparql endpoint

    Is your feature request related to a problem? Please describe. I've been trying to have graph-notebook connect to a virtuoso sparql endpoint, without success.

    Describe the solution you'd like Support for virtuoso sparql endpoints. Or, if already possible, documentation about connection setup.

    question 
    opened by Jefwillems 5
  • Error displaying widget

    Describe the bug When I run tutorial notebooks and the result should be visualized, it gives me an error: "Error displaying widget". For queries without a path, it gives tabs "Console" and "Query Metadata" without any problems.

    To Reproduce I run it on JupyterLab 3.2.8 (I tried it with 3.4.2 but the result was the same). Python version 3.9.7

    Expected behavior I would love to see the results from the queries. Any idea of what could help will be appreciated.

    Screenshot Screenshot 2022-05-23 at 16 59 26

    bug needs information 
    opened by anezkakot 4
  • Sizing of edges based on a property

    Is your feature request related to a problem? Please describe. I am building a graph with AWS Neptune whose vertices are geolocated points. One property of the edges is the distance between their endpoint vertices.

    Describe the solution you'd like It would be great if the edges used this property to set their length proportionally, so that the vertices would be distributed as on a map.

    opened by AlbertoRodriguezSerrano 1
  • Truncate query request time in metadata

    Currently the query metadata reports a query request time in ms with a large number of decimal places such as:

    Request execution time (ms) | 189.0205078125
    

    Given the resolution of statement execution, network delays, etc., that number of decimal places does little but make the results harder to read :)

    Would suggest we truncate it down to a reasonable number or round it to the closest ms?

    opened by jklap 0
  • [BUG] Full screen Visualization does not work in Safari

    Describe the bug The full screen Visualization button does not work in Safari but does work just fine in Chrome.

    To Reproduce Steps to reproduce the behavior:

    1. Start Safari & open JupyterLab w/graph-notebook installed
    2. Run a query that would produce a Visualization such as "A simple example" in 02-Visualization/Air-Routes-Gremlin.ipynb
    3. Click the "Fullscreen" button
    4. Observe nothing happening

    Expected behavior The Visualization expands to full screen

    Desktop (please complete the following information):

    • OS: Mac Ventura 13.0.1
    • Browser: Safari
    • Version: 16.1

    Additional context Using latest graph-notebook release w/JupyterLab 3.5.2

    The issue looks to be in widgets/src/force_widget.ts in toggleExpand(). It uses document.fullscreenElement which exists in Chrome but not Safari as Safari uses document.webkitFullscreenElement.

    https://github.com/sindresorhus/screenfull shows a cross-browser approach but it should also be pretty easy to implement directly using something like:

        // Vendor-prefixed request methods are optional: each browser has at most one.
        const elementFunc = document.documentElement as HTMLElement & {
          mozRequestFullScreen?(): Promise<void>;
          webkitRequestFullscreen?(): Promise<void>;
          msRequestFullscreen?(): Promise<void>;
        };
    
        // The *FullscreenElement members are properties, not methods.
        const docFunc = document as Document & {
          mozCancelFullScreen?(): Promise<void>;
          webkitExitFullscreen?(): Promise<void>;
          msExitFullscreen?(): Promise<void>;
          webkitFullscreenElement?: Element | null;
          mozFullScreenElement?: Element | null;
          msFullscreenElement?: Element | null;
        };
    
        // Pick whichever variant the current browser provides.
        const fullScreenElement =
          docFunc.fullscreenElement ||
          docFunc.webkitFullscreenElement ||
          docFunc.mozFullScreenElement ||
          docFunc.msFullscreenElement;
        const requestFullscreen =
          elementFunc.requestFullscreen ||
          elementFunc.mozRequestFullScreen ||
          elementFunc.webkitRequestFullscreen ||
          elementFunc.msRequestFullscreen;
        const exitFullscreen =
          docFunc.exitFullscreen ||
          docFunc.webkitExitFullscreen ||
          docFunc.msExitFullscreen ||
          docFunc.mozCancelFullScreen;
    

    And then replace:

    • document.fullscreenElement usage with fullScreenElement
    • elem.requestFullscreen with requestFullscreen
    • elem.requestFullscreen() with requestFullscreen.call(elem)
    • document.exitFullscreen with exitFullscreen
    • document.exitFullscreen() with exitFullscreen.call(document)

    i.e., it should end up looking something like this snippet:

    ...
        if ( !fullScreenElement ) {
          if ( requestFullscreen ) {
            document.addEventListener("fullscreenchange", fullscreenchange);
            requestFullscreen.call(elem);
            this.canvasDiv.style.height = "100%";
          }
        } else {
    ...
    

    See here for the basis for the above code: https://stackoverflow.com/questions/54242775/angular-7-how-does-work-the-html5-fullscreen-api-ive-a-lot-of-errors

    bug 
    opened by jklap 0
  • adding fraud detection with inductive inference notebook

    Issue #, if available:

    Description of changes:

    Adding a notebook demonstrating real-time inductive inference on a fraud detection use case.

    By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

    opened by sojiadeshina 0
  • Add ECR publish workflow

    Issue #, if available: N/A

    Description of changes:

    • Adding GitHub action to build the graph-notebook Docker image and publish it to ECR on new commits. This workflow currently only publishes to a private ECR repository; official ECR Public repo will be made available at a later date.

    By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

    opened by michaelnchin 0
Releases(v3.7.0)
  • v3.7.0(Dec 7, 2022)

    • Added Neo4J section to %%graph_notebook_config (Link to PR)
    • Added custom Gremlin authentication and serializer support (Link to PR)
    • Added %statistics magic for Neptune DFE engine (Link to PR)
    • Added option to disable TLS certificate verification in %%graph_notebook_config (Link to PR)
    • Improved %load status output, fixed region option (Link to PR)
    • Updated 01-About-the-Neptune-Notebook for openCypher (Link to PR)
    • Fixed results not being displayed for SPARQL ASK queries (Link to PR)
    • Fixed %seed failing to load SPARQL EPL dataset (Link to PR)
    • Fixed %db_reset status output not displaying in JupyterLab (Link to PR)
    • Fixed %%gremlin throwing error for result sets with multiple datatypes (Link to PR)
    • Fixed edge label creation in 02-Using-Gremlin-to-Access-the-Graph (Link to PR)
    • Fixed igraph command error in 02-Logistics-Analysis-using-a-Transportation-Network (Link to PR)
    • Bumped typescript to 4.1.x in graph_notebook_widgets (Link to PR)
    • Pinned ipywidgets==7.7.2 and jupyterlab_widgets<3 (Link to PR)
    • Pinned nbclient<=0.7.0 (Link to PR)
  • v3.6.2(Oct 18, 2022)

    • New Sample Applications - Security Graphs notebooks (Link to PR)
      • Path: 03-Sample-Applications > 04-Security-Graphs
    • Update sample notebooks with parallel, same-direction edges example (Link to PR)
    • Fixed a Gremlin widgets error caused by empty individual results (Link to PR)
    • Fixed %db_reset timeout handling, made timeout limit configurable (Link to PR)
    • Fixed SPARQL visualizations occasionally failing with VisJS group assignment error (Link to PR)
    • Fixed start jupyterlab command in README (Link to PR)
    • Fixed interface rendering issue in classic notebooks (Link to PR)
    • Added --hide-index option for query results (Link to PR)
    • Added result media type selection for SPARQL queries (Link to PR)
  • v3.6.0(Sep 15, 2022)

    • New Language Tutorials - SPARQL Basics notebook (Link to PR)
      • Path: 06-Language-Tutorials > 01-SPARQL > 01-SPARQL-Basics
    • New Neptune ML - Text Encoding Tutorial notebook (Link to PR)
      • Path: 04-Machine-Learning > Sample-Applications > 02-Job-Recommendation-Text-Encoding.ipynb
    • Added --store-to option to %%graph_notebook_config (Link to PR)
    • Added loader status details options to %load_ids (Link to PR)
    • Added --all-in-queue option to %cancel_load (Link to PR)
    • Deprecated Python 3.6 support (Link to PR)
    • Added support for literal property values in SPARQL visualization options (Link to PR)
    • Various results table improvements (Link to PR)
    • Disabled automatic collapsing of large explain results (Link to PR)
    • Fixed version-specific steps in SageMaker installation script (Link to PR)
    • Added new SageMaker installation script for China regions (Link to PR)
  • v3.5.3(Jul 26, 2022)

    • Docker support. Docker image can be built using the command docker build . and through Docker's buildx, this can support non-x86 CPU Architectures like ARM. (Link to PR)
      • Fix service.sh conditional checks, SSL parameter can now be changed. Fix permissions error on service.sh experienced by some users. (Link to PR)
    • Added %%neptune_config_allowlist magic (Link to PR)
    • Added check to remove whitespace in %graph_notebook_config host fields (Link to PR)
    • Added silent output option to additional magics (Link to PR)
    • Fixed %sparql_status magic to return query status without query ID (Link to PR)
    • Fixed incorrect Gremlin query --store-to output (Link to PR)
    • Fixed certain characters not displaying correctly in results table (Link to PR)
    • Fixed extra index column displaying in Gremlin results table on older Pandas versions (Link to PR)
    • Reverted Gremlin console tab to single results column (Link to PR)
    • Bumped jquery-ui from 1.13.1 to 1.13.2 (Link to PR)
  • v3.5.1(Jul 13, 2022)

    • Improved the %stream_viewer magic to show the commit timestamp and isLastOp information, if available. Also added additional hover (help) text to the stream viewer. (Link to PR)
    • Added --max-content-length option to %%gremlin (Link to PR)
    • Added proxy_host and proxy_port options to the %%graph_notebook_config options. (Link to PR)
      • This allows for proxied connections to your Neptune instance from outside your VPC, supporting the patterns seen here.
    • Fixed results table formatting in JupyterLab (Link to PR)
    • Fixed several typos in the Neptune ML 00 notebook (Link to PR)
    • Renamed the Knowledge Graph application notebooks for clarity (Link to PR)
  • v3.4.1(Jun 7, 2022)

    • Identity Graph - ETL notebook (Link to PR)
      • Path: 03-Identity-Graphs>03-Jumpstart-Identity-Graphs-Using-Canonical-Model-and-ETL
      • Files: scripts/, glue_utils.py and 3-Identity-Graphs>03-Jumpstart-Identity-Graphs-Using-Canonical-Model-and-ETL notebook
    • Support variable injection in %%graph_notebook_config magic (Link to PR)
    • Added three notebooks to show data science workflows with Amazon Neptune (Link to PR)
    • Added JupyterLab startup script to auto-load magics extensions (Link to PR)
    • Added includeWaiting option to %oc_status, fix same for %gremlin_status (Link to PR)
    • Added --store-to option to %status (Link to PR)
    • Fixed handling of empty nodes returned from openCypher DELETE queries (Link to PR)
    • Fixed rendering of openCypher widgets for empty result sets (Link to PR)
    • Fixed graph search overriding physics setting (Link to PR)
    • Fixed browser-specific bug in results pagination options menu (Link to PR)
    • Fixed invalid queries in Gremlin sample notebooks (Link to PR)
    • Removed requests-aws4auth requirement (Link to PR)
  • v3.3.0(Mar 29, 2022)

  • v3.2.0(Feb 26, 2022)

    • Added new notebooks: guides for using SPARQL and RDF with Neptune ML (Link to PR)
    • Added the ability to run explain plans to openCypher queries via %%oc explain. (Link to PR)
    • Added the ability to download the explain/profile plans for openCypher/Gremlin/SPARQL. (Link to PR)
    • Changed the %stream_viewer magic to use PropertyGraph and RDF as the stream types. This better aligns with Gremlin and openCypher sharing the PropertyGraph stream. (Link to PR)
    • Updated the airports property graph seed files to the latest level and suffixed all doubles with 'd'. (Link to PR)
    • Added grouping by depth for Gremlin and openCypher queries (PR #1)(PR #2)
    • Added grouping by raw node results (Link to PR)
    • Added --no-scroll option for disabling truncation of query result pages (Link to PR)
    • Added --results-per-page option (Link to PR)
    • Added relaxed seed command error handling (Link to PR)
    • Renamed Gremlin profile query options for clarity (Link to PR)
    • Suppressed default root logger error output (Link to PR)
    • Fixed Gremlin visualizer bug with handling non-string node IDs (Link to PR)
    • Fixed error in openCypher Bolt query metadata output (Link to PR)
    • Fixed handling of Decimal type properties when rendering Gremlin query results (Link to PR)
  • v3.1.1(Dec 22, 2021)

    • Added new dataset for DiningByFriends, and associated notebook (Link to PR)
    • Added new Neptune ML Sample Application for People Analytics (Link to PR)
    • Added graph customization support for SPARQL queries (Link to PR)
    • Added graph reset and search refinement buttons to the graph output tab (Link to PR)
    • Added support for setting custom edge and node tooltips (Link to PR)
    • Added edge tooltips, and options for specifying edge label length (Link to PR)
    • Updated NeptuneML pre-trained model resources for CN regions (Link to PR)
    • Fixed inaccurate help message being displayed for certain GremlinServerErrors (Link to PR)
    • Fixed error causing query autocompletion to fail (Link to PR)
    • Fixed Jupyter start script for cases where the nbconfig directory is missing (Link to PR)
  • v3.0.8(Nov 4, 2021)

  • v3.0.7(Oct 25, 2021)

    • Added full support for NeptuneML API command parameters to %neptune_ml (Link to PR)
    • Allow %%neptune_ml to accept JSON blob as parameter input for most phases (Link to PR)
    • Added --silent option for suppressing query output (PR #1) (PR #2)
    • Added all parserConfiguration options to %load (Link to PR)
    • Upgraded to Gremlin-Python 3.5 and Jupyter Notebook 6.x (Link to PR)
    • Resolved smart indent bug in openCypher magic cells (Link to PR)
    • Removed default /sparql path suffix from non-Neptune SPARQL requests (Link to PR)
  • v3.0.6(Sep 21, 2021)

    • Added a new %stream_viewer magic that allows interactive exploration of the Neptune CDC stream (if enabled). (Link to PR)
    • Added support for multi-property values in vertex and edge labels (Link to PR)
    • Added new visualization physics options, toggle button (Link to PR)
    • Fixed TypeError thrown for certain OC list type results (Link to PR)
    • Documentation fixes for additional databases (Link to PR)
  • v3.0.5(Aug 28, 2021)

  • v3.0.3(Aug 11, 2021)

    • Gremlin visualization bugfixes (PR #1) (PR #2) (PR #3)
    • Updated the airport data loadable via %seed to the latest version (Link to PR)
    • Added support for Gremlin Profile API parameters (Link to PR)
    • Improved %seed so that the progress bar is seen to complete (Link to PR)
    • Added helper functions to neptune_ml utils to get node embeddings, model predictions and performance metrics (Link to PR)
    • Changed visualization behavior to add all group-less nodes to a default group (Link to PR)
    • Fixed a bug causing ML Export requests to fail (Link to PR)
  • v3.0.2(Jul 30, 2021)

  • v3.0.1(Jul 29, 2021)

    openCypher Support:

    With the release of support for the openCypher query language in Amazon Neptune's lab mode, graph-notebook can now be used to execute and visualize openCypher queries with any compatible graph database.

    Two new magic commands have been added:

    • %%oc/%%opencypher
    • %%oc_status/%%opencypher_status

    These openCypher magic commands inherit the majority of the query and visualization customization features that are already available in the Gremlin and SPARQL magics.

    For more detailed information and examples of how you can execute and visualize openCypher queries through graph-notebook, please refer to the new Air-Routes-openCypher and EPL-openCypher sample notebooks.

    (Link to PR)

    Other Updates:

    • Added visualization support for elementMap Gremlin step (Link to PR)
    • Added support for additional customization of edge node labels in Gremlin (Link to PR)
    • Refactored %load form display code for flexibility; fixes some descriptions being cut off (Link to PR)
    • Overhauled Gremlin visualization notebooks with example usage of new customization options and elementMap step (Link to PR)
    • Updated Neptune ML notebooks, utils, and pretrained models config (Link to PR)
    • Added support for modeltransform commands in %neptune_ml (Link to PR)
    • Included index operations metrics in metadata results tab for Gremlin Profile queries (Link to PR)
    • Added new notebook to explain Identity Graph data modeling (Link to PR)
    • Various bugfixes and documentation updates
  • v2.1.4(Jun 27, 2021)

  • v2.1.3(Jun 22, 2021)

  • v2.1.2(May 11, 2021)

  • v2.1.1(Apr 23, 2021)

  • v2.1.0(Apr 16, 2021)

    • Add support for Mode, queueRequest, and Dependencies parameters when running %load command (Link to PR)
    • Add support for list and dict as map keys in Python Gremlin (Link to PR)
    • Refactor modules that call to Neptune or other SPARQL/Gremlin endpoints to use a unified client object (Link to PR)
    • Added an additional notebook under 02-Visualization demonstrating how to use the visualization grouping and coloring options in Gremlin. (Link to PR)
    • Add metadata output tab for magic queries (Link to PR)
  • v2.0.12(Mar 25, 2021)

  • v2.0.10(Mar 18, 2021)

  • v2.0.9(Mar 3, 2021)

  • v2.0.7(Feb 1, 2021)

    • Added What’s Next sections to the 01-Getting-Started notebooks to point users to the suggested next tutorial after they finish a notebook.
  • v2.0.6(Jan 28, 2021)

  • v2.0.5(Jan 8, 2021)

    Gremlin Visualization

    • Enhanced Gremlin visualization output to group vertices and color-code them by group. When no property is specified, vertices are grouped by label (if one exists). You can also specify the property to group by using the --groupby or -g switch followed by the property name.
    • Added the functionality to sort the values in the details box by key
    • Updated Air-Routes-Visualization notebook to discuss the group by functionality
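
    The grouping rule described above can be illustrated with a small standalone sketch. This is not graph-notebook's implementation: the function name group_vertices, the dictionary shape of a vertex, and the "__ungrouped__" fallback name are all assumptions made for illustration. It only shows the idea of grouping by a chosen property and falling back to the label when none is given.

```python
from collections import defaultdict


def group_vertices(vertices, group_by=None, default_group="__ungrouped__"):
    """Group vertex ids by a property value (hypothetical helper).

    When group_by is None, the vertex 'label' is used, mirroring the
    default behavior described in the release note. Vertices lacking
    the chosen key go into an assumed fallback group.
    """
    key = group_by or "label"
    groups = defaultdict(list)
    for vertex in vertices:
        group_name = str(vertex.get(key, default_group))
        groups[group_name].append(vertex["id"])
    return dict(groups)


# Illustrative data only; not taken from the air-routes dataset.
vertices = [
    {"id": "1", "label": "airport", "country": "US"},
    {"id": "2", "label": "airport", "country": "UK"},
    {"id": "3", "label": "country"},
]

by_label = group_vertices(vertices)                       # default: group by label
by_country = group_vertices(vertices, group_by="country")  # like passing -g country
```

    In a visualization, each resulting group would then be assigned its own color.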

    NeptuneML

    • Add tutorial notebooks for NeptuneML functionality
  • v2.0.3(Dec 29, 2020)

    • Integration with NeptuneML feature set in AWS Neptune
    • Added a helper library to perform SigV4 signing for %neptune_ml export ...; the other signing code will be migrated at a later date.
    • Changed how credentials are obtained for the ROLE IAM credentials provider: it now uses a botocore session instead of calling the EC2 metadata service. This should make the module more usable outside of SageMaker.
    • Add sub-configuration for sparql to allow specifying path to sparql endpoint

    New Line magics:

    • %neptune_ml export status
    • %neptune_ml dataprocessing start
    • %neptune_ml dataprocessing status
    • %neptune_ml training start
    • %neptune_ml training status
    • %neptune_ml endpoint create
    • %neptune_ml endpoint status

    New Cell magics:

    • %%neptune_ml export start
    • %%neptune_ml dataprocessing start
    • %%neptune_ml training start
    • %%neptune_ml endpoint create

    NOTE: If a cell magic is used, line inputs that specify parts of the command, such as --job-id, will be ignored.

    Inject variable as cell input: Currently this only works for the new cell magic commands detailed above. You can now specify a variable to use as the cell input received by the neptune_ml magics using the syntax ${var_name}. For example:

    # in one notebook cell:
    foo = {'foo': 'bar'}
    
    # in another notebook cell:
    %%neptune_ml export start
    
    ${foo}
    

    NOTE: The above will only work if it is the sole content of the cell body. You cannot inline multiple variables at this time.
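
    As a rough sketch of the rule in the note above, the following shows one way a cell body consisting solely of ${var_name} could be resolved against the notebook's variables. It is an illustration under assumptions, not the library's code: resolve_cell_body and VAR_PATTERN are hypothetical names, and serializing non-string values to JSON is an assumed choice.

```python
import json
import re

# Matches a body that is exactly one ${name} placeholder and nothing else.
VAR_PATTERN = re.compile(r"^\$\{(\w+)\}$")


def resolve_cell_body(cell_body, namespace):
    """Return the injected variable's value, or the body unchanged.

    Hypothetical helper: substitution happens only when the placeholder
    is the sole content of the cell body, matching the documented limit.
    """
    match = VAR_PATTERN.match(cell_body.strip())
    if match is None:
        return cell_body  # inline or multiple placeholders are left as-is
    name = match.group(1)
    if name not in namespace:
        raise NameError(f"variable '{name}' is not defined in the notebook")
    value = namespace[name]
    # Assumption: non-string values are passed on as JSON text.
    return value if isinstance(value, str) else json.dumps(value)


user_ns = {"foo": {"foo": "bar"}}
injected = resolve_cell_body("\n${foo}\n", user_ns)
untouched = resolve_cell_body("${foo} ${foo}", user_ns)
```

    Requiring the placeholder to be the whole body keeps the substitution unambiguous, at the cost of not supporting several inlined variables.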

  • v2.0.1(Nov 24, 2020)

  • v2.0.0(Nov 20, 2020)

    • Add support for storing query results to a variable for use in other notebook cells
    • Remove %query_mode magic in favor of query parameterization