Orchest is a browser-based IDE for Data Science.

Overview



Orchest is a browser-based IDE for Data Science. It integrates your favorite Data Science tools out of the box, so you don’t have to. The application is easy to use and can run on your laptop as well as on a large-scale cloud cluster.


A preview of creating pipelines in Orchest. Watch the full video to learn more.

Features

For a complete list of Orchest's features, check out the overview in our docs!

  • Visually construct pipelines.
  • Run any subset of a pipeline directly or on a cron-like schedule.
  • Parametrize your data science pipelines to try out different modeling ideas.
  • Easily define your custom runtime environment that runs on any machine.

Who should use Orchest?

  • Data Scientists who want to rapidly prototype.
  • Data Scientists who like to work in Notebooks.
  • Data Scientists who are looking to create pipelines through a visual interface instead of YAML.

Installation

NOTE: Orchest is in alpha.

For GPU support, language dependencies other than Python, and other installation methods, such as building from source, please refer to our installation docs.

Requirements

  • Docker

If you do not yet have Docker installed, please visit https://docs.docker.com/get-docker/.

NOTE: On Windows, Docker has to be configured to use WSL 2. Make sure to clone Orchest inside the Linux environment. For more info and installation steps for Docker with WSL 2 backend, please visit https://docs.docker.com/docker-for-windows/wsl/.
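
Before running the installer, it can help to confirm that Docker is reachable from your shell. The snippet below is a minimal sketch and not part of Orchest; the helper name `require_cmd` is hypothetical, and it only checks that a command exists on `PATH`, not that the Docker daemon is running.

```shell
#!/bin/sh
# Hypothetical helper (not part of Orchest): succeed only if the
# named command can be found on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if require_cmd docker; then
    echo "docker found"
else
    echo "docker not found; see https://docs.docker.com/get-docker/"
fi
```

The same helper can be reused for `git`, which the clone step below also needs.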

Linux, macOS and Windows

git clone https://github.com/orchest/orchest.git && cd orchest
./orchest install

# Verify the installation.
./orchest --help

# Start Orchest.
./orchest start

Now that you have installed Orchest, get started with our quickstart tutorial, check out pipelines made by your fellow users, or have a look at our knowledge base videos explaining and showing some of Orchest's core concepts.

License

The software in this repository is licensed as follows:

  • All content residing under the "orchest-sdk/" directory of this repository is licensed under the "Apache-2.0" license as defined in "orchest-sdk/LICENSE".
  • Content outside of the above mentioned directory is available under the "AGPL-3.0" license.

We love your feedback

We would love to hear what you think and add features based on your ideas. Come chat with us on our Slack Channel or open an issue on GitHub.

Contributing

Contributions are more than welcome! Please see our contributor guides for more details.

Not sure where to start? Book a free, no-pressure pairing session with one of our core contributors.


Comments
  • Support `containerd` as the container runtime


    Description

    This PR enables working with the containerd runtime by introducing an init container that pulls images. The controller detects the runtime and configures Orchest accordingly.

    To test this PR you need a cluster that uses the containerd runtime; microk8s is suggested. After microk8s is installed, enable the following addons:

    microk8s enable hostpath-storage \
      && microk8s enable dns \
      && microk8s enable ingress
    

    To push images to the microk8s node, first rebuild all the images with a valid tag (for example v2022.06.4), then save them to a tar file with the following command:

    docker save $(docker images | awk '{if ($1 ~ /^orchest\//) new_var=sprintf("%s:%s", $1, $2); print new_var}' | grep v2022.06.4 | sort | uniq) -o orchest-images.tar
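
    The image-selection part of the command above can be sketched on its own. This is an illustration under assumptions, not the PR's code: the function name `list_orchest_images` and the sample rows are made up, and the input is assumed to be whitespace-separated with the repository in column 1 and the tag in column 2, as in `docker images` output.

    ```shell
    # Hypothetical helper: read `docker images`-style lines on stdin and print
    # "repository:tag" for repositories under orchest/ that carry the given tag.
    list_orchest_images() {
        awk -v tag="$1" '$1 ~ /^orchest\// && $2 == tag { print $1 ":" $2 }' | sort -u
    }

    # Example with made-up rows (repository, tag, image id).
    printf '%s\n' \
        'orchest/orchest-api v2022.06.4 aaa111' \
        'orchest/orchest-api v2022.06.3 bbb222' \
        'postgres 13.1 ccc333' \
        'orchest/celery-worker v2022.06.4 ddd444' \
        | list_orchest_images v2022.06.4
    ```

    Matching the tag column exactly avoids picking up images whose name merely contains the tag string. Fed with real `docker images` output, the resulting list could be passed to `docker save -o orchest-images.tar`.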
    

    Then this tar file can be shipped to the microk8s node via scp:

    scp ./orchest-images.tar {your_user}@${microk8s node ip}:~/

    Then, inside the microk8s node, import the images with the following command (note that ctr has to be installed):

    sudo ctr -n k8s.io -a /var/snap/microk8s/common/run/containerd.sock i import orchest-images.tar
    
    # Or use microk8s ctr
    microk8s ctr --namespace k8s.io --address /var/snap/microk8s/common/run/containerd.sock image import orchest-images.tar
    

    Then Orchest can be installed with orchest-cli using the following command:

    orchest install --socket-path=/var/snap/microk8s/common/run/containerd.sock --dev
    

    Note: the manifests must be generated via make manifestgen in the orchest-controller directory:

    TAGNAME=v2022.06.4 make -C ./services/orchest-controller manifestgen
    

    Checklist

    • [x] The documentation reflects the changes.
    • [x] The PR branch is set up to merge into dev instead of master.
    • [x] I haven't introduced breaking changes that would disrupt existing jobs, i.e. backwards compatibility is maintained.
    • [x] In case I changed the dependencies in any requirements.in I have run pip-compile to update the corresponding requirements.txt.
    • [x] In case I changed one of the services' models.py I have performed the appropriate database migrations (refer to the DB migration docs).
    • [x] In case I changed code in the orchest-sdk I followed its release checklist.
    • [x] In case I changed code in the orchest-cli I followed its release checklist.
    • [x] The newly added image-puller has to be pushed to DockerHub on release. So they need to be added to the correct .github/workflow/... file.
    • [x] add document about installing in microk8s
    • [x] merge both cli init containers into one
    • [x] ~~update thirdparties on update~~
    new feature request 
    opened by nhaghighat 32
  • Error attempting to connect to Gateway server url 'http://jupyter-EG-93c7d122-a3b1-435f-d8f6a2d0-6584-4c1c:8888'. Ensure gateway url is valid and the Gateway instance is running.

    Describe the bug
    When we open a pipeline in JupyterLab and run a cell, the cell fails to execute and the following error is thrown:

    Error attempting to connect to Gateway server url 'http://jupyter-EG-93c7d122-a3b1-435f-d8f6a2d0-6584-4c1c:8888'. Ensure gateway url is valid and the Gateway instance is running.

    To Reproduce

    Create new project -> new pipeline -> open in jupyter notebook


    Environment

    • elementary os (linux)
    bug 
    opened by Practcdi 30
  • [Feature] File Manager


    Description

    Add File Manager in Pipeline Editor as the default way of managing files in Orchest.

    Fixes: #612

    Checklist

    • [x] The PR branch is set up to merge into dev instead of master.
    new feature request 
    opened by iannbing 27
  • Sending notifications when jobs fail.


    Description

    This PR exposes the BE functionality of sending notifications to desired channels (e.g. Slack) when jobs fail.

    Closes: #120

    Todo

    • [x] Items from https://github.com/orchest/orchest/pull/1008#issuecomment-1143731359

    Checklist

    • [x] The documentation reflects the changes.
    • [x] The PR branch is set up to merge into dev instead of master.
    • [x] In case I changed one of the services’ models.py I have performed the appropriate database migrations (refer to scripts/migration_manager.sh).
    • [x] In case I changed code in the orchest-sdk I followed its release checklist
    • [x] In case I changed code in the orchest-cli I followed its release checklist
    • [x] I haven't introduced breaking changes that would disrupt existing jobs, i.e. backwards compatibility is maintained.
    • [x] In case I changed the dependencies in any requirements.in I have run pip-compile to update the corresponding requirements.txt.
    new feature request 
    opened by iannbing 25
  • Improve robustness of Orchest Operator and update mechanism


    Description

    This PR fixes the order of deployment by introducing a new internal CRD named OrchestComponent. The OrchestCluster is created by the user or the orchest-cli. The OrchestCluster controller then creates an OrchestComponent for each service, and a dedicated component controller for each component manages the status of the objects underlying its OrchestComponent.

    Fixes: #952, #991

    Testing the PR

    minikube addons enable ingress
    eval $(minikube docker-env)
    scripts/build_container.sh -M -t "v2022.05.3" -o "v2022.05.3"
    pip install -e orchest-cli
    
    # yes, this is ALL that is needed now to install Orchest
    orchest install --dev
    

    Testing orchest update through the CLI

    orchest uninstall
    scripts/build_container.sh -M -t "v2022.04.4" -o "v2022.04.4"
    orchest install --dev
    scripts/build_container.sh -M -t "v2022.04.5" -o "v2022.04.5"
    orchest update --dev --version=v2022.04.5
    

    Testing orchest update through the UI

    NOTE: You need to have created the minikube with the orchest-dev-repo mount

    orchest uninstall
    scripts/build_container.sh -M -t "v2022.04.4" -o "v2022.04.4"
    orchest install --dev
    orchest patch --dev
    pnpm run dev
    scripts/build_container.sh -M -t "v2022.04.5" -o "v2022.04.5"
    INVOKE THROUGH UI (go to http://localorchest.io/update)
    scripts/build_container.sh -M -t "v2022.04.6" -o "v2022.04.6"
    INVOKE THROUGH UI (go to http://localorchest.io/update)
    ... (as often as you like)
    

    Checklist

    • [x] The documentation reflects the changes.
    • [x] The PR branch is set up to merge into dev instead of master.
    • [x] In case I changed code in the orchest-cli I followed its release checklist
    • [x] I haven't introduced breaking changes that would disrupt existing jobs, i.e. backwards compatibility is maintained.
    • [x] In case I changed the dependencies in any requirements.in I have run pip-compile to update the corresponding requirements.txt
    • [x] Start reports that Orchest is successfully started, but some deployments still have to start.
    • [x] webserver hangs, then restarts after a while on start if it's started concurrently w.r.t. the orchest-api.
    • [x] Celery fails to boot on fresh install
    • [x] reliably report the status from orchest start about availability
    • [x] check Ingress/deployment and service status of each component.
    • [x] Updating the default pvc sizes: 50 GiB, 25 GiB, 25 GiB (userdir, registry, builder cache)
    • [x] ~~make possible through CLI to specify it as well as singleNode installation.~~ --> To be done later so we can get this PR merged.
    • [x] Enable updating all controller manifests during update.
      • [x] Update CRD changes through update.
      • [x] Add all manifests into one big file
      • [x] orchest-cli use the manifest from release assets
      • [x] GitHub Action to add the yaml files as assets to the release
      • [x] Update GitHub Action of updating controller image on manifests (removed)
    • [x] Add annotation in the namespace manifest to avoid kubectl warning (not needed)
    • [x] Testing behavior
      • [x] orchest update through CLI
      • [x] orchest update through UI
        • [x] The UpdateView.tsx needs to parse the response from the controller endpoint correctly and show it as logs
      • [x] orchest restart through CLI and UI
      • [x] orchest patch
      • [x] orchest install
    • [x] Update documentation
      • [x] Update installation docs
      • [x] mention in the docs that an ingress controller needs to be present to move to the Running state
      • [x] ~~Note that the docker-registry is managed through Helm and not the orchest-controller.~~
      • [x] Document about how to run the operator outside of the cluster for easy debugging
      • [x] Improve docstrings/comments around key functionality of the controller
      • [x] Controller readme.
      • [x] Remove unused Helm charts, e.g. orchest-api ones, from services/orchest-controller/deploy
      • [X] ~~Mention in docs that pvc size can only be increased & that the default storage class is used if not specified.~~ --> Will be added to the CLI in another PR
      • [x] Update internal document about the release process given the changes that took place
    • [x] renaming pause/unpause to start/stop
    • [x] Migrate the CLI to install in one namespace only. Needs to parse release asset to set the namespace.
      • [x] Check whether #866 is resolved. --> Not yet, the Helm deployer in the orchest-controller doesn't seem to be picking up custom namespaces
      • [x] Depending on the choice, the CLI orchest uninstall might need to change slightly. If we choose to go for it, then there should be a default flag.
    improvement breaking change 
    opened by nhaghighat 25
  • Improv/material UI


    Description

    Replace MDC custom UI components with Material-UI components.

    Resolves: #413, #557, #554, #540, #598, #380, #613, #311

    Checklist

    • [x] I have manually tested the application to make sure the changes don’t cause any downstream issues, which includes making sure ./orchest status --ext is not reporting failures when Orchest is running.
    opened by iannbing 24
  • 404 error after installation on local mode

    Hello.

    I have tried to install Orchest in an Ubuntu Linux (20.04) virtual machine. I have followed the instructions in https://docs.orchest.io/en/stable/getting_started/installation.html, with the difference that I have used docker as the driver for minikube:

    minikube start --cpus=4 --driver=docker

    The installation seems to be OK, and no error message has been displayed.

    But when accessing localorchest.io, a 404 Not Found message is returned by nginx. The /etc/hosts file has been updated and the address is resolved, but no page is found (or the request is not properly redirected).
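
    One of the checks described above, whether the hosts file maps the hostname, can be scripted. This is a minimal sketch with a hypothetical helper name (`hosts_has_entry`), and it only rules out one cause: a 404 served by nginx already implies that the name resolves, so a passing check points toward the ingress configuration instead.

    ```shell
    # Hypothetical helper: succeed if the hosts file ($1) maps some
    # address to the hostname ($2). Comments in the file are ignored.
    hosts_has_entry() {
        awk -v host="$2" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == host) found = 1 }
            END { exit(found ? 0 : 1) }
        ' "$1"
    }

    if hosts_has_entry /etc/hosts localorchest.io 2>/dev/null; then
        echo "localorchest.io is mapped in /etc/hosts"
    else
        echo "localorchest.io is not mapped in /etc/hosts"
    fi
    ```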

    In the minikube dashboard the services seem to be running.

    JuanLuis has suggested running minikube addons enable ingress to try to solve the issue, but it doesn't seem to work. I have noticed that the Ingresses section of the minikube dashboard is not resolved.

    Best regards, Alvaro

    bug 
    opened by AlvaroGarciaTEK 23
  • New UI design part 2: Project List


    Description

    The new Project List view.

    Some explanations on not implementing the details in the design:

    • The drop-files-to-screen-to-create-project functionality is removed due to unclear value and usability concerns. Now that the Examples tab resides in the same view, it is confusing: for example, a user might want to drag-n-drop files to submit an example.
    • The pagination of the Project List in the design didn't consider "specify the number of projects per page"; we need further discussion on this. At the moment the original TablePagination is kept.
    • I think the "Sorting" in the Example List needs more discussion: 1) it's a new feature; 2) its location is far away from the list in the design, which might raise usability concerns (normally the filter/sorting option should be directly on top of the list). So, I decided to postpone the implementation.
    opened by iannbing 22
  • Save "dirty" open files in JupyterLab when navigating away from the JupyterLab page

    Describe the solution you'd like
    Save "dirty" open files in JupyterLab when navigating away from the JupyterLab page

    What does your solution aim to solve?
    It's not always clear to the user that saving is necessary for changes to propagate. In addition, unsaved changes cause a browser navigation prompt ("Are you sure you want to leave...") to appear while JupyterLab is not showing. Finally, JupyterLab already autosaves by default, so it's not a big change in behavior.

    Note: saving files in the JupyterLab UI is a bit glitchy with the recent addition of real-time collaboration in JupyterLab. So we probably want to track JupyterLab upstream closely to make sure we get rid of this glitchiness as soon as possible. The glitchiness is there as of JupyterLab 3.1.12.

    improvement 
    opened by ricklamers 18
  • Remove unnecessary side effects in Step details

    Description

    The StepDetailsProperties contains some side effects that save pipeline steps multiple times, which could result in losing data (new data being overwritten by old data). This PR removes these side effects and also fixes a bug in FilePicker.

    Checklist

    • [x] The PR branch is set up to merge into dev instead of master.
    opened by iannbing 17
  • Merge the functionality of the `Pipelines` view into PipelineEditor

    Description

    To simplify the workflow, this PR merges the functionality of the Pipelines view into PipelineEditor, so that users can manage pipelines directly without going back and forth between Pipelines and PipelineEditor.

    Major changes:

    • The old "Pipelines" view is removed. Therefore, the navigation item "Pipelines" in the main navigation bar will now lead to PipelineEditor directly.
    • When landing on PipelineEditor, it loads the first pipeline in the project. When the currently open pipeline goes away, for example because the user deletes it, PipelineEditor tries to automatically load the first of the remaining pipelines in the project.
    • In PipelineEditor, added a "Sessions" panel for managing Orchest sessions. Added "Create Pipeline" in FileManager for opening the "Create a new pipeline" dialog. Sessions panel is vertically resizable.
    • Added the pipeline file path under the pipeline name in the HeaderBar to make it more explicit.

    Checklist

    • [x] The PR branch is set up to merge into dev instead of master.
    improvement 
    opened by iannbing 17
  • chore(deps): bump gitpython from 3.1.27 to 3.1.30 in /services/jupyter-server


    Bumps gitpython from 3.1.27 to 3.1.30.

    Release notes

    Sourced from gitpython's releases.

    v3.1.30 - with important security fixes

    See gitpython-developers/GitPython#1515 for details.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies python 
    opened by dependabot[bot] 1
  • Make environment builds more robust w.r.t. `.orchest` being git ignored


    Describe the problem this improvement solves
    Orchest expects the .orchest/environments directory to be versioned (or, at least, to not be .gitignore'd). This is because an environment build will take a snapshot of the project, and said snapshot will exclude files and directories according to the .gitignore file. However, some users would like to not version any content of the .orchest directory.

    Describe the solution you'd like
    The build should work regardless of whether the .orchest directory is in the snapshot. Making the build read the environment properties and setup script prior to the snapshot should be feasible. Note: the PR fixing this should include some changes to docs/source/fundamentals/environments.md to remove the notion that .orchest/environments shouldn't be git ignored (i.e. revert https://github.com/orchest/orchest/commit/02c2fa4caaf2cccd770721265b812caa437e86c0).
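
    To see whether a project is affected, one can ask git whether it would ignore the directory. The sketch below is an illustration only and not part of Orchest; the helper name `is_git_ignored` is hypothetical. It relies on `git check-ignore -q`, which exits with 0 when the path is ignored and 1 when it is not.

    ```shell
    # Hypothetical helper: succeed if git would ignore path $2 in the
    # repository located at directory $1.
    is_git_ignored() {
        git -C "$1" check-ignore -q "$2" 2>/dev/null
    }

    if is_git_ignored . .orchest/environments; then
        echo ".orchest/environments is git ignored"
    else
        echo ".orchest/environments is not ignored (or this is not a git repository)"
    fi
    ```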

    good first issue improvement 
    opened by fruttasecca 0
  • WIP: New home view


    Description

    This removes the /projects view in favor of a new, more flexible "home" view, where things other than Projects can be displayed, such as all Job and Interactive Runs.

    Features

    • [x] Remove the /projects page and fix any inbound links
    • [x] Redirect /projects to /?tab=projects
    • [x] Move projects to the new home page and align components with new design
    • [ ] Update the Project selector menu with the new design
    • [x] Implement interactive job runs under "all runs"
    • [ ] Implement all job runs under "all runs"
      • [ ] Make the back-end support querying of multiple (or all) projects
      • [ ] Implement filtering, pagination, etc... in the new front-end based on the new API.

    Checklist

    • [ ] I have manually tested my changes and I am happy with the result.
    • [ ] The documentation reflects the changes.
    • [ ] The PR branch is set up to merge into dev instead of master.
    improvement 
    opened by mausworks 0
  • Environment view code editor horizontal bar overlap environment name


    Describe the bug
    In the environment view, if the code editor has the horizontal scrolling bar, the bar overlaps the element containing the environment name when scrolling down the page.

    To Reproduce
    Steps to reproduce the behavior:

    • In the environment setup script editor, write a line that's long enough to trigger the presence of the horizontal scroll bar.
    • Scroll down the page.


    bug 
    opened by fruttasecca 0
  • build(deps): bump certifi from 2021.10.8 to 2022.12.7 in /services/session-sidecar


    Bumps certifi from 2021.10.8 to 2022.12.7.

    Commits


    dependencies python 
    opened by dependabot[bot] 1
  • Tab switch URL change is debounced causing jumpy behavior


    Describe the bug
    The URL change on tab switch is debounced, causing jumpy behavior. For example, when you switch between "My projects" and "Examples" on the Projects page, the URL changes only after a delay, so navigating quickly between the tabs causes the view to jump.

    Expected behavior
    When you click on the tabs quickly, the view should "follow your click".

    To Reproduce
    Steps to reproduce the behavior:

    1. Go to the Projects page
    2. Click on the Examples tab
    3. Quickly click on the My projects tab
    4. See that it goes to "My projects" and jumps back to "Examples" shortly after (after the URL change)

    Environment

    • OS (e.g. macOS): Linux
    • Browser (e.g. Chrome): Brave
    • Orchest's version (in the settings page): v2022.11.2
    bug 
    opened by ricklamers 0
Releases: v2023.01.2