Model-based reinforcement learning in TensorFlow

Overview

Bellman



What does Bellman do?

Bellman is a package for model-based reinforcement learning (MBRL) in Python, using TensorFlow and building on top of the model-free reinforcement learning package TensorFlow Agents.

Bellman provides a framework for flexible composition of model-based reinforcement learning algorithms. It offers two major classes of algorithms: decision-time planning and background planning. Within each class, any kind of supervised learning method can be used to learn a given component of the environment. Bellman was designed with modularity in mind: important components can be flexibly combined, such as the type of decision-time planning method (e.g. a cross-entropy method or a random shooting method) and the type of model for state transitions (e.g. a probabilistic neural network or an ensemble of neural networks). Bellman also provides implementations of several popular state-of-the-art MBRL algorithms, such as PETS, MBPO and METRPO. The online documentation (latest) contains more details.
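To make the idea of decision-time planning concrete, here is a minimal sketch of random shooting on a toy one-dimensional problem. It uses only the Python standard library and does not use Bellman's API. All names (toy_model, rollout_return, sample_action_sequences, random_shooting_action) are hypothetical, and the hand-written toy_model stands in for a learned transition model.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a learned model: maps (state, action) to
# (next_state, reward). The goal of the toy task is to reach x = 5.
Model = Callable[[float, float], Tuple[float, float]]


def toy_model(state: float, action: float) -> Tuple[float, float]:
    next_state = state + action
    reward = -abs(next_state - 5.0)
    return next_state, reward


def rollout_return(state: float, actions: Sequence[float], model: Model) -> float:
    """Sum of rewards obtained by applying `actions` in order through `model`."""
    total = 0.0
    for action in actions:
        state, reward = model(state, action)
        total += reward
    return total


def sample_action_sequences(
    num_sequences: int, horizon: int, rng: random.Random
) -> List[List[float]]:
    """Draw candidate action sequences uniformly from [-1, 1]."""
    return [
        [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        for _ in range(num_sequences)
    ]


def random_shooting_action(
    state: float, candidates: Sequence[Sequence[float]], model: Model
) -> float:
    """Return the first action of the candidate sequence with the highest return."""
    best = max(candidates, key=lambda actions: rollout_return(state, actions, model))
    return best[0]


rng = random.Random(0)
candidates = sample_action_sequences(num_sequences=200, horizon=5, rng=rng)
action = random_shooting_action(0.0, candidates, toy_model)
print("first planned action:", action)
```

In a receding-horizon setup, only this first action would be executed and planning would be repeated from the next state. A cross-entropy method differs mainly in how the candidates are sampled: it refits the sampling distribution to the best candidates over several iterations.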

Bellman requires Python 3.7 onwards and uses TensorFlow 2.4+ for running computations, which allows fast execution on GPUs.
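The version requirements above can be checked from Python before installing. The following is a small sketch, not part of Bellman: the helper names parse_major_minor and meets_minimum are hypothetical, and the minimum versions are simply those stated above.

```python
import sys
from typing import Tuple

MIN_PYTHON = (3, 7)       # minimum Python version stated above
MIN_TENSORFLOW = (2, 4)   # minimum TensorFlow version stated above


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.5.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version: str, minimum: Tuple[int, int]) -> bool:
    """True if `version` is at least `minimum` in (major, minor) terms."""
    return parse_major_minor(version) >= minimum


print("Python OK:", sys.version_info[:2] >= MIN_PYTHON)

try:
    import tensorflow as tf
except ImportError:
    print("TensorFlow is not installed")
else:
    print("TensorFlow OK:", meets_minimum(tf.__version__, MIN_TENSORFLOW))
```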

Maintainers

Bellman was originally created by (in alphabetical order) Vincent Adam, Jordi Grau-Moya, Felix Leibfried, John A. McLeod, Hrvoje Stojic, and Peter Vrancx, at Secondmind Labs.

It is now actively maintained by (in alphabetical order) Felix Leibfried, John A. McLeod, Hrvoje Stojic, and Peter Vrancx.

Bellman is an open source project. If you have relevant skills and are interested in contributing then please do contact us (see "The Bellman Community" section below).

We are very grateful to our Secondmind Labs colleagues, maintainers of GPflow and Trieste in particular, for their help with creating contributing guidelines, instructions for users and open-sourcing in general.

Install Bellman

For users

To install the latest (stable) release from PyPI, use pip:

$ pip install bellman

To install the toolbox from the latest source on GitHub, check out the develop branch of the Bellman GitHub repository and, in the repository root, run

$ pip install -e .

This will install the toolbox in editable mode.

For contributors

If you wish to contribute, please use Poetry to manage dependencies in a local virtual environment. The Poetry configuration file specifies all the development dependencies (testing, linting, typing, docs, etc.) and makes it much easier to contribute. To install Poetry, follow the instructions in the Poetry documentation.

To install this project in editable mode, run the command below from the root directory of the bellman repository.

poetry install

This command creates a virtual environment for this project in a hidden .venv directory under the root directory. You can easily activate it with

poetry shell

You must also run the poetry install command to install updated dependencies when the pyproject.toml file is updated, for example after a git pull.

Installing MuJoCo (Optional)

Many benchmarks in continuous control in MBRL use the MuJoCo physics engine. Some of the TF-Agents examples have been tested against MuJoCo environments as well. MuJoCo is proprietary software that requires a license (see MuJoCo website). As a result, installing it is optional, but because of its importance to the research community it is highly recommended. Don't worry if you decide not to install MuJoCo, though: all our examples and notebooks rely on standard environments available in OpenAI Gym.

We interface with MuJoCo through the Python library mujoco-py via OpenAI Gym (see the mujoco-py GitHub page). Check the installation instructions there for how to install MuJoCo. Note that you should install MuJoCo 1.5, since OpenAI Gym supports that version. After that, you can install the mujoco-py library with an additional Poetry command:

poetry install -E mujoco-py

If this command fails, please check the troubleshooting sections on the mujoco-py GitHub page; you might need to satisfy other mujoco-py dependencies (e.g. Linux system libraries) or set some environment variables.

The Bellman Community

Getting help

Bugs, feature requests, pain points, annoying design quirks, etc: Please use GitHub issues to flag up bugs/issues/pain points, suggest new features, and discuss anything else related to the use of Bellman that in some sense involves changing the Bellman code itself. We positively welcome comments or concerns about usability, and suggestions for changes at any level of design. We aim to respond to issues promptly, but if you believe we may have forgotten about an issue, please feel free to add another comment to remind us.

"How-to-use" questions: Please use Stack Overflow (Bellman tag) to ask questions that relate to "how to use Bellman", i.e. questions of understanding rather than issues that require changing Bellman code. (If you are unsure where to ask, you are always welcome to open a GitHub issue; we may then ask you to move your question to Stack Overflow.)

Slack workspace

We have a public Bellman Slack workspace. Please use this invite link if you'd like to join, whether to ask short informal questions or to be involved in the discussion and future development of Bellman.

Contributing

All constructive input is very much welcome. For detailed information, see the guidelines for contributors.

Citing Bellman

To cite Bellman, please reference our arXiv paper, where we review the framework and describe the design. A sample BibTeX entry is given below:

@article{bellman2021,
    author = {McLeod, John and Stojic, Hrvoje and Adam, Vincent and Kim, Dongho and Grau-Moya, Jordi and Vrancx, Peter and Leibfried, Felix},
    title = {Bellman: A Toolbox for Model-based Reinforcement Learning in TensorFlow},
    year = {2021},
    journal = {arXiv:2103.14407},
    url = {https://arxiv.org/abs/2103.14407}
}

License

Apache License 2.0

Comments
  • Dongho/tensorflow 2.5

    PR type: bugfix / enhancement / new feature / doc improvement

    Related issue(s)/PRs:

    Summary

    Proposed changes

    • Quick fix setup.py to version up tensorflow and other related packages

    What alternatives have you considered?

    Minimal working example

    PR checklist

    • [ ] New features: code is well-documented
      • [ ] detailed docstrings (API documentation)
      • [ ] notebook examples (usage demonstration)
    • [ ] The bug case / new feature is covered by unit tests
    • [ ] Code has type annotations
    • [ ] I ran the black+isort formatter
    • [ ] I locally tested that the tests pass

    Release notes

    Fully backwards compatible: yes

    If not, why is it worth breaking backwards compatibility:

    Commit message (for release notes):

    • Quick fix for setup.py
    opened by dongho-kim 1
  • setting things up for pypi

    2 things I would need some help with:

    • pyproject.toml - [build-system] currently points to poetry, is that fine for building a package for pip?
    • I'm not convinced we need all the libraries listed in install_requires in setup.py - @johnamcleod you were taking care of dependencies before, can you give a hand here please?

    I have set up a workflow for pushing things to PyPI automatically, but I am not sure how to test it (perhaps I could modify it to use Test PyPI). I will first push things to Test PyPI to verify that things work as intended.

    enhancement 
    opened by hstojic 1
  • Dongho/tensorflow 2.5

    PR type: enhancement

    Related issue(s)/PRs:

    Summary

    Proposed changes

    • Support tensorflow 2.5, tf-agents 0.8.0 and tensorflow-probability 0.12.2
    • Fixes for test errors that may occur on Mac (including Apple Silicon) environments

    What alternatives have you considered?

    Minimal working example

    NA as no new features added

    PR checklist

    • [ ] New features: code is well-documented
      • [ ] detailed docstrings (API documentation)
      • [ ] notebook examples (usage demonstration)
    • [ ] The bug case / new feature is covered by unit tests
    • [ ] Code has type annotations
    • [ ] I ran the black+isort formatter
    • [X] I locally tested that the tests pass

    Release notes

    Fully backwards compatible: no

    If not, why is it worth breaking backwards compatibility:

    Changes to TFAgent.__init__ introduced in later tf-agents releases seem to break backwards compatibility, causing errors when we pass TRAIN_ARGSPEC. However, breaking compatibility is worthwhile because of the security vulnerability in tensorflow 2.4.0.

    Commit message (for release notes):

    • Support tensorflow 2.5, tf-agents 0.8.0 and tensorflow-probability 0.12.2
    enhancement good first issue 
    opened by dongho-kim 0
  • Add MBPO train_eval function

    PR type: enhancement

    Related issue(s)/PRs: fix #24

    Summary

    Proposed changes

    The MBPO agent does not have a train_eval function in the benchmark package. This PR fixes that.

    What alternatives have you considered?

    Minimal working example

    Look at the run_mbpo example.

    Release notes

    Fully backwards compatible: yes

    If not, why is it worth breaking backwards compatibility:

    Commit message (for release notes):

    • Add a train_eval function for the MBPO agent.
    enhancement 
    opened by johnamcleod 0
  • John/fix none loss in harness

    PR type: bugfix

    Related issue(s)/PRs: N/A

    Summary

    Proposed changes

    There is an integration issue between the TFTrainingScheduler and the ExperimentHarness: if the call to the agent trainer's train_step method returns None for the loss, the harness throws an exception when trying to write the logs. This situation can occur when too few environment steps have passed to train a model-free agent component of a model-based agent.

    This PR addresses the issue by intercepting the None loss from the agent trainer in the scheduler and not adding it to the training_info dictionary.
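The idea of intercepting a None loss can be sketched in a few lines. This is an illustration only, not the code from the PR: the function name collect_training_info and the dictionary layout are hypothetical.

```python
from typing import Dict, Optional


def collect_training_info(losses: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Keep only the components whose trainer actually returned a loss.

    `losses` maps a (hypothetical) component name to the loss returned by its
    train step, or None if that component was not trained on this step.
    """
    return {name: loss for name, loss in losses.items() if loss is not None}


# The model-free agent has not been trained yet, so its loss is None and
# it is left out of the dictionary that would be passed on for logging.
training_info = collect_training_info({"transition_model": 0.42, "agent": None})
print(training_info)
```

Filtering at the point where the dictionary is built means the logging code never has to handle a missing loss.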

    Minimal working example

    The run_mbpo example hits this problem on the first environment time step.

    PR checklist

    • [ ] New features: code is well-documented
      • [ ] detailed docstrings (API documentation)
      • [ ] notebook examples (usage demonstration)
    • [x] The bug case / new feature is covered by unit tests
    • [x] Code has type annotations
    • [x] I ran the black+isort formatter
    • [x] I locally tested that the tests pass

    Release notes

    Fully backwards compatible: yes

    If not, why is it worth breaking backwards compatibility:

    Commit message (for release notes):

    • ...
    bug 
    opened by johnamcleod 0
  • upload-pypi.yaml fails on `main`

    The GH action fails on the "Verify git tag vs. VERSION" step: the $GITHUB_REF environment variable seems to come with refs/tags/ prepended, which the code does not allow for. Here is a solution: https://github.community/t/how-to-get-just-the-tag-name/16241
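One way to handle the prefix is to strip it before comparing. The sketch below is not the code from the workflow, and it may differ from the linked solution. The helper name tag_from_ref is hypothetical; GITHUB_REF is the environment variable named above.

```python
import os

TAG_PREFIX = "refs/tags/"


def tag_from_ref(github_ref: str) -> str:
    """Return the bare tag name, dropping a leading 'refs/tags/' if present."""
    if github_ref.startswith(TAG_PREFIX):
        return github_ref[len(TAG_PREFIX):]
    return github_ref


# In a workflow run triggered by a tag, GITHUB_REF looks like 'refs/tags/<tag>'.
# Outside GitHub Actions the variable is usually unset, hence the default.
ref = os.environ.get("GITHUB_REF", "")
print("tag:", tag_from_ref(ref))
```

The bare tag name can then be compared with the contents of the VERSION file.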

    bug 
    opened by hstojic 0
  • Release/0.1.0

    Updated develop with a few small corrections for merging into main as a (pre-)release 0.1.0. It seems we can then create a release out of that version of main on GH with a description of the changelog. That should create a tag.

    release 
    opened by hstojic 0
  • Hstojic/trigger docs

    Modified a GitHub action to trigger documentation generation in the website repo instead. The action sends an event that an action in the website repo listens for. Tested and it seems to work; check https://belman.dev/docs

    see:

    • https://docs.github.com/en/actions/reference/events-that-trigger-workflows#external-events-repository_dispatch
    • https://docs.github.com/en/rest/reference/repos#create-a-repository-dispatch-event
    • https://docs.github.com/en/developers/webhooks-and-events/webhook-events-and-payloads#repository_dispatch
    documentation enhancement 
    opened by hstojic 0
  • Felix/initial commit

    PR type: bugfix / enhancement / new feature / doc improvement

    Related issue(s)/PRs:

    Summary

    Proposed changes

    • ...
    • ...
    • ...

    What alternatives have you considered?

    Minimal working example

    # Put your example code in here
    

    PR checklist

    • [ ] New features: code is well-documented
      • [ ] detailed docstrings (API documentation)
      • [ ] notebook examples (usage demonstration)
    • [ ] The bug case / new feature is covered by unit tests
    • [ ] Code has type annotations
    • [ ] I ran the black+isort formatter
    • [ ] I locally tested that the tests pass

    Release notes

    Fully backwards compatible: yes / no

    If not, why is it worth breaking backwards compatibility:

    Commit message (for release notes):

    • ...
    opened by fleibfried 0
  • poetry task check_requirements

    Feature request

    Contrary to the description in CONTRIBUTING.md, it does not seem possible to run poetry run task check_requirements, as the task does not appear to be defined anywhere. It would be great to add this feature back.

    Motivation

    Is your feature request related to a problem?

    It is unclear how to automatically update setup.py when we update poetry.

    Proposal

    Describe the solution you would like

    What alternatives have you considered?

    Are you willing to open a pull request? (We really appreciate contributions!)

    Additional context

    enhancement 
    opened by dongho-kim 0
Releases(v0.1.0)
  • v0.1.0(Apr 7, 2021)

    First release, 0.1.0

    (well, a pre-release actually :)

    What is Bellman?

    Bellman is a package for model-based reinforcement learning (MBRL) in Python, using TensorFlow 2.4+ and building on top of the model-free reinforcement learning package TensorFlow Agents.

    Main features

    • A framework for flexible composition of model-based reinforcement learning algorithms.
    • It offers modular components for composing two major classes of algorithms:
      1. decision time planning
      2. background planning
    • Keras neural networks for modeling transition dynamics
    • Rewards, termination and initial state distributions are assumed to be known for now
    • Implementations of several state-of-the-art model-based algorithms (PETS, MBPO and METRPO) and one model-free algorithm (TRPO)