Simulation environments for the CrazyFlie quadrotor: Used for Reinforcement Learning and Sim-to-Real Transfer

Last update: Dec 07, 2022

Overview

Phoenix-Drone-Simulation

An OpenAI Gym environment based on PyBullet for learning to control the CrazyFlie quadrotor:

Can be used for Reinforcement Learning (check out the examples!) or Model Predictive Control
We used this repository for sim-to-real transfer experiments (see publication [1] below)
The implemented dynamics model is based on Bitcraze's Crazyflie 2.1 nano-quadrotor

Figures: Circle Task, TakeOff

The following tasks are currently available to fly the little drone:

Hover
Circle
Take-off (implemented but not yet working properly: reward function must be tuned!)
~~Reach~~ (not yet implemented)

Overview of Environments

Environment	Task	Controller	Physics	Observation Frequency	Domain Randomization	Aerodynamic Effects	Motor Dynamics
`DroneHoverSimpleEnv-v0`	Hover	PWM (100 Hz)	Simple	100 Hz	10%	None	Instant force
`DroneHoverBulletEnv-v0`	Hover	PWM (100 Hz)	PyBullet	100 Hz	10%	None	First-order
`DroneCircleSimpleEnv-v0`	Circle	PWM (100 Hz)	Simple	100 Hz	10%	None	Instant force
`DroneCircleBulletEnv-v0`	Circle	PWM (100 Hz)	PyBullet	100 Hz	10%	None	First-order
`DroneTakeOffSimpleEnv-v0`	Take-off	PWM (100 Hz)	Simple	100 Hz	10%	Ground-effect	Instant force
`DroneTakeOffBulletEnv-v0`	Take-off	PWM (100 Hz)	PyBullet	100 Hz	10%	Ground-effect	First-order
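
One way to read the 10% domain randomization column is that nominal physical parameters are perturbed at the start of each episode. The following is a minimal sketch under that assumption, using uniform sampling within ±10% of the nominal value. The parameter names and nominal values are illustrative placeholders and are not taken from the repository:

```python
import random

# Hypothetical nominal parameters (illustrative values only, not from the repository).
NOMINAL = {
    "mass_kg": 0.03,
    "thrust_to_weight": 2.0,
    "motor_time_constant_s": 0.08,
}


def randomize(nominal, fraction=0.10, rng=None):
    """Return a copy of `nominal` with each value scaled by a factor
    drawn uniformly from [1 - fraction, 1 + fraction]."""
    rng = rng or random.Random()
    return {
        name: value * rng.uniform(1.0 - fraction, 1.0 + fraction)
        for name, value in nominal.items()
    }


# A seeded generator makes the perturbation reproducible.
params = randomize(NOMINAL, rng=random.Random(0))
```

Calling `randomize` once per episode would give every rollout slightly different dynamics, which is the usual motivation for domain randomization in sim-to-real transfer.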

Installation and Requirements

Getting the repository ready to run takes only a few steps: clone the repository and install the phoenix-drone-simulation package via pip. Note that everything after a $ is entered in a terminal, while everything after >>> is passed to a Python interpreter. Use the following three commands for installation:

$ git clone https://github.com/SvenGronauer/phoenix-drone-simulation
$ cd phoenix-drone-simulation/
$ pip install -e .

This package follows OpenAI's Gym Interface.

Note: if your default Python is 2.7, replace pip with pip3 and python with python3 in the following commands.

Supported Systems

We tested this package under Ubuntu 20.04 and macOS 11.2 running Python 3.7 and 3.8. Other systems might work as well but have not been tested yet. Note that PyBullet supports Windows only experimentally!

Dependencies

phoenix-drone-simulation depends heavily on two packages:

Gym
PyBullet

Getting Started

After the successful installation of the repository, the drone environments can be instantiated via gym.make:

>>> import gym
>>> import phoenix_drone_simulation
>>> env = gym.make('DroneHoverBulletEnv-v0')

The functional interface follows the API of OpenAI Gym (Brockman et al., 2016), which consists of the following three important functions:

>>> observation = env.reset()
>>> random_action = env.action_space.sample()  # usually the action is determined by a policy
>>> next_observation, reward, done, info = env.step(random_action)
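
The three calls above combine into a standard episode loop. The sketch below assumes the four-tuple step signature shown above (newer Gym releases changed the return values of reset and step). So that it runs without PyBullet, it uses a hypothetical `StandInEnv` class in place of a real environment; `run_episode` and `StandInEnv` are illustrative names, not part of this package:

```python
import random


class StandInEnv:
    """Hypothetical stand-in exposing the classic Gym API:
    reset() -> observation, step(action) -> (observation, reward, done, info)."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        observation = [float(self.t)]
        reward = -abs(action)            # toy reward: penalize large actions
        done = self.t >= self.horizon    # episode ends after `horizon` steps
        return observation, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return (episode return, episode length)."""
    observation = env.reset()
    done, episode_return, length = False, 0.0, 0
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        episode_return += reward
        length += 1
    return episode_return, length


episode_return, length = run_episode(
    StandInEnv(), policy=lambda obs: random.uniform(-1.0, 1.0)
)
```

With the package installed, the same `run_episode` function should work with `gym.make('DroneHoverBulletEnv-v0')` and a policy such as `lambda obs: env.action_space.sample()`.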

Minimal code for visualizing a uniformly random policy in a GUI:

import gym
import time
import phoenix_drone_simulation

env = gym.make('DroneHoverBulletEnv-v0')

while True:
    done = False
    env.render()  # make GUI of PyBullet appear
    x = env.reset()
    while not done:
        random_action = env.action_space.sample()
        x, reward, done, info = env.step(random_action)
        time.sleep(0.05)

Note that the GUI appears only if the render function is called before the reset function.

Training Policies

To train an agent with the PPO algorithm, call:

$ python -m phoenix_drone_simulation.train --alg ppo --env DroneHoverBulletEnv-v0

This works with basically every environment that is compatible with the OpenAI Gym interface:

$ python -m phoenix_drone_simulation.train --alg ppo --env CartPole-v0

After an RL model has been trained and its checkpoint has been saved on your disk, you can visualize the checkpoint:

$ python -m phoenix_drone_simulation.play --ckpt PATH_TO_CKPT

where PATH_TO_CKPT is the path to the checkpoint, e.g. /var/tmp/sven/DroneHoverSimpleEnv-v0/trpo/2021-11-16__16-08-09/seed_51544

Examples

`generate_trajectories.py`

See the generate_trajectories.py script, which shows how to generate data batches of size N. Use generate_trajectories.py --play to visualize the policy in the PyBullet simulator.

`train_drone_hover.py`

Use Reinforcement Learning (RL) to teach the drone to hold its position at (0, 0, 1). This canonical example relies on the RL-safety-Algorithms repository, a framework for parallel RL algorithm training.
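
To make the hover objective concrete, here is a minimal sketch of a distance-based reward around the set point (0, 0, 1). It is an assumption for illustration only: the reward actually used by the environments may include further terms (for example penalties on velocity or action magnitude), and `hover_reward` is a hypothetical name:

```python
import math

TARGET = (0.0, 0.0, 1.0)  # hover set point named in the text


def hover_reward(position, target=TARGET):
    """Negative Euclidean distance to the target; 0 is the best possible value."""
    squared = sum((p - t) ** 2 for p, t in zip(position, target))
    return -math.sqrt(squared)
```

For instance, a drone resting on the ground at the origin is one meter below the target and receives a reward of -1.0, while a drone exactly at the set point receives 0.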

`transfer_learning_drone_hover.py`

Shows a transfer learning approach. We first train a PPO model in the source domain DroneHoverSimpleEnv-v0 and then re-train the model on a more complex target domain DroneHoverBulletEnv-v0. Note that the DroneHoverBulletEnv-v0 environment builds upon an accurate motor model of the CrazyFlie drone and includes a motor dead time as well as a motor lag.
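
The difference between the two domains can be illustrated with a generic motor model: a pure delay (dead time) followed by a first-order lag. This is a sketch of the general technique, not the repository's implementation. The class name `LaggedMotor` and the time constant, dead time, and step size below are placeholder assumptions:

```python
import math
from collections import deque


class LaggedMotor:
    """First-order lag preceded by a pure delay (dead time)."""

    def __init__(self, time_constant, dead_time, dt):
        # Exact discretization of a first-order lag for a fixed step size dt.
        self.alpha = 1.0 - math.exp(-dt / time_constant)
        # Dead time is modeled as a FIFO buffer of past commands.
        delay_steps = int(round(dead_time / dt))
        self.buffer = deque([0.0] * delay_steps)
        self.output = 0.0

    def step(self, command):
        """Feed one command and return the current motor output."""
        self.buffer.append(command)
        delayed = self.buffer.popleft()
        self.output += self.alpha * (delayed - self.output)
        return self.output


# Placeholder values: 80 ms time constant, 20 ms dead time, 100 Hz stepping.
motor = LaggedMotor(time_constant=0.08, dead_time=0.02, dt=0.01)
trace = [motor.step(1.0) for _ in range(100)]  # response to a unit step command
```

In this sketch the output stays at zero for the first two steps (the dead time) and then rises smoothly toward the commanded value. An "instant force" model, as in the Simple environments, corresponds to returning the command directly.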

Tools

convert.py @ Sven Gronauer

A function used by Sven to extract the policy networks from his trained actor-critic module and convert the model to a JSON file format.
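
As an illustration of what such a conversion can look like, the sketch below serializes the layers of a small multilayer perceptron to JSON. The schema (keys `activation`, `layers`, `weights`, `biases`) and the function name `mlp_to_json` are hypothetical and do not describe the actual output format of convert.py:

```python
import json


def mlp_to_json(layers, activation="tanh"):
    """Serialize a list of (weights, biases) pairs into a JSON string.

    `weights` is a nested list (one row per output unit) and `biases`
    is a flat list, so no tensor library is needed to read the file back.
    """
    payload = {
        "activation": activation,
        "layers": [{"weights": w, "biases": b} for w, b in layers],
    }
    return json.dumps(payload)


# Toy two-layer network with made-up parameters.
layers = [
    ([[0.1, -0.2], [0.3, 0.4]], [0.0, 0.1]),
    ([[0.5, -0.5]], [0.2]),
]
text = mlp_to_json(layers)
restored = json.loads(text)
```

Storing plain nested lists keeps the file independent of the training framework, which is convenient when the policy has to be loaded on an embedded target.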

Version History and Changes

Version	Changes	Date
v1.0	Public Release: Simulation parameters as proposed in Publication [1]	19.04.2022
v0.2	Add: accurate motor dynamic model and first real-world transfer insights	21.09.2021
v0.1	Re-factor: repository (only Hover task implemented so far)	18.05.2021
v0.0	Fork: from Gym-PyBullet-Drones Repo	01.12.2020

Publications

[1] Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors

Sven Gronauer, Matthias Kissel, Luca Sacchetto, Mathias Korte, Klaus Diepold

https://arxiv.org/abs/2201.01369

Lastly, we want to thank:

Jacopo Panerati and his team for contributing the Gym-PyBullet-Drones Repo, which was the starting point for this repository.
Artem Molchanov and collaborators for their hints about the CrazyFlie Firmware and the motor dynamics in their paper "Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors"
Jakob Foerster for his Bachelor Thesis and his insights about the CrazyFlie's parameter values

This repository has been developed at the

Chair of Data Processing
TUM School of Computation, Information and Technology
Technical University of Munich
