Event queue (Equeue) dialect is an MLIR Dialect that models concurrent devices in terms of control and structure.

Overview

Event Queue Dialect

Event queue (Equeue) dialect is an MLIR Dialect that models concurrent devices in terms of control and structure.

Motivation

The main motivation of the event queue dialect is to efficiently estimate performance of programs running on heterogenous accelerators. The dialect is designed to bridge the gap between low-level hardware specific dialects and high-level dialects with little hardware specific information, thus facilitating custom lowering among different design choices. In particular, the EventQueue dialect supports modeling memory size constraints, bandwidth constraints, and processing time across a large number of heterogenous processors with distributed event-based control.

By and large, event queue dialect is design to estimate performance of concurrent devices. It supports:

  • Arbitrary hardware hierarchy and each hardware with its own properties.

  • Modeling data movement and buffer allocation that is critical to energy and efficiency estimation.

  • Model concurrency between heterogenous devices.

Check further documentation to see how the goals are achieved.

EQueue Dialect in MLIR Lowering Pipeline

lowering_pipeline

Event queue dialect is designed to do performance analysis.

Because there is a gap between high level dialect that has no structure information, and low level dialect that is too detail to analyze, event queue dialect bridges them.

The input for the event queue dialect is high level control dialect without structure and the output will be dialect describing detailed structure information.

In the lowering pipeline, equeue dialect is at the same level as gpu dialect. The difference is that existing gpu dialect assumes a synchronous gpu model and try to communicate with gpu.barrier among concurrent gpus, while equeue dialect models a more general design, where it allows any kinds of structure, thus allowing maximum flexibility. To describe the complexity of any possible structure in a flexible device like FPGA, equeue dialect develops a general semantics for asynchronous communication between concurrent devices.

How to Use

Dependency

The dependency of this project is MLIR. Because MLIR is project that frequently being updated. When I started the EQueue project, The latest stable version was 12-init. One needs checkout to the right version.

git clone https://github.com/llvm/llvm-project.git
git fetch --all --tags
git checkout tags/llvmorg-12-init -b 
   

   

and then follow MLIR quick start to build executable.

Quick Start

After git clone and cd the repo,

mkdir build
cp *.sh build/
cd build
#change LLVM_EXTERNAL_LIT and MLIR_DIR in run.sh to your local directory
sh config; sh run.sh
./bin/equeue-opt ../test/Equeue/[path-to-input-file.mlir]

Debug Outputs

If one want to turn on debug outputs with -debug or debug-only when there are multiple debugging options

./bin/equeue-opt ../test/Equeue/[path-to-input-file.mlir] -debug
# when there are multiple debugging options
./bin/equeue-opt ../test/Equeue/[path-to-input-file.mlir] -debug-only=command_processor
# to redirect output to file
./bin/equeue-opt ../test/Equeue/[path-to-input-file.mlir] -debug > & report

Visualization

By default equeue-opt will generate a Trace Event Format JSON file to test/Equeue/out.json . You can specify the output file name with -json

./bin/equeue-opt ../test/Equeue/[path-to-input-file.mlir] -json [path-to-json-file.json]

The output JSON file can be viewed in chrome://tracing/

Below is the visualization of running test/EQueue/gpu.mlir

visualization

Examples

You may want to check on Examples on the convolution and the finite impulse response. Detailed explanation can be found in the example directory

Paper and Citation

The paper is accepted to HPCA 2022. We upload a preprint to Arxiv.

Contact

I am Zhijing at Cornell University. This project is originally my Xilinx internship project. I extend after the internship and now it is accepted by HPCA 2022. I will put the reference later. If getting to any trouble, you can contact me at [email protected]

Owner
Cornell Capra
Computer architecture & programming abstractions at Cornell University.
Cornell Capra
Qlib is an AI-oriented quantitative investment platform

Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment.

Microsoft 10.1k Dec 30, 2022
Saliency - Framework-agnostic implementation for state-of-the-art saliency methods (XRAI, BlurIG, SmoothGrad, and more).

Saliency Methods 🔴 Now framework-agnostic! (Example core notebook) 🔴 🔗 For further explanation of the methods and more examples of the resulting ma

PAIR code 849 Dec 27, 2022
Semi-supervised Video Deraining with Dynamical Rain Generator (CVPR, 2021, Pytorch)

S2VD Semi-supervised Video Deraining with Dynamical Rain Generator (CVPR, 2021) Requirements and Dependencies Ubuntu 16.04, cuda 10.0 Python 3.6.10, P

Zongsheng Yue 53 Nov 23, 2022
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

DALL-E in Pytorch Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch. It will also contain CLIP for ranking the ge

Phil Wang 5k Jan 04, 2023
Code for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling Using BERT Adapter"

Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter Code and checkpoints for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling

274 Dec 06, 2022
FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI

FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI 声明: 本项目仅限于学习交流,不可用于非法用途,包括但不限于:用于游戏外挂等,使用本项目产生的任何后果与本人无关! 简介 本项目基于yolov5,实现了一款FPS类游戏(CF、CSGO等)的自瞄AI,本项目旨在使用现

Fabian 246 Dec 28, 2022
Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network.

face-mask-detection Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network. It contains 3 scr

amirsalar 13 Jan 18, 2022
Region-aware Contrastive Learning for Semantic Segmentation, ICCV 2021

Region-aware Contrastive Learning for Semantic Segmentation, ICCV 2021 Abstract Recent works have made great success in semantic segmentation by explo

Hanzhe Hu 30 Dec 29, 2022
SOLOv2 on onnx & tensorRT

SOLOv2.tensorRT: NOTE: code based on WXinlong/SOLO add support to TensorRT inference onnxruntime tensorRT full_dims and dynamic shape postprocess with

47 Nov 26, 2022
This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

19 Oct 27, 2022
Program your own vulkan.gpuinfo.org query in Python. Used to determine baseline hardware for WebGPU.

query-gpuinfo-data License This software is not presently released under a license. The data in data/ is obtained under CC BY 4.0 as specified there.

Kai Ninomiya 5 Jul 18, 2022
HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep.

HODEmu HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep. and emulates satellite abundance as a function of co

Antonio Ragagnin 1 Oct 13, 2021
Code for the Image similarity challenge.

ISC 2021 This repository contains code for the Image Similarity Challenge 2021. Getting started The docs subdirectory has step-by-step instructions on

Facebook Research 173 Dec 12, 2022
Python package provinding tools for artistic interactive applications using AI

Documentation redrawing Python package provinding tools for artistic interactive applications using AI Created by ReDrawing Campinas team for the Open

ReDrawing Campinas 1 Sep 30, 2021
A crossplatform menu bar application using mpv as DLNA Media Renderer.

Macast Chinese README A menu bar application using mpv as DLNA Media Renderer. Install MacOS || Windows || Debian Download link: Macast release latest

4.4k Jan 01, 2023
PyTorch code for ICPR 2020 paper Future Urban Scene Generation Through Vehicle Synthesis

Future urban scene generation through vehicle synthesis This repository contains Pytorch code for the ICPR2020 paper "Future Urban Scene Generation Th

Alessandro Simoni 4 Oct 11, 2021
CharacterGAN: Few-Shot Keypoint Character Animation and Reposing

CharacterGAN Implementation of the paper "CharacterGAN: Few-Shot Keypoint Character Animation and Reposing" by Tobias Hinz, Matthew Fisher, Oliver Wan

Tobias Hinz 181 Dec 27, 2022
Ejemplo Algoritmo Viterbi - Example of a Viterbi algorithm applied to a hidden Markov model on DNA sequence

Ejemplo Algoritmo Viterbi Ejemplo de un algoritmo Viterbi aplicado a modelo ocul

Mateo Velásquez Molina 1 Jan 10, 2022
SigOpt wrappers for scikit-learn methods

SigOpt + scikit-learn Interfacing This package implements useful interfaces and wrappers for using SigOpt and scikit-learn together Getting Started In

SigOpt 73 Sep 30, 2022
classification task on dataset-CIFAR10,by using Tensorflow/keras

CIFAR10-Tensorflow classification task on dataset-CIFAR10,by using Tensorflow/keras 在这一个库中,我使用Tensorflow与keras框架搭建了几个卷积神经网络模型,针对CIFAR10数据集进行了训练与测试。分别使

3 Oct 17, 2021