Implementation of paper "DeepTag: A General Framework for Fiducial Marker Design and Detection"

Last update: Dec 12, 2022

Overview

Project page: https://herohuyongtao.github.io/research/publications/deep-tag/.

DeepTag is a general framework for fiducial marker design and detection, which supports existing and newly-designed marker families. DeepTag is a two-stage marker detection pipeline:

Stage-1: detect ROIs of potential markers;
Stage-2: detect keypoints and digital symbols inside each ROI, then determine 6-DoF pose and marker ID.

How to run

For image input:

python test_deeptag.py --config config_image.json

For video input:

python test_deeptag.py --config config_video.json

The configuration file is in JSON format; modify it to fit your needs. Example configuration files for image and video input are provided (config_image.json and config_video.json).

Detailed explanation of the configuration file:

is_video: {0, 1} for image/video respectively.
filepath: path of input image/video (use 0 for webcam input).
family: marker family; currently supported values are {apriltag, aruco, artoolkitplus, runetag, topotag, apriltagxo}.
hamming_dist: Hamming distance used when checking against the marker library; 4 normally works well.
codebook: path of the codebook; if empty, the default path codebook/FAMILY_codebook.txt is used. For families with multiple codebooks, such as AprilTag and ArUco, the default codebooks are AprilTag 36h11 and ArUco 36h12 respectively.
cameraMatrix: camera intrinsic matrix, [fx, 0, cx, 0, fy, cy, 0, 0, 1].
distCoeffs: camera distortion coefficients (both radial and tangential), [k1, k2, p1, p2, k3, k4, k5, k6].
marker_size: physical size of the marker.
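
As a rough sketch of how the fields above fit together, the following Python snippet builds a configuration and writes it out as JSON. Every value here (input path, intrinsics, distortion coefficients, marker size) is a placeholder and not taken from the repository. The helper names write_config, default_codebook_path and as_3x3 are hypothetical and only illustrate the conventions described in the list. It is also an assumption that FAMILY in the default codebook path is replaced by the family value exactly as written in the config.

```python
import json


def default_codebook_path(family):
    """Default codebook location as described above: codebook/FAMILY_codebook.txt.

    Assumption: FAMILY is the family value exactly as it appears in the config.
    """
    return f"codebook/{family}_codebook.txt"


def as_3x3(flat):
    """Split the flat, row-major 9-element cameraMatrix list into three rows."""
    if len(flat) != 9:
        raise ValueError("cameraMatrix needs exactly 9 values")
    return [flat[0:3], flat[3:6], flat[6:9]]


def write_config(config, path):
    """Write the configuration dictionary to a JSON file."""
    with open(path, "w") as f:
        json.dump(config, f, indent=4)


# Placeholder intrinsics; replace with values from your own camera calibration.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0

config = {
    "is_video": 0,                      # 0 = image, 1 = video
    "filepath": "my_image.jpg",         # hypothetical input path
    "family": "apriltag",
    "hamming_dist": 4,
    "codebook": "",                     # empty: fall back to the default path
    "cameraMatrix": [fx, 0, cx, 0, fy, cy, 0, 0, 1],
    "distCoeffs": [0, 0, 0, 0, 0, 0, 0, 0],  # k1, k2, p1, p2, k3, k4, k5, k6
    "marker_size": 3.0,                 # placeholder; unit is your choice
}
```

Calling write_config(config, "my_config.json") would then produce a file that can be passed on the command line in the same way as the provided examples, i.e. python test_deeptag.py --config my_config.json. The shipped example files may contain keys or value formats not covered in this sketch, so treat them as the reference.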

Besides existing markers such as AprilTag, ArUco, ARToolkitPlus, TopoTag and RuneTag, DeepTag also supports newly designed markers such as AprilTag-XO, AprilTag-XA and RuneTag+ (provided in folders images_tag). In the config, set family to apriltagxo for AprilTag-XO and AprilTag-XA, and to runetag for RuneTag+.
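
The family setting for the newly designed markers can be summarised as a small lookup. This is only an illustrative sketch: the mapping itself comes from the paragraph above, while the names FAMILY_FOR_DESIGN and family_for are hypothetical and not part of the repository.

```python
# Marker design name -> value to put in the "family" field of the config.
# The mapping follows the text above; the names are illustrative only.
FAMILY_FOR_DESIGN = {
    "AprilTag-XO": "apriltagxo",
    "AprilTag-XA": "apriltagxo",
    "RuneTag+": "runetag",
}


def family_for(design):
    """Return the config family value for one of the newly designed markers."""
    try:
        return FAMILY_FOR_DESIGN[design]
    except KeyError:
        raise ValueError(f"no family value known for {design!r}") from None
```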

Terms of use

The source code is provided for research purposes only. Any commercial use is prohibited. When using the code in your research work, please cite the following paper:

"DeepTag: A General Framework for Fiducial Marker Design and Detection."
Zhuming Zhang, Yongtao Hu, Guoxing Yu, and Jingwen Dai
arXiv:2105.13731 (2021).

@article{zhang2021deeptag,
  title={{DeepTag: A General Framework for Fiducial Marker Design and Detection}},
  author={Zhang, Zhuming and Hu, Yongtao and Yu, Guoxing and Dai, Jingwen},
  year={2021},
  eprint={2105.13731},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Contact

If you find a bug or have a question about the code, please report it on the Issues page.

Owner

Yongtao Hu
