A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

Last update: May 08, 2022

Overview

MPT

A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities.

Implementation for our AAAI 2022 paper: Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking.

Our paper and code will be released soon.

Owner

yidiLi

Struggling student at Peking University

PyTorch library for fast transformer implementations

Transformers are very successful models that achieve state-of-the-art performance in many natural language tasks

1.3k Dec 30, 2022

Official implementation of NLOS-OT: Passive Non-Line-of-Sight Imaging Using Optimal Transport (IEEE TIP, accepted)

NLOS-OT Official implementation of NLOS-OT: Passive Non-Line-of-Sight Imaging Using Optimal Transport (IEEE TIP, accepted) Description In this reposit

16 Dec 16, 2022

GeDML is an easy-to-use generalized deep metric learning library

32 Dec 05, 2022

SberSwap: video swap based on deep learning

431 Jan 03, 2023

Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs

Continuous Query Decomposition This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neura

71 Dec 29, 2022

Unofficial PyTorch implementation of MobileViT based on paper "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer".

MobileViT RegNet Unofficial PyTorch implementation of MobileViT based on paper MOBILEVIT: LIGHT-WEIGHT, GENERAL-PURPOSE, AND MOBILE-FRIENDLY VISION TR

91 Dec 02, 2022

TuckER: Tensor Factorization for Knowledge Graph Completion

TuckER: Tensor Factorization for Knowledge Graph Completion This codebase contains PyTorch implementation of the paper: TuckER: Tensor Factorization f

296 Dec 06, 2022

PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

14 Nov 13, 2022

Object recognition using Azure Custom Vision AI and Azure Functions

Step-by-step guide to creating an object recognition model using Custom Vision, exporting the model, and running it in an Azure Function

11 Jul 08, 2022

TensorFlow implementation of Pixel Transposed Convolutional Networks (PixelTCN and PixelTCL)

Pixel Transposed Convolutional Networks Created by Hongyang Gao, Hao Yuan, Zhengyang Wang and Shuiwang Ji at Texas A&M University. Introduction Pixel

95 Jul 24, 2022

An implementation of AdaOPS (Adaptive Online Packing-guided Search), an online POMDP solver for problems defined with the POMDPs.jl generative interface.

AdaOPS An implementation of the AdaOPS (Adaptive Online Packing-guided Search), which is an online POMDP Solver used to solve problems defined with th

9 Oct 05, 2022

Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

PyTorch implementation of a saliency map-aided GAN for auto-demosaicing and denoising

This project uses template matching for object detection, locating a template image within a base image.

Housing Price Prediction

Supervised Classification from Text (P)

[ICCV'21] Neural Radiance Flow for 4D View Synthesis and Video Processing

Detection of PCBA defects

Text-Based Ideal Points

Creating a Linear Program Solver by Implementing the Simplex Method in Python with NumPy