Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation

Last update: Dec 10, 2022

Overview

Our paper has been accepted to ICCV 2021.

Picture: Overview of the proposed Plug-and-Play (PnP) adaptation framework for generalizing gaze estimation to a new domain.

Picture: The proposed architecture.

Results

Input	Method	D_E→D_M	D_E→D_D	D_G→D_M	D_G→D_D
Face	Baseline	8.767	8.578	7.662	8.977
Face	Baseline + PnP-GA	5.529 ↓36.9%	5.867 ↓31.6%	6.176 ↓19.4%	7.922 ↓11.8%
Face	ResNet50	8.017	8.310	8.328	7.549
Face	ResNet50 + PnP-GA	6.000 ↓25.2%	6.172 ↓25.7%	5.739 ↓31.1%	7.042 ↓6.7%
Face	SWCNN	10.939	24.941	10.021	13.473
Face	SWCNN + PnP-GA	8.139 ↓25.6%	15.794 ↓36.7%	8.740 ↓12.8%	11.376 ↓15.6%
Face + Eye	CA-Net	--	--	21.276	30.890
Face + Eye	CA-Net + PnP-GA	--	--	17.597 ↓17.3%	16.999 ↓44.9%
Face + Eye	Dilated-Net	--	--	16.683	18.996
Face + Eye	Dilated-Net + PnP-GA	--	--	15.461 ↓7.3%	16.835 ↓11.4%
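
The percentages next to each adapted result appear to be relative reductions with respect to the un-adapted model in the same column. A minimal sketch of that arithmetic, using the two Baseline rows from the table (the helper name relative_reduction and the ASCII task keys are ours for illustration and are not part of the repository):

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


# Values copied from the "Baseline" and "Baseline + PnP-GA" rows of the table.
baseline = {"D_E->D_M": 8.767, "D_E->D_D": 8.578, "D_G->D_M": 7.662, "D_G->D_D": 8.977}
adapted = {"D_E->D_M": 5.529, "D_E->D_D": 5.867, "D_G->D_M": 6.176, "D_G->D_D": 7.922}

# Round to one decimal place, matching the precision used in the table.
reductions = {
    task: round(relative_reduction(baseline[task], adapted[task]), 1)
    for task in baseline
}

for task, pct in reductions.items():
    print(f"{task}: {pct}% lower than the baseline")
```

Run on these rows, the computation reproduces the 36.9%, 31.6%, 19.4%, and 11.8% figures quoted in the abstract.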

This repository contains the official PyTorch implementation of the following paper:

Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation
Yunfei Liu, Ruicong Liu, Haofei Wang, Feng Lu

Abstract: Deep neural networks have significantly improved appearance-based gaze estimation accuracy. However, it still suffers from unsatisfactory performance when generalizing the trained model to new domains, e.g., unseen environments or persons. In this paper, we propose a plug-and-play gaze adaptation framework (PnP-GA), which is an ensemble of networks that learn collaboratively with the guidance of outliers. Since our proposed framework does not require ground-truth labels in the target domain, the existing gaze estimation networks can be directly plugged into PnP-GA and generalize the algorithms to new domains. We test PnP-GA on four gaze domain adaptation tasks, ETH-to-MPII, ETH-to-EyeDiap, Gaze360-to-MPII, and Gaze360-to-EyeDiap. The experimental results demonstrate that the PnP-GA framework achieves considerable performance improvements of 36.9%, 31.6%, 19.4%, and 11.8% over the baseline system. The proposed framework also outperforms the state-of-the-art domain adaptation approaches on gaze domain adaptation tasks.

Resources

Material related to our paper is available via the following links:

Paper: https://arxiv.org/abs/2107.13780
Project: https://liuyunfei.net/publication/iccv2021_pnp-ga/
Code: https://github.com/DreamtaleCore/PnP-GA

System requirements

Only Linux has been tested; Windows support is still being tested.
A 64-bit Python 3.6 installation.

Playing with pre-trained networks and training

Config

Modify config.yaml first, in particular the xxx/image, xxx/label, and xxx_pretrains parameters.

xxx/image is the path to the label file.

xxx/root is the root path of the image files.

xxx_pretrains is the path to the pretrained models.

An example label file is provided in the data folder. Each line of a label file is structured as:

p00/face/1.jpg 0.2558059438789034,-0.05467275933864655 -0.05843388117618364,0.46745964684693614 ... ...

Here, our code reads the image from os.path.join(xxx/root, "p00/face/1.jpg") and reads the ground-truth gaze direction labels from the remaining fields of the line.
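
As an illustration of the line format above, the following sketch splits one label line into an image path and its numeric fields. It assumes fields are separated by whitespace and that every field after the path is a comma-separated list of floats; the function name parse_label_line and the root path are hypothetical, and this is not the repository's actual loader. The fields elided with "..." in the example line are simply left out of the sample string, and which field holds the gaze direction is not assumed here.

```python
import os
from typing import List, Tuple


def parse_label_line(line: str, root: str) -> Tuple[str, List[List[float]]]:
    """Split one label line into (full image path, list of numeric fields)."""
    fields = line.strip().split()
    # The first field is the image path relative to the image root.
    image_path = os.path.join(root, fields[0])
    # Assumption: each remaining field is a comma-separated list of floats.
    values = [[float(x) for x in field.split(",")] for field in fields[1:]]
    return image_path, values


# Sample taken from the example line above, without the elided fields.
sample = (
    "p00/face/1.jpg "
    "0.2558059438789034,-0.05467275933864655 "
    "-0.05843388117618364,0.46745964684693614"
)
path, values = parse_label_line(sample, root="/path/to/images")  # hypothetical root
print(path)
print(values)
```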

Train

We provide three optional arguments: --oma2, --js, and --sg. They represent three different network components, which are described in our paper.

--source and --target represent the datasets used as the source domain and the target domain. You can choose among eth, gaze360, mpii, edp.

--i represents the index of the person used as the training set. Set it to -1 to use all persons as the training set.

--pics represents the number of target domain samples for adaptation.

We also provide other arguments for adjusting the hyperparameters of the PnP-GA architecture, which are described in our paper.

For example, you can run the code like:

python3 adapt.py --i 0 --pics 10 --savepath path/to/save --source eth --target mpii --gpu 0 --js --oma2 --sg
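
To make the meaning of these flags concrete, here is a rough argparse sketch that accepts the same command line as the example above. It is only an illustration: the flag names and dataset choices come from this README, while the types, defaults, and help strings are assumptions, and the actual parser in adapt.py may differ.

```python
import argparse

# Dataset names listed in this README for --source and --target.
DATASETS = ["eth", "gaze360", "mpii", "edp"]


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser; types and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Illustrative PnP-GA style CLI")
    parser.add_argument("--i", type=int, default=-1,
                        help="person index used for training; -1 uses all persons")
    parser.add_argument("--pics", type=int,
                        help="number of target-domain samples for adaptation")
    parser.add_argument("--savepath", type=str, help="where results are saved")
    parser.add_argument("--source", choices=DATASETS, help="source domain")
    parser.add_argument("--target", choices=DATASETS, help="target domain")
    parser.add_argument("--gpu", type=str, default="0", help="GPU id (assumed)")
    # The three optional network components are modeled as on/off switches.
    for flag in ("--oma2", "--js", "--sg"):
        parser.add_argument(flag, action="store_true")
    return parser


example = ("--i 0 --pics 10 --savepath path/to/save "
           "--source eth --target mpii --gpu 0 --js --oma2 --sg")
args = build_parser().parse_args(example.split())
print(args.source, args.target, args.pics, args.oma2, args.js, args.sg)
```

Modeling --oma2, --js, and --sg as store_true switches matches how they appear in the example command, where they take no value.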

Test

--i, --savepath, and --target have the same meaning as in training.

--p represents the index of the person that was used as the training set during adaptation.

For example, you can run the code like:

python3 test.py --i -1 --p 0 --savepath path/to/save --target mpii

Citation

If you find this work or code helpful in your research, please cite:

@inproceedings{liu2021PnP_GA,
  title={Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation},
  author={Liu, Yunfei and Liu, Ruicong and Wang, Haofei and Lu, Feng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Contact

If you have any questions, feel free to email me at: lyunfei(at)buaa.edu.cn
