Unbiased Learning To Rank Algorithms (ULTRA)

Overview
logo

Unbiased Learning to Rank Algorithms (ULTRA)

Python 3.6 Documentation Status Build Status codecov License follow on Twitter

🔥 News: A TensorFlow version of this package can be found in ULTRA.

This is an Unbiased Learning To Rank Algorithms (ULTRA) toolbox, which provides a codebase for experiments and research on learning to rank with human annotated or noisy labels. With the unified data processing pipeline, ULTRA supports multiple unbiased learning-to-rank algorithms, online learning-to-rank algorithms, neural learning-to-rank models, as well as different methods to use and simulate noisy labels (e.g., clicks) to train and test different algorithms/ranking models. A user-friendly documentation can be found here.

Get Started

Create virtual environment (optional):

pip install --user virtualenv
~/.local/bin/virtualenv -p python3 ./venv
source venv/bin/activate

Install ULTRA from the source:

git clone https://github.com/ULTR-Community/ULTRA_pytorch.git
cd ULTRA
make init

Run toy example:

bash example/toy/offline_exp_pipeline.sh

Structure

structure

Input Layers

  1. ClickSimulationFeed: this is the input layer that generate synthetic clicks on fixed ranked lists to feed the learning algorithm.

  2. DeterministicOnlineSimulationFeed: this is the input layer that first create ranked lists by sorting documents according to the current ranking model, and then generate synthetic clicks on the lists to feed the learning algorithm. It can do result interleaving if required by the learning algorithm.

  3. StochasticOnlineSimulationFeed: this is the input layer that first create ranked lists by sampling documents based on their scores in the current ranking model and the Plackett-Luce distribution, and then generate synthetic clicks on the lists to feed the learning algorithm. It can do result interleaving if required by the learning algorithm.

  4. DirectLabelFeed: this is the input layer that directly feed the true relevance labels of each documents to the learning algorithm.

Learning Algorithms

  1. NA: this model is an implementation of the naive algorithm that directly train models with input labels (e.g., clicks).

  2. DLA: this is an implementation of the Dual Learning Algorithm in Unbiased Learning to Rank with Unbiased Propensity Estimation.

  3. IPW: this model is an implementation of the Inverse Propensity Weighting algorithms in Learning to Rank with Selection Bias in Personal Search and Unbiased Learning-to-Rank with Biased Feedback

  4. REM: this model is an implementation of the regression-based EM algorithm in Position bias estimation for unbiased learning to rank in personal search

  5. PD: this model is an implementation of the pairwise debiasing algorithm in Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.

  6. DBGD: this model is an implementation of the Dual Bandit Gradient Descent algorithm in Interactively optimizing information retrieval systems as a dueling bandits problem

  7. MGD: this model is an implementation of the Multileave Gradient Descent in Multileave Gradient Descent for Fast Online Learning to Rank

  8. NSGD: this model is an implementation of the Null Space Gradient Descent algorithm in Efficient Exploration of Gradient Space for Online Learning to Rank

  9. PDGD: this model is an implementation of the Pairwise Differentiable Gradient Descent algorithm in Differentiable unbiased online learning to rank

Ranking Models

  1. Linear: this is a linear ranking algorithm that compute ranking scores with a linear function.

  2. DNN: this is neural ranking algorithm that compute ranking scores with a multi-layer perceptron network (with non-linear activation functions).

  3. DLCM: this is an implementation of the Deep Listwise Context Model in Learning a Deep Listwise Context Model for Ranking Refinement (TODO).

  4. GSF: this is an implementation of the Groupwise Scoring Function in Learning Groupwise Multivariate Scoring Functions Using Deep Neural Networks (TODO).

  5. SetRank: this is an implementation of the SetRank model in SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval (TODO).

Supported Evaluation Metrics

  1. MRR: the Mean Reciprocal Rank.

  2. ERR: the Expected Reciprocal Rank from Expected reciprocal rank for graded relevance.

  3. ARP: the Average Relevance Position.

  4. NDCG: the Normalized Discounted Cumulative Gain.

  5. DCG: the Discounted Cumulative Gain.

  6. Precision: the Precision.

  7. MAP: the Mean Average Precision.

  8. Ordered_Pair_Accuracy: the percentage of correctedly ordered pair.

Click Simulation Example

Create click models for click simulations

python ultra/utils/click_models.py pbm 0.1 1 4 1.0 example/ClickModel

* The output is a json file containing the click mode that could be used for click simulation. More details could be found in the code.

(Optional) Estimate examination propensity with result randomization

python ultra/utils/propensity_estimator.py example/ClickModel/pbm_0.1_1.0_4_1.0.json 
   
     example/PropensityEstimator/

   

* The output is a json file containing the estimated examination propensity (used for IPW). DATA_DIR is the directory for the prepared data created by ./libsvm_tools/prepare_exp_data_with_svmrank.py. More details could be found in the code.

Citation

If you use ULTRA in your research, please use the following BibTex entry.

@misc{tran2021ultra,
      title={ULTRA: An Unbiased Learning To Rank Algorithm Toolbox}, 
      author={Anh Tran and Tao Yang and Qingyao Ai},
      year={2021},
      eprint={2108.05073},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}

@article{10.1145/3439861,
author = {Ai, Qingyao and Yang, Tao and Wang, Huazheng and Mao, Jiaxin},
title = {Unbiased Learning to Rank: Online or Offline?},
year = {2021},
issue_date = {February 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {39},
number = {2},
issn = {1046-8188},
url = {https://doi.org/10.1145/3439861},
doi = {10.1145/3439861},
journal = {ACM Trans. Inf. Syst.},
month = feb,
articleno = {21},
numpages = {29},
keywords = {unbiased learning, online learning, Learning to rank}
}

Development Team

​ ​ ​ ​

QingyaoAi
Qingyao Ai

Core Dev
ASST PROF, Univ. of Utah

anhtran1010
Anh Tran

Core Dev
Ph.D., Univ. of Utah

Taosheng-ty
Tao Yang

Core Dev
Ph.D., Univ. of Utah

huazhengwang
Huazheng Wang

Core Dev
Ph.D., Univ. of Virginia

defaultstr
Jiaxin Mao

Core Dev
ASST PROF, Renmin Univ.

Contribution

Please read the Contributing Guide before creating a pull request.

Project Organizers

  • Qingyao Ai
    • School of Computing, University of Utah
    • Homepage

License

Apache-2.0

Copyright (c) 2020-present, Qingyao Ai (QingyaoAi) "# Pytorch_ULTRA"

Owner
Facilitating the design, comparison and sharing of unbiased and online learning to rank algorithms.
Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter

DongGeun-Yoon 19 Jun 07, 2022
A library for low-memory inferencing in PyTorch.

Pylomin Pylomin (PYtorch LOw-Memory INference) is a library for low-memory inferencing in PyTorch. Installation ... Usage For example, the following c

3 Oct 26, 2022
Residual Dense Net De-Interlace Filter (RDNDIF)

Residual Dense Net De-Interlace Filter (RDNDIF) Work in progress deep de-interlacer filter. It is based on the architecture proposed by Bernasconi et

Louis 7 Feb 15, 2022
The official repository for "Intermediate Layers Matter in Momentum Contrastive Self Supervised Learning" paper.

Intermdiate layer matters - SSL The official repository for "Intermediate Layers Matter in Momentum Contrastive Self Supervised Learning" paper. Downl

Aakash Kaku 35 Sep 19, 2022
Multiview 3D object detection on MultiviewC dataset through moft3d.

Voxelized 3D Feature Aggregation for Multiview Detection [arXiv] Multiview 3D object detection on MultiviewC dataset through VFA. Introduction We prop

Jiahao Ma 20 Dec 21, 2022
Proof of concept GnuCash Webinterface

Proof of Concept GnuCash Webinterface This may one day be a something truly great. Milestones [ ] Browse accounts and view transactions [ ] Record sim

Josh 14 Dec 28, 2022
Company clustering with K-means/GMM and visualization with PCA, t-SNE, using SSAN relation extraction

RE results graph visualization and company clustering Installation pip install -r requirements.txt python -m nltk.downloader stopwords python3.7 main.

Jieun Han 1 Oct 06, 2022
This is the source code for: Context-aware Entity Typing in Knowledge Graphs.

This is the source code for: Context-aware Entity Typing in Knowledge Graphs.

9 Sep 01, 2022
Unofficial & improved implementation of NeRF--: Neural Radiance Fields Without Known Camera Parameters

[Unofficial code-base] NeRF--: Neural Radiance Fields Without Known Camera Parameters [ Project | Paper | Official code base ] ⬅️ Thanks the original

Jianfei Guo 239 Dec 22, 2022
Deep Text Search is an AI-powered multilingual text search and recommendation engine with state-of-the-art transformer-based multilingual text embedding (50+ languages).

Deep Text Search - AI Based Text Search & Recommendation System Deep Text Search is an AI-powered multilingual text search and recommendation engine w

19 Sep 29, 2022
Raindrop strategy for Irregular time series

Graph-Guided Network For Irregularly Sampled Multivariate Time Series Overview This repository contains processed datasets and implementation code for

Zitnik Lab @ Harvard 74 Jan 03, 2023
Python implementation of O-OFDMNet, a deep learning-based optical OFDM system,

O-OFDMNet This includes Python implementation of O-OFDMNet, a deep learning-based optical OFDM system, which uses neural networks for signal processin

Thien Luong 4 Sep 09, 2022
Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training Code for our paper "Predicting lncRNA–protein interactio

zhanglabNKU 1 Nov 29, 2022
PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System

PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System

Libo Qin 25 Sep 06, 2022
Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.

NeuralTextures This is repository with inference code for paper "StylePeople: A Generative Model of Fullbody Human Avatars" (CVPR21). This code is for

Visual Understanding Lab @ Samsung AI Center Moscow 18 Oct 06, 2022
Object detection on multiple datasets with an automatically learned unified label space.

Simple multi-dataset detection An object detector trained on multiple large-scale datasets with a unified label space; Winning solution of E

Xingyi Zhou 407 Dec 30, 2022
Toontown: Galaxy, a new Toontown game based on Disney's Toontown Online

Toontown: Galaxy The official archive repo for Toontown: Galaxy, a new Toontown

1 Feb 15, 2022
Implementation of Deep Deterministic Policy Gradiet Algorithm in Tensorflow

ddpg-aigym Deep Deterministic Policy Gradient Implementation of Deep Deterministic Policy Gradiet Algorithm (Lillicrap et al.arXiv:1509.02971.) in Ten

Steven Spielberg P 247 Dec 07, 2022
Image De-raining Using a Conditional Generative Adversarial Network

Image De-raining Using a Conditional Generative Adversarial Network [Paper Link] [Project Page] He Zhang, Vishwanath Sindagi, Vishal M. Patel In this

He Zhang 216 Dec 18, 2022
Lex Rosetta: Transfer of Predictive Models Across Languages, Jurisdictions, and Legal Domains

Lex Rosetta: Transfer of Predictive Models Across Languages, Jurisdictions, and Legal Domains This is an accompanying repository to the ICAIL 2021 pap

4 Dec 16, 2021