Code for our SIGIR 2022 accepted paper : P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

Related tags

DocumentationP3Ranker
Overview

P3 Ranker

Implementation for our SIGIR2022 accepted paper:

P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

Project Structures

├── commands
│   ├── bert.sh
│   ├── p3ranker.sh
│   ├── prop_ft.sh
│   ├── roberta.sh
│   └── t5v11.sh
├── Prefinetune
│   ├── mnli_dataloader.py
│   ├── mnli_dataset.py
│   ├── mnli_model.py
│   ├── README.md
│   ├── train_mnli.sh
│   ├── train_nq.sh
│   ├── train.py
│   └── utils.py
├── src
│   ├── data
│   │    ├── datasets
│   │    │   ├── __init__.py
│   │    │   ├── bert_dataset.py
│   │    │   ├── bertmaxp_dataset.py
│   │    │   ├── dataset.py
│   │    │   ├── edrm_dataset.py
│   │    │   ├── roberta_dataset.py
│   │    │   └── t5_dataset.py
│   │    └── tokenizers
│   │        ├── __init__.py
│   │        ├── tokenizer.py
│   │        └── word_tokenizer.py
│   ├── extractors
│   │    ├── __init__.py
│   │    └── classic_extractor.py
│   ├── metrics
│   │    ├── __init__.py
│   │    └── metric.py
│   ├── models
│   │    ├── __init__.py
│   │    ├── bert_maxp.py
│   │    ├── bert_prompt_.py
│   │    ├── bert.py
│   │    ├── conv_knrm.py
│   │    ├── edrm.py
│   │    ├── knrm.py
│   │    ├── t5.py
│   │    └── tk.py
│   ├── modules
│   │    ├── attentons
│   │    │   ├── __init__.py
│   │    │   ├── multi_head_attention.py
│   │    │   └── scaled_dot_product_attention.py
│   │    ├── embedders
│   │    │   ├── __init__.py
│   │    │   └── embedder.py
│   │    ├── encoders
│   │    │   ├── __init__.py
│   │    │   ├── cnn_encoder.py
│   │    │   ├── feed_forward_encoder.py
│   │    │   ├── positional_encoder.py
│   │    │   └── transformer_encoder.py
│   │    └── matchers
│   │        ├── __init__.py
│   │        └── kernel_matcher.py
│   ├── __init__.py
│   └── utils.py
├── README.md
├── requirements.txt
├── train.py
└── utils.py 

Prerequisites

Install dependencies:

git clone https://github.com/NEUIR/P3Ranker.git
cd P3-Rankers
pip install -r requirements.txt

Data Preparation

We will release our few-shot dataset soon.

Prompt Generation

Details about the Discrete Prompt Generation can be find in https://github.com/princeton-nlp/LM-BFF and our paper

Prefinetune

cd Reproduce

And you will find how to do prefinetune.

Reproduce our results

Directly run the scripts we stored in './commands' can reproduce our results. One example is shown below:

bash commands/bert.sh 5

The above command is for reproducing results in our 5-q few-shot scenarios mentioned in our paper.

Contact

Please send email to [email protected].

Quick tutorial on orchest.io that shows how to build multiple deep learning models on your data with a single line of code using python

Deep AutoViML Pipeline for orchest.io Quickstart Build Deep Learning models with a single line of code: deep_autoviml Deep AutoViML helps you build te

Ram Seshadri 6 Oct 02, 2022
Quickly download, clean up, and install public datasets into a database management system

Finding data is one thing. Getting it ready for analysis is another. Acquiring, cleaning, standardizing and importing publicly available data is time

Weecology 274 Jan 04, 2023
Projeto em Python colaborativo para o Bootcamp de Dados do Itaú em parceria com a Lets Code

🧾 lets-code-todo-list por Henrique V. Domingues e Josué Montalvão Projeto em Python colaborativo para o Bootcamp de Dados do Itaú em parceria com a L

Henrique V. Domingues 1 Jan 11, 2022
Convenient tools for using Swagger to define and validate your interfaces in a Pyramid webapp.

Convenient tools for using Swagger to define and validate your interfaces in a Pyramid webapp.

Scott Triglia 64 Sep 18, 2022
Software engineering course project. Secondhand trading system.

PigeonSale Software engineering course project. Secondhand trading system. Documentation API doumenatation: list of APIs Backend documentation: notes

Harry Lee 1 Sep 01, 2022
A Python module for creating Excel XLSX files.

XlsxWriter XlsxWriter is a Python module for writing files in the Excel 2007+ XLSX file format. XlsxWriter can be used to write text, numbers, formula

John McNamara 3.1k Dec 29, 2022
Count the number of lines of code in a directory, minus the irrelevant stuff

countloc Simple library to count the lines of code in a directory (excluding stuff like node_modules) Simply just run: countloc node_modules args to

Anish 4 Feb 14, 2022
A simple document management REST based API for collaboratively interacting with documents

documan_api A simple document management REST based API for collaboratively interacting with documents.

Shahid Yousuf 1 Jan 22, 2022
Speed up Sphinx builds by selectively removing toctrees from some pages

Remove toctrees from Sphinx pages Improve your Sphinx build time by selectively removing TocTree objects from pages. This is useful if your documentat

Executable Books 8 Jan 04, 2023
Use Brainf*ck with python!

Brainfudge Run Brainf*ck code with python! Classes Interpreter(array_len): encapsulate all functions into class __init__(self, array_len: int=30000) -

1 Dec 14, 2021
Contains the assignments from the course Building a Modern Computer from First Principles: From Nand to Tetris.

Contains the assignments from the course Building a Modern Computer from First Principles: From Nand to Tetris.

Matheus Rodrigues 1 Jan 20, 2022
A collection of online resources to help you on your Tech journey.

Everything Tech Resources & Projects About The Project Coming from an engineering background and looking to up skill yourself on a new field can be di

Mohamed A 396 Dec 31, 2022
Data-Scrapping SEO - the project uses various data scrapping and Google autocompletes API tools to provide relevant points of different keywords so that search engines can be optimized

Data-Scrapping SEO - the project uses various data scrapping and Google autocompletes API tools to provide relevant points of different keywords so that search engines can be optimized; as this infor

Vibhav Kumar Dixit 2 Jul 18, 2022
Mkdocs obsidian publish - Publish your obsidian vault through a python script

Mkdocs Obsidian Mkdocs Obsidian is an association between a python script and a

Mara 49 Jan 09, 2023
An MkDocs plugin that simplifies configuring page titles and their order

MkDocs Awesome Pages Plugin An MkDocs plugin that simplifies configuring page titles and their order The awesome-pages plugin allows you to customize

Lukas Geiter 282 Dec 28, 2022
Source Code for 'Practical Python Projects' (video) by Sunil Gupta

Apress Source Code This repository accompanies %Practical Python Projects by Sunil Gupta (Apress, 2021). Download the files as a zip using the green b

Apress 2 Jun 01, 2022
A document format conversion service based on Pandoc.

reformed Document format conversion service based on Pandoc. Usage The API specification for the Reformed server is as follows: GET /api/v1/formats: L

David Lougheed 3 Jul 18, 2022
MkDocs plugin for setting revision date from git per markdown file

mkdocs-git-revision-date-plugin MkDocs plugin that displays the last revision date of the current page of the documentation based on Git. The revision

Terry Zhao 48 Jan 06, 2023
Python-slp - Side Ledger Protocol With Python

Side Ledger Protocol Run python-slp node First install Mongo DB and run the mong

Solar 3 Mar 02, 2022
Convert excel xlsx file's table to csv file, A GUI application on top of python/pyqt and other opensource softwares.

Convert excel xlsx file's table to csv file, A GUI application on top of python/pyqt and other opensource softwares.

David A 0 Jan 20, 2022