NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Last update: Nov 15, 2022

MCSE: Multimodal Contrastive Learning of Sentence Embeddings

This repository contains code and pre-trained models for our NAACL-2022 paper MCSE: Multimodal Contrastive Learning of Sentence Embeddings. If you find this repository useful, please consider citing our paper.

Contact: Miaoran Zhang ([email protected])

Pre-trained Models & Results

Model	Avg. STS
flickr-mcse-bert-base-uncased [Google Drive]	77.70
flickr-mcse-roberta-base [Google Drive]	78.44
coco-mcse-bert-base-uncased [Google Drive]	77.08
coco-mcse-roberta-base [Google Drive]	78.17

Note: flickr indicates that models are trained on wiki+flickr, and coco indicates that models are trained on wiki+coco.
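
To make the "Avg. STS" column easier to interpret, here is a minimal sketch of how a single STS score is typically computed. It assumes the usual SentEval/SimCSE convention (Spearman's correlation, times 100, between cosine similarities of sentence-pair embeddings and gold ratings); the function names and toy vectors are illustrative and are not taken from this repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sts_score(emb_a, emb_b, gold):
    """Spearman correlation (x100) between cosine similarities and gold ratings."""
    sims = [cosine(u, v) for u, v in zip(emb_a, emb_b)]
    return 100.0 * spearman(sims, gold)


if __name__ == "__main__":
    # Toy 2-d "embeddings" for three sentence pairs (illustrative only).
    emb_a = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    emb_b = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    gold = [5.0, 3.0, 0.0]
    print(sts_score(emb_a, emb_b, gold))
```

Under that assumption, the averaged number in the table would be the mean of such scores over the STS tasks used for evaluation.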

Quickstart

Setup

Python 3.9.5
PyTorch 1.7.1
Install the other packages:

pip install -r requirements.txt

Data Preparation

Please organize the data directory as follows:

REPO ROOT
|
|--data    
|  |--wiki1m_for_simcse.txt  
|  |--flickr_random_captions.txt    
|  |--flickr_resnet.hdf5    
|  |--coco_random_captions.txt    
|  |--coco_resnet.hdf5
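
Before training, it can help to confirm that the layout above is in place. The following is a small sketch that only checks for the five file names listed in the tree; the helper name `missing_data_files` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names copied from the directory tree above.
EXPECTED_FILES = [
    "wiki1m_for_simcse.txt",
    "flickr_random_captions.txt",
    "flickr_resnet.hdf5",
    "coco_random_captions.txt",
    "coco_resnet.hdf5",
]


def missing_data_files(repo_root="."):
    """Return the expected file names that are absent from <repo_root>/data."""
    data_dir = Path(repo_root) / "data"
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from data/: " + ", ".join(missing))
    else:
        print("All expected data files are present.")
```

Run it from the repository root; it only tests for file existence and does not validate file contents.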

Wiki1M

wget https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt

Flickr30k & MS-COCO
You can either download the preprocessed data we used (annotation sources: flickr30k-entities and coco).

Or preprocess the data yourself (taking Flickr30k as an example):

Download flickr30k-entities.
Request access to the flickr-images from here. Note that the use of the images must abide by the Flickr Terms of Use.

Run script:

unzip ${path_to_flickr-entities}/annotations.zip

python preprocess/prepare_flickr.py \
    --flickr_entities_dir ${path_to_flickr-entities} \
    --flickr_images_dir ${path_to_flickr-images} \
    --output_dir data/ \
    --batch_size 32

Train & Evaluation

Prepare the SentEval datasets for evaluation:

cd SentEval/data/downstream/
bash download_dataset.sh

Run scripts:
```
# For example:  (more examples are given in scripts/.)
sh scripts/run_wiki_flickr.sh
```
Note: In the paper, we run experiments with 5 seeds (0, 1, 2, 3, 4). You can find the detailed parameter settings in the Appendix of the paper.
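
If you repeat a run over the five seeds, one common way to summarise the results is the mean and sample standard deviation of the per-seed scores. The sketch below assumes you have already collected one Avg. STS value per seed; the `summarize` helper is hypothetical, and the numbers in the usage example are made-up placeholders, not results from the paper.

```python
from statistics import mean, stdev

# Seeds used in the paper, as stated in the note above.
SEEDS = [0, 1, 2, 3, 4]


def summarize(scores_by_seed):
    """Return (mean, sample standard deviation) over the five seeds.

    `scores_by_seed` maps each seed to the score obtained with that seed.
    """
    scores = [scores_by_seed[seed] for seed in SEEDS]
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT real results).
    placeholder_scores = {0: 70.0, 1: 71.0, 2: 72.0, 3: 73.0, 4: 74.0}
    avg, std = summarize(placeholder_scores)
    print(f"{avg:.2f} +/- {std:.2f}")
```

A missing seed raises a `KeyError`, which makes an incomplete set of runs easy to spot.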

Acknowledgements

The extremely clear and well-organized codebase: SimCSE
SentEval toolkit

Owner

Saarland University Spoken Language Systems Group