Unicorn on Rainbow

Neural models of common sense.

This repository is for the paper: Unicorn on Rainbow: A Universal Commonsense Reasoning Model on a New Multitask Benchmark. Unicorn on Rainbow introduces a new evaluation, the cost equivalent curve, which compares models in terms of their cost-benefit trade-offs. Using cost equivalent curves, we conduct a large-scale empirical study of intermediate-task transfer for common sense on Rainbow, a new benchmark collection of commonsense reasoning datasets. With findings from this study, we create a new state-of-the-art model for commonsense reasoning: Unicorn.

Jump to a section of the readme to accomplish different goals:

  • Rainbow: Read about and download data for Rainbow, our new commonsense reasoning benchmark.
  • Unicorn: Get up and running with Unicorn, our state-of-the-art commonsense reasoning model.
  • Cost Equivalent Curves: Learn how to generate cost equivalent curves for your own predictions.
  • Experimental Results: Download and analyze the results from our hundreds of experiments.
  • Setup: Get set up to run the code in this repository.
  • Quickstart: Run the code in this repo.
  • Citation: Cite the Unicorn on Rainbow paper.
  • Contact: Reach out with questions or comments.

Note: This repository is intended for research. There is no intention for ongoing maintenance.

Rainbow

Rainbow brings together six pre-existing commonsense reasoning benchmarks: aNLI, Cosmos QA, HellaSWAG, Physical IQa, Social IQa, and WinoGrande. These commonsense reasoning benchmarks span both social and physical common sense.

Note: Rainbow pins these datasets to specific versions. To make sure you're using the correct data, please download those versions below.

Getting the Data

Rainbow preprocesses all of the datasets into a text-to-text format for ease of modeling.

Alternatively, you can download the individual tasks and preprocess them yourself.

All checksums are sha256. To compute the checksum with openssl, run:

$ openssl sha256 $FILE_PATH
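
If openssl is not available, the same check can be done from Python. The following is a minimal sketch using only the standard library; the helper name sha256_of_file is an illustrative choice and is not part of this repository.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex-encoded SHA-256 digest of the file at ``path``."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string against the published checksum for the file you downloaded.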

Submitting to the Leaderboard

If you develop a model for Rainbow, please feel free to submit to the leaderboard!

Unicorn

Unicorn (a UNIversal COmmonsense Reasoning Model) solves commonsense reasoning tasks in the text-to-text format. In principle, Unicorn may be trained on any NLP task: simply feed it text input and ask it to predict text output. Unicorn derives from T5, supercharging it for commonsense reasoning tasks and achieving state-of-the-art results across a number of popular benchmarks, including Rainbow and CommonsenseQA.
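
To make the text-to-text idea concrete, here is a small sketch that renders a multiple-choice example as an input string and a target string. The "question:" / "choice N:" template and the function name to_text_to_text are hypothetical, chosen only for illustration; they are not the format produced by Rainbow's preprocessing.

```python
def to_text_to_text(question, choices, label):
    """Render a multiple-choice example as an (input text, target text) pair.

    The template below is a hypothetical format for this sketch only,
    not the one used by Rainbow's preprocessed datasets.
    """
    parts = [f"question: {question}"]
    parts += [f"choice {i}: {choice}" for i, choice in enumerate(choices)]
    # The model reads the input text and is trained to emit the target text.
    return " ".join(parts), str(label)


inputs, targets = to_text_to_text(
    "Where would you keep ice cream?", ["freezer", "oven"], 0
)
```

Because both sides are plain strings, any task that can be written this way can in principle be fed to a text-to-text model.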

To try Unicorn on your own data, first download the weights, then fine-tune and evaluate the model.

Downloading the Weights

To run Unicorn, you'll first need to download its weight files into a directory or path on Google Cloud. Using gsutil:

$ gsutil cp -r \
  gs://ai2-mosaic-public/projects/rainbow/v1.0/unicorns/lr-2e-3_batch-size-32 \
  $DST

Where $DST is the destination directory.

Reproducing our Results

In Unicorn on Rainbow, we trained different Unicorns that were first multitasked on Rainbow using different hyper-parameters. The checkpoint we've made available had the best performance most often. If you need the other checkpoints, please email the authors.

Cost Equivalent Curves

Cost equivalent curves compare the cost-benefit trade-offs that different techniques offer. In particular, cost equivalent curves plot the baseline's and the new technique's equivalent costs, that is, the costs at which they achieve the same performance. For example, if the cost is measured as the number of examples and performance is measured by accuracy, then the cost equivalent curve shows how many examples the baseline needs to match the new technique's accuracy.

The plot_cost_equivalent_curves function in bin/create-multi-experiment-figures.py offers example code for how to create cost equivalent curves in Python.
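
As a rough illustration of the idea, and independent of that function, the sketch below computes equivalent costs from two learning curves by linearly interpolating the inverse of the baseline's curve with NumPy. It assumes the baseline's performance strictly increases with cost; the function name and the example numbers are made up for this sketch, and the paper's own procedure may differ.

```python
import numpy as np


def cost_equivalent_curve(baseline_costs, baseline_perf, new_costs, new_perf):
    """For each cost of the new technique, estimate the baseline cost
    that reaches the same performance.

    Simplifying assumption: baseline performance is strictly increasing
    in cost, so its inverse can be linearly interpolated.
    """
    baseline_costs = np.asarray(baseline_costs, dtype=float)
    baseline_perf = np.asarray(baseline_perf, dtype=float)
    new_perf = np.asarray(new_perf, dtype=float)
    if np.any(np.diff(baseline_perf) <= 0):
        raise ValueError("baseline performance must be strictly increasing")
    # Invert the baseline curve: performance -> cost. Performances the
    # baseline never reaches in the measured range map to NaN.
    equivalent = np.interp(
        new_perf, baseline_perf, baseline_costs, left=np.nan, right=np.nan
    )
    return np.asarray(new_costs, dtype=float), equivalent


# Made-up learning curves: cost = number of examples, performance = accuracy.
new_costs, equivalent_costs = cost_equivalent_curve(
    baseline_costs=[1000, 2000, 4000, 8000],
    baseline_perf=[0.60, 0.70, 0.80, 0.90],
    new_costs=[1000, 2000, 4000],
    new_perf=[0.70, 0.75, 0.85],
)
```

With these made-up numbers, the baseline needs roughly 2,000, 3,000, and 6,000 examples to match what the new technique achieves with 1,000, 2,000, and 4,000. Plotting equivalent_costs against new_costs gives the curve; points below the diagonal would mean the new technique is more costly than the baseline, points above it less costly.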

Stay Tuned! We'll soon be releasing an easy-to-use, standalone package for creating cost equivalent curves. Check back here for it in the future.

Experimental Results

For Unicorn on Rainbow, we ran hundreds of experiments. We've made available the results from all those experiments in order to facilitate future research. For example, you may want those thousands of training curves to study hyper-parameter tuning or how loss evolves over training.

Among other things, you'll find:

  • predictions on dev from every checkpoint saved during training
  • training curves (training step vs. loss)
  • learning curves (dataset size vs. accuracy)
  • hyper-parameter tuning
  • all tables and figures from the paper
  • and more...

Our hope is that researchers can reuse this large collection of experiments to derive new practical and research insights.

Downloading the Results

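Five collections of results are available:

  • rainbow-predictions.tar.gz
  • rainbow-experiments.tar.gz
  • rainbow-results.tar.gz
  • rainbow-latex-tables.tar.gz
  • rainbow-figures.tar.gz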

All checksums are sha256. To compute the checksum with openssl, run:

$ openssl sha256 $FILE_PATH

NOTE: The learning curves experiments varied the number of training examples up to 16,000; however, CommonsenseQA has fewer than 16,000 training examples. Thus, for CommonsenseQA, training set sizes above 9,741 are truncated to 9,741. This subtlety is taken care of by the data processing pipeline when the experiments are processed into the results tables, so it only affects rainbow-predictions.tar.gz and rainbow-experiments.tar.gz.
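
If you work directly with the raw predictions or experiments, you may need to apply the same truncation yourself. A minimal sketch follows; the requested sizes are hypothetical examples, and only the 9,741 limit comes from the note above.

```python
COMMONSENSEQA_MAX_SIZE = 9741  # truncation size stated in the note above


def effective_train_size(requested, available=COMMONSENSEQA_MAX_SIZE):
    """Return the number of training examples actually used."""
    return min(requested, available)


# Hypothetical requested sizes, for illustration only.
requested_sizes = [4000, 8000, 16000]
effective_sizes = [effective_train_size(n) for n in requested_sizes]
```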

Replicating Our Analysis Pipeline

All the scripts to replicate our analysis pipeline reside in bin/. In order to run the scripts, you'll need to get set up for development.

The overall pipeline is as follows:

+----------------------------+
| rainbow-predictions.tar.gz |
+----------------------------+
              |
              | (bin/organize-experiments)
              V
+----------------------------+
| rainbow-experiments.tar.gz |
+----------------------------+
              |
              | (bin/generate-tables.py)
              V
  +------------------------+
  | rainbow-results.tar.gz |
  +------------------------+
         |         |
         |         | (bin/generate-latex-tables.py)
         |         V
         |     +-----------------------------+
         |     | rainbow-latex-tables.tar.gz |
         |     +-----------------------------+
         |
         | (bin/create-single-experiment-figures.py)
         | (bin/create-multi-experiment-figures.py)
         V
+------------------------+
| rainbow-figures.tar.gz |
+------------------------+

To run the pipeline, start by downloading rainbow-predictions.tar.gz (see Downloading the Results above).

Use bin/organize-experiments to produce rainbow-experiments.tar.gz:

$ tar -xf rainbow-predictions.tar.gz
$ bin/organize-experiments rainbow-predictions $DST

Where $DST is the desired destination directory (for example the current directory, .).

Use bin/generate-tables.py to produce rainbow-results.tar.gz:

$ bin/generate-tables.py rainbow-experiments rainbow-results

Use bin/create-single-experiment-figures.py and bin/create-multi-experiment-figures.py to create rainbow-figures.tar.gz:

$ bin/create-single-experiment-figures.py rainbow-results rainbow-figures/single-experiment
$ bin/create-multi-experiment-figures.py rainbow-results rainbow-figures/multi-experiment

And use bin/generate-latex-tables.py to produce rainbow-latex-tables.tar.gz:

$ bin/generate-latex-tables.py rainbow-results rainbow-latex-tables

All scripts except bin/organize-experiments are also self-documenting, so pass --help to any of them for more information.
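
The steps above can also be chained from Python. This is a sketch rather than a script shipped with the repository: the function names are illustrative, it assumes rainbow-predictions.tar.gz has already been extracted, that it is run from the repository root, and that bin/organize-experiments writes a rainbow-experiments directory into the destination, as the commands above suggest.

```python
import subprocess


def pipeline_commands(predictions_dir="rainbow-predictions", dst="."):
    """Return the analysis pipeline's commands in execution order."""
    return [
        ["bin/organize-experiments", predictions_dir, dst],
        ["bin/generate-tables.py", "rainbow-experiments", "rainbow-results"],
        ["bin/create-single-experiment-figures.py",
         "rainbow-results", "rainbow-figures/single-experiment"],
        ["bin/create-multi-experiment-figures.py",
         "rainbow-results", "rainbow-figures/multi-experiment"],
        ["bin/generate-latex-tables.py",
         "rainbow-results", "rainbow-latex-tables"],
    ]


def run_pipeline(dry_run=True, **kwargs):
    """Print each command, or run it and stop at the first failure."""
    for command in pipeline_commands(**kwargs):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
```

Calling run_pipeline() only prints the commands; pass dry_run=False to execute them.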

Setup

This project requires Python 3.6 or above.

First, install the project's dependencies:

$ ./bin/install

Next, make sure you have the following environment variables set:

  1. RAINBOW_DATASETS_DIR: The directory for storing all relevant datasets.
  2. RAINBOW_PREPROCESSED_DATASETS_DIR: The directory for storing the preprocessed dataset split files.
  3. RAINBOW_TFDS_DATASETS_DIR: The directory for storing the TFDS (tensorflow datasets) datasets.

Training requires TPUs. For training, all directories should point to Google Cloud Storage prefixes. Additionally, you'll need the following environment variables:

  1. PROJECT: Your Google Cloud project's ID.
  2. ZONE: Your Google Cloud virtual machine's zone.
  3. TPU_NAME: Your TPU's name.
  4. TPU_TOPOLOGY: Your TPU's topology.
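
Before launching a long job, it can help to check these variables up front. The snippet below is an illustrative sketch, not a script from this repository; the helper names are made up, and the gs:// check reflects the requirement that training directories point to Google Cloud Storage.

```python
import os

DATASET_VARS = [
    "RAINBOW_DATASETS_DIR",
    "RAINBOW_PREPROCESSED_DATASETS_DIR",
    "RAINBOW_TFDS_DATASETS_DIR",
]
TRAINING_VARS = ["PROJECT", "ZONE", "TPU_NAME", "TPU_TOPOLOGY"]


def missing_variables(names, environ=None):
    """Return the variables from ``names`` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


def non_gcs_directories(environ=None):
    """Return the dataset variables that are set but do not start with gs://."""
    environ = os.environ if environ is None else environ
    return [
        name
        for name in DATASET_VARS
        if environ.get(name) and not environ[name].startswith("gs://")
    ]
```

For example, missing_variables(DATASET_VARS + TRAINING_VARS) lists everything still unset before training, and non_gcs_directories() flags dataset directories that are local paths.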

Then, download and prepare all the datasets for text-to-text modeling:

$ ./bin/prepare.py --help
Usage: prepare.py [OPTIONS]

  Prepare all relevant datasets for text-to-text modeling.

  Download to and read the datasets from --src, transform them into CSVs
  suitable for text-to-text models, then write the results to --dst. Google
  storage paths are supported.

Options:
  --src TEXT        The directory to which to download all the relevant
                    datasets. Defaults to the RAINBOW_DATASETS_DIR environment
                    variable.  [required]
  --dst TEXT        The directory to which to write the preprocessed dataset
                    files. Defaults to the RAINBOW_PREPROCESSED_DATASETS_DIR
                    environment variable.  [required]
  --force-download  Force downloads of all the datasets, otherwise only
                    missing datasets will be downloaded.
  --help            Show this message and exit.

Finally, verify your installation:

$ ./bin/verify

Quickstart

Before following this section, make sure you've done the Setup.

Fine-tuning

To fine-tune the model, use bin/fine-tune.py:

$ ./bin/fine-tune.py --help
Usage: fine-tune.py [OPTIONS] MIXTURE RESULTS_DIR

  Fine-tune the model on MIXTURE, writing results to RESULTS_DIR.

Options:
  --pretrained-model TEXT         The path to or name of the pretrained model.
                                  Defaults to 3B.
  --n-steps INTEGER               The number of gradient updates. Defaults to
                                  25,000.
  --learning-rate FLOAT           The learning rate to use for training.
                                  Defaults to 3e-3.
  --batch-size INTEGER            The batch size to use for training. For
                                  efficient training on the TPU, choose a
                                  multiple of either 8 or 128. Defaults to 16.
  --model-parallelism INTEGER     The degree of model parallelism to use.
                                  Defaults to 8.
  --save-checkpoints-steps INTEGER
                                  The number of steps to take before saving a
                                  checkpoint. Defaults to 5000.
  --n-checkpoints-to-keep INTEGER
                                  The number of checkpoints to keep during
                                  fine-tuning. Defaults to 4.
  --tpu-name TEXT                 The name of the TPU. Defaults to the
                                  TPU_NAME environment variable.  [required]
  --tpu-topology TEXT             The topology of the TPU. Defaults to the
                                  TPU_TOPOLOGY environment variable.
                                  [required]
  --help                          Show this message and exit.

Evaluation

To evaluate the model, use bin/evaluate.py:

$ ./bin/evaluate.py --help
Usage: evaluate.py [OPTIONS] MIXTURE RESULTS_DIR

  Evaluate the model located at RESULTS_DIR on MIXTURE.

Options:
  --batch-size INTEGER         The batch size to use for prediction. For
                               efficient prediction on the TPU, choose a
                               multiple of either 8 or 128. Defaults to 64.
  --model-parallelism INTEGER  The degree of model parallelism to use.
                               Defaults to 8.
  --tpu-name TEXT              The name of the TPU. Defaults to the TPU_NAME
                               environment variable.  [required]
  --tpu-topology TEXT          The topology of the TPU. Defaults to the
                               TPU_TOPOLOGY environment variable.  [required]
  --help                       Show this message and exit.
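
When launching several runs, it may be convenient to assemble these command lines programmatically. The sketch below builds argument lists using only flags that appear in the help output above; the function names are illustrative, and the mixture name and results path in the example are hypothetical placeholders.

```python
def fine_tune_command(mixture, results_dir, pretrained_model=None,
                      n_steps=None, learning_rate=None, batch_size=None):
    """Build an argument list for bin/fine-tune.py.

    Options left as None are omitted, so the script's defaults apply.
    """
    options = {
        "--pretrained-model": pretrained_model,
        "--n-steps": n_steps,
        "--learning-rate": learning_rate,
        "--batch-size": batch_size,
    }
    command = ["./bin/fine-tune.py"]
    for flag, value in options.items():
        if value is not None:
            command += [flag, str(value)]
    # Positional arguments go last, matching the usage line.
    return command + [mixture, results_dir]


def evaluate_command(mixture, results_dir, batch_size=None):
    """Build an argument list for bin/evaluate.py."""
    command = ["./bin/evaluate.py"]
    if batch_size is not None:
        command += ["--batch-size", str(batch_size)]
    return command + [mixture, results_dir]


# "my_mixture" and the bucket path are hypothetical placeholders.
train = fine_tune_command(
    "my_mixture", "gs://my-bucket/results", learning_rate=2e-3, batch_size=32
)
evaluate = evaluate_command("my_mixture", "gs://my-bucket/results")
```

Each list can be passed to subprocess.run(..., check=True) once the TPU_NAME and TPU_TOPOLOGY environment variables are set.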

Tests and Code Quality

The code is formatted with black. You can run the formatter using the bin/format script:

$ ./bin/format

To run code quality checks, use the bin/verify script:

$ ./bin/verify

For fine-grained control of which tests to run, use pytest directly:

$ pytest

You can also skip slower tests by passing the --skip-slow (-s) flag:

$ pytest --skip-slow

Citation

Unicorn on Rainbow is an AAAI 2021 paper. Please check back here soon for the bibtex citation.

Contact

For public, non-sensitive questions and concerns, please file an issue on this repository.

For private or sensitive inquiries, email mosaic on the allenai.org website.
