A curated list of programmatic weak supervision papers and resources

Overview

Awesome-Weak-Supervision

A curated list of programmatic/rule-based weak supervision papers and resources.

Contents

An overview of weak supervision

Blogs

An Overview of Weak Supervision

Building NLP Classifiers Cheaply With Transfer Learning and Weak Supervision

Videos

Theory & Systems for Weak Supervision | Chinese Version

Lecture Notes

Lecture Notes on Weak Supervision

Algorithm

Data Programming: Creating Large Training Sets, Quickly. Alex Ratner NeurIPS 2016

Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data. Paroma Varma FILM-NeurIPS 2016

Training Complex Models with Multi-Task Weak Supervision. Alex Ratner AAAI 2019

Data Programming using Continuous and Quality-Guided Labeling Functions. Oishik Chatterjee AAAI 2020

Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods. Dan Fu ICML 2020

Learning from Rules Generalizing Labeled Exemplars. Abhijeet Awasthi ICLR 2020

Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings. Mayee F. Chen 2020

Learning the Structure of Generative Models without Labeled Data. Stephen H. Bach ICML 2017

Inferring Generative Model Structure with Static Analysis. Paroma Varma NeurIPS 2017

Learning Dependency Structures for Weak Supervision Models. Paroma Varma ICML 2019

Self-Training with Weak Supervision. Giannis Karamanolakis NAACL 2021

Interactive Programmatic Labeling for Weak Supervision. Benjamin Cohen-Wang KDD Workshop 2019

Pairwise Feedback for Data Programming. Benedikt Boecking NeurIPS 2019 workshop on Learning with Rich Experience: Integration of Learning Paradigms

Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling. Benedikt Boecking ICLR 2021

Active WeaSuL: Improving Weak Supervision with Active Learning. Samantha Biegel ICLR WeaSuL 2021

System

Snorkel: Rapid Training Data Creation with Weak Supervision. Alex Ratner VLDB 2018

Snorkel DryBell: A Case Study in Deploying Weak Supervision at Industrial Scale. Stephen H. Bach SIGMOD (Industrial) 2019

Snuba: Automating Weak Supervision to Label Training Data. Paroma Varma VLDB 2019

Migrating a Privacy-Safe Information Extraction System to a Software 2.0 Design. Ying Sheng CIDR 2020

Overton: A Data System for Monitoring and Improving Machine-Learned Products. Christopher Ré CIDR 2020

Ruler: Data Programming by Demonstration for Document Labeling. Sara Evensen EMNLP 2020 Findings

skweak: Weak Supervision Made Easy for NLP. Pierre Lison 2021

Application

CV

Scene Graph Prediction with Limited Labels. Vincent Chen ICCV 2019

Multi-Resolution Weak Supervision for Sequential Data. Paroma Varma NeurIPS 2019

Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels. Daniel Y. Fu SOSP 2019

GOGGLES: Automatic Image Labeling with Affinity Coding. Nilaksh Das SIGMOD 2020

Cut out the annotator, keep the cutout: better segmentation with weak supervision. Sarah Hooper ICLR 2021

Task Programming: Learning Data Efficient Behavior Representations. Jennifer J. Sun CVPR 2021

NLP

Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach. Liyuan Liu EMNLP 2017

Training Classifiers with Natural Language Explanations. Braden Hancock ACL 2018

Deep Text Mining of Instagram Data without Strong Supervision. Kim Hammar ICWI 2018

Bootstrapping Conversational Agents With Weak Supervision. Neil Mallinar AAAI 2019

Weakly Supervised Sequence Tagging from Noisy Rules. Esteban Safranchik AAAI 2020

NERO: A Neural Rule Grounding Framework for Label-Efficient Relation Extraction. Wenxuan Zhou WWW 2020

Named Entity Recognition without Labelled Data: A Weak Supervision Approach. Pierre Lison ACL 2020

Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach. Yue Yu NAACL 2021

BERTifying Hidden Markov Models for Multi-Source Weakly Supervised Named Entity Recognition. Yinghao Li ACL 2021

RL

Generating Multi-Agent Trajectories using Programmatic Weak Supervision. Eric Zhan ICLR 2019

Others

Generating Training Labels for Cardiac Phase-Contrast MRI Images. Vincent Chen MED-NeurIPS 2017

Osprey: Weak Supervision of Imbalanced Extraction Problems without Code. Eran Bringer SIGMOD DEEM Workshop 2019

Weakly Supervised Classification of Rare Aortic Valve Malformations Using Unlabeled Cardiac MRI Sequences. Jason Fries Nature Communications 2019

Doubly Weak Supervision of Deep Learning Models for Head CT. Khaled Saab MICCAI 2019

A clinical text classification paradigm using weak supervision and deep representation. Yanshan Wang BMC MIDM 2019

A machine-compiled database of genome-wide association studies. Volodymyr Kuleshov Nature Communications 2019

Weak Supervision as an Efficient Approach for Automated Seizure Detection in Electroencephalography. Khaled Saab NPJ Digital Medicine 2020

Extracting Chemical Reactions From Text Using Snorkel. Emily Mallory BMC Bioinformatics 2020

Cross-Modal Data Programming Enables Rapid Medical Machine Learning. Jared A. Dunnmon Patterns 2020

SwellShark: A Generative Model for Biomedical Named Entity Recognition without Labeled Data. Jason Fries

Ontology-driven weak supervision for clinical entity classification in electronic health records. Jason Fries Nature Communications 2021

Utilizing Weak Supervision to Infer Complex Objects and Situations in Autonomous Driving Data. Zhenzhen Weng IV 2019

Multi-frame Weak Supervision to Label Wearable Sensor Data. Saelig Khattar ICML Time Series Workshop 2019

Thesis

Accelerating Machine Learning with Training Data Management. Alex Ratner

Weak Supervision From High-Level Abstractions. Braden Jay Hancock

Other Weak Supervision Paradigms

Label-name Only Supervision

Weakly-Supervised Neural Text Classification. Yu Meng CIKM 2018

Weakly-Supervised Hierarchical Text Classification. Yu Meng AAAI 2019

Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding. Jiaxin Huang EMNLP 2020

Text Classification Using Label Names Only: A Language Model Self-Training Approach. Yu Meng EMNLP 2020

Hierarchical Metadata-Aware Document Categorization under Weak Supervision. Yu Zhang WSDM 2021

Owner
Jieyu Zhang
CS PhD