Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].

Related tags

Deep LearningPLBART
Overview

PLBART

Code pre-release of our work, Unified Pre-training for Program Understanding and Generation accepted at NAACL 2021.

Note. A detailed documentation is coming soon.

Pre-training data

PLBART is pre-trained on Java and Python functions and natural language descriptions collected from Github and StackOverflow.

Evaluation tasks

We evaluated PLBART on five tasks.

  • Code summarization [REF]
  • Code generation [REF]
  • Code translation [REF]
  • Clone detection [REF]
  • Vulnerability REF [REF]

Notes

  • We will publish the pretrained PLBART checkpoint soon.
  • We list all the files in this repository here.

Acknowledgement

PLBART uses Fairseq, codeXglue, and TransCoder and thanks the authors of these works for their contribution.

Citation

@inproceedings{ahmad2020summarization,
    author = {Ahmad, Wasi Uddin and Chakraborty, Saikat and Ray, Baishakhi and Chang, Kai-Wei},
    booktitle = {Proceedings of the 2021 Conference of the North {A}merican Chapter of the Association for Computational Linguistics},
    title = {Unified Pre-training for Program Understanding and Generation},
    year = {2021}
}
Owner
Wasi Ahmad
I am a Ph.D. student in CS at UCLA.
Wasi Ahmad
A modification of Daniel Russell's notebook merged with Katherine Crowson's hq-skip-net changes

Edits made to this repo by Katherine Crowson I have added several features to this repository for use in creating higher quality generative art (featu

Paul Fishwick 10 May 07, 2022
Simple image captioning model - CLIP prefix captioning.

Simple image captioning model - CLIP prefix captioning.

688 Jan 04, 2023
This repo contains the code required to train the multivariate time-series Transformer.

Multi-Variate Time-Series Transformer This repo contains the code required to train the multivariate time-series Transformer. Download the data The No

Gregory Duthé 4 Nov 24, 2022
N-gram models- Unsmoothed, Laplace, Deleted Interpolation

N-gram models- Unsmoothed, Laplace, Deleted Interpolation

Ravika Nagpal 1 Jan 04, 2022
List of papers, code and experiments using deep learning for time series forecasting

Deep Learning Time Series Forecasting List of state of the art papers focus on deep learning and resources, code and experiments using deep learning f

Alexander Robles 2k Jan 06, 2023
Easy and comprehensive assessment of predictive power, with support for neuroimaging features

Documentation: https://raamana.github.io/neuropredict/ News As of v0.6, neuropredict now supports regression applications i.e. predicting continuous t

Pradeep Reddy Raamana 93 Nov 29, 2022
labelpix is a graphical image labeling interface for drawing bounding boxes

Welcome to labelpix 👋 labelpix is a graphical image labeling interface for drawing bounding boxes. 🏠 Homepage Install pip install -r requirements.tx

schissmantics 26 May 24, 2022
Implementation of Shape Generation and Completion Through Point-Voxel Diffusion

Shape Generation and Completion Through Point-Voxel Diffusion Project | Paper Implementation of Shape Generation and Completion Through Point-Voxel Di

Linqi Zhou 103 Dec 29, 2022
An implementation of the proximal policy optimization algorithm

PPO Pytorch C++ This is an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch. It uses a simple TestEnvironment t

Martin Huber 59 Dec 09, 2022
Contains modeling practice materials and homework for the Computational Neuroscience course at Okinawa Institute of Science and Technology

A310 Computational Neuroscience - Okinawa Institute of Science and Technology, 2022 This repository contains modeling practice materials and homework

Sungho Hong 1 Jan 24, 2022
This repository is based on Ultralytics/yolov5, with adjustments to enable polygon prediction boxes.

Polygon-Yolov5 This repository is based on Ultralytics/yolov5, with adjustments to enable polygon prediction boxes. Section I. Description The codes a

xinzelee 226 Jan 05, 2023
Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease

Heart_Disease_Classification Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease Dataset

Ashish 1 Jan 30, 2022
This is the repo for Uncertainty Quantification 360 Toolkit.

UQ360 The Uncertainty Quantification 360 (UQ360) toolkit is an open-source Python package that provides a diverse set of algorithms to quantify uncert

International Business Machines 207 Dec 30, 2022
PyTorch Code for "Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning"

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning [Project Page] [Paper] Wenlong Huang1, Igor Mordatch2, Pieter Abbeel1,

Wenlong Huang 40 Nov 22, 2022
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.3k Dec 29, 2022
Repository of best practices for deep learning in Julia, inspired by fastai

FastAI Docs: Stable | Dev FastAI.jl is inspired by fastai, and is a repository of best practices for deep learning in Julia. Its goal is to easily ena

FluxML 532 Jan 02, 2023
Rethinking Nearest Neighbors for Visual Classification

Rethinking Nearest Neighbors for Visual Classification arXiv Environment settings Check out scripts/env_setup.sh Setup data Download the following fin

Menglin Jia 29 Oct 11, 2022
Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Repository under construction. Example dataset, checkpoints, and training/testing scripts will be avaible soon! 💡 Collated best practices from most p

4 Jun 26, 2022
PyTorch CZSL framework containing GQA, the open-world setting, and the CGE and CompCos methods.

Compositional Zero-Shot Learning This is the official PyTorch code of the CVPR 2021 works Learning Graph Embeddings for Compositional Zero-shot Learni

EML Tübingen 70 Dec 27, 2022
A Simple LSTM-Based Solution for "Heartbeat Signal Classification and Prediction" in Tianchi

LSTM-Time-Series-Prediction A Simple LSTM-Based Solution for "Heartbeat Signal Classification and Prediction" in Tianchi Contest. The Link of the Cont

KevinCHEN 1 Jun 13, 2022