This code is part of the reproducibility package for the SANER 2022 paper "Generating Clarifying Questions for Query Refinement in Source Code Search".

Clarifying Questions for Query Refinement in Source Code Search

The package consists of five folders:

  • codesearch/ - API to access the CodeSearchNet datasets and neural bag-of-words code retrieval method.

  • cq/ - Implementation of the ZaCQ system, including an implementation of the TaskNav development task extraction algorithm and two baseline query refinement methods.

  • data/ - Includes a pretrained code search model and config files for task extraction.

  • evaluation/ - Scripts to run and evaluate ZaCQ.

  • interface/ - Backend and Frontend servers for a search interface implementing ZaCQ.
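
As a rough picture of what a bag-of-words code retrieval method does, the sketch below mean-pools token embeddings into one vector per query and per candidate, then ranks candidates by cosine similarity. It is a minimal illustration under stated assumptions, not the code in codesearch/: the embedding table, the candidate functions, and the names `embed`, `cosine`, and `search` are all hypothetical, and a trained model would learn its embeddings rather than hard-code them.

```python
import math

# Toy token embeddings (made-up values); a trained model would learn these.
EMBEDDINGS = {
    "read": [1.0, 0.0, 0.0],
    "file": [0.9, 0.1, 0.0],
    "sort": [0.0, 1.0, 0.0],
    "list": [0.0, 0.9, 0.1],
    "http": [0.0, 0.0, 1.0],
    "request": [0.0, 0.1, 0.9],
}
DIM = 3


def embed(tokens):
    """Mean-pool the embeddings of known tokens into a single vector."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0] * DIM
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def search(query, candidates, top_k=2):
    """Rank candidate functions by similarity to the query vector."""
    query_vec = embed(query.lower().split())
    scored = [(cosine(query_vec, embed(tokens)), name)
              for name, tokens in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical candidates: function name -> tokens taken from its code.
CANDIDATES = {
    "read_file": ["read", "file"],
    "sort_list": ["sort", "list"],
    "send_request": ["http", "request"],
}

print(search("sort a list", CANDIDATES))
```

Unknown tokens such as "a" are simply skipped here, which is one common simplification for bag-of-words models.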

Setup

  1. Clone the CodeSearchNet package into the ZaCQ root directory and download the CSN datasets:
cd ZaCQ
git clone https://github.com/github/CodeSearchNet.git
cd CodeSearchNet/scripts
./download_and_preprocess
  2. From the ZaCQ root directory, use a CSN model to create vector representations for candidate code search results. A pretrained Neural BoW model is included in this package.
cd codesearch
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python _setup.py

This will save and index vectors in the data folder. It will also generate search results for the 99 CSN queries.

  3. Task extraction is fairly quick for small sets of code search results, but it is expensive to do repeatedly. To expedite the evaluation, we cache the extracted tasks for the results of the 99 CSN queries, as well as keywords for all functions in the datasets. Run the following from the ZaCQ root directory:
cd cq
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python _setup.py

Cached tasks and keywords are stored in the data folder.
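
The caching idea in the last step can be sketched as follows: compute the expensive result once per query, write it to disk, and read it back on later calls. This is only an illustration under assumptions, not the logic in cq/_setup.py: `extract_tasks` is a stand-in for the real extractor, and the one-JSON-file-per-query layout and the `cached_tasks` name are hypothetical.

```python
import hashlib
import json
import os
import tempfile


def extract_tasks(snippets):
    """Stand-in for an expensive task extractor (hypothetical)."""
    return [snippet.split()[:2] for snippet in snippets]


def cached_tasks(query, snippets, cache_dir):
    """Return (tasks, cache_hit), running the extractor only on a miss."""
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the query so arbitrary query text maps to a safe file name.
    key = hashlib.sha256(query.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return json.load(handle), True
    tasks = extract_tasks(snippets)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(tasks, handle)
    return tasks, False


cache_dir = tempfile.mkdtemp()
snippets = ["convert string to int", "parse json file"]
first, hit_first = cached_tasks("parse input", snippets, cache_dir)
second, hit_second = cached_tasks("parse input", snippets, cache_dir)
print(hit_first, hit_second)
```

The second call reads the stored JSON instead of re-running the extractor, which is the trade the setup step makes for the 99 CSN queries.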

Evaluation

To evaluate ZaCQ and the other query refinement methods on the CSN queries, you may use the following:

cd evaluation
python run_queries.py
python evaluate.py

The run_queries script determines the subset of CSN queries that can be automatically evaluated, and simulates interactive refinement sessions for all valid questions for each language in CSN. For ZaCQ, the script runs through a set of predefined hyperparameter combinations. The script calculates NDCG, MAP, and MRE metrics for each refinement method and hyperparameter configuration, and stores them in the data/output folder.
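
For readers unfamiliar with two of the ranking metrics named above, here is a minimal sketch of textbook NDCG and MAP over lists of relevance labels. It assumes relevance is given per ranked result (binary for MAP) and normalizes NDCG against the ideal ordering of the same list; the exact variants used by run_queries.py may differ, and MRE is left out because its definition is not given here. All function names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal ordering of the same labels."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def average_precision(relevances):
    """Mean of precision@k taken at each rank k holding a relevant result."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Average of average_precision over several ranked lists."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Each list is one query's ranked results: 1 = relevant, 0 = not relevant.
rankings = [[0, 1, 1], [1, 0, 0]]
print(round(ndcg([0, 1]), 4), round(mean_average_precision(rankings), 4))
```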

The evaluate script averages the metrics across all languages after 1 to N rounds of refinement. For ZaCQ, it also records the best-performing hyperparameter combination after each number of rounds.
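
The aggregation described here can be sketched as: group scores by configuration and round, average over languages, then pick the configuration with the highest average for a given round. The records, configuration names, and scores below are invented for illustration, and the record layout is an assumption rather than the format actually written to data/output.

```python
from collections import defaultdict

# Hypothetical records: (language, config, round, NDCG). Values are made up.
RESULTS = [
    ("python", "cfg_a", 1, 0.40), ("java", "cfg_a", 1, 0.50),
    ("python", "cfg_b", 1, 0.45), ("java", "cfg_b", 1, 0.35),
    ("python", "cfg_a", 2, 0.50), ("java", "cfg_a", 2, 0.52),
    ("python", "cfg_b", 2, 0.60), ("java", "cfg_b", 2, 0.58),
]


def average_over_languages(results):
    """Map (config, round) to the mean score across languages."""
    buckets = defaultdict(list)
    for _language, config, round_number, score in results:
        buckets[(config, round_number)].append(score)
    return {key: sum(scores) / len(scores) for key, scores in buckets.items()}


def best_config(averages, round_number):
    """Return the config with the highest average score in a given round."""
    candidates = {config: score
                  for (config, rnd), score in averages.items()
                  if rnd == round_number}
    return max(candidates, key=candidates.get)


averages = average_over_languages(RESULTS)
for round_number in (1, 2):
    print(round_number, best_config(averages, round_number))
```

In this toy data the best configuration changes between rounds, which is why recording the winner per round count can be more informative than a single overall winner.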

Interface

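To run the interactive search interface, you need to run two backend servers and start the GUI server. The paths below are relative to the ZaCQ root directory, and each server will typically need its own terminal: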

cd interface/cqserver
python ClarifyAPI.py
cd interface/searchserver
python SearchAPI.py
cd interface/gui
npm start

By default, you can access the GUI at localhost:3000.
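
If the page does not load, a quick way to check whether anything is listening on the GUI port is a plain TCP connection attempt. The helper below is a generic sketch using only the Python standard library; `port_is_open` is a hypothetical name, and only port 3000 comes from this README (the backend ports are not stated here, so they are not checked).

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("GUI reachable:", port_is_open("localhost", 3000))
```

A successful connection only shows that some process accepts connections on that port, not that it is the ZaCQ GUI.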

Owner
Zachary Eberhart