Boostcamp AI Tech 3rd / Basic Paper reading w.r.t Embedding

Overview

Boostcamp AI Tech 3rd : Basic Paper Reading w.r.t Embedding

TL;DR

1992년부터 2018년도까지 이루어진 word/sentence embedding의 중요한 줄기를 이루는 기초 논문 스터디를 진행하고자 합니다. 

논문 정리 발표에 들어갈 내용

  • 저자가 풀려고 하는 문제는 어떤 것인가?
  • 어떤 식으로 해결하고자 했는가. 어떤 장점이 있는가(시간 여유가 된다면, 이전에는 어떤 방법이 있었고 그 방법들의 단점)
  • 그 방법에 대한 intuition (수학 없이)
  • 방법에 대한 이해(수학적으로)
  • 방법의 성공성을 보여주기 위해 사용한 데이터, 메트릭, 성능비교
  • 부족하다 생각되는 것, 애매한 것, 혹은 좋았던 점 등의 Discussion point

리딩 리스트

Dates Paper(author) Year Presenter File upload Code explained
Class-Based n-gram Models of Natural Language(Peter F Brown, et al.) 1992 소연 설명
Efficient Estimation of Word Representations in Vector Space(Tomas Mikolov, et al) 2013 동진 발표
Distributed Representations of Words and Phrases and their Compositionality(Tomas Mikolov, et al) 2013 나연 설명 skip-gram, CBOW
Distributed Representations of Sentences and Documents(Quoc V. Le and Tomas Mikolov) 2014 기원 설명 Doc2Vec
GloVe: Global Vectors for Word Representation(Jeffrey Pennington, et al.) 2015 수정 설명
Skip-Thought Vectors(Ryan Kiros, et al.) 2015 기범 설명
Enriching Word Vectors with Subword Information(Piotr Bojanowski, et al.) 2017 은기 설명
Universal Sentence Encoder(Daniel Cer et al.) 2018

issue & 추가 스터디 자료

Dates Topic Presenter File upload
04/14 genism을 이용한 word2vec 사용 현지 링크
04/14 negative samping & subsampling 나경 링크
04/14 hierarchical softmax 소연 링크
04/14 negative contrastive estimation(NCE) 수정 링크

스터디 룰

  • 스터디 시간 : 목요일 저녁 9시 30분!
  • 스터디 분량 : 매주 1주씩! (프로덕트 서빙 커리큘럼 기간에 집중할 수 있게 그전에 끝내보아영)
    • 각각 읽고, 질문 최소 1개를 github issue에 올림(+ 거기에 대한 답변을 안다면 답변 달아주기!)
  • 발표자 : 해당 요일에 랜덤 선택. 발표 자료는 자유 양식
    • 논문 발표 : 발표자는 발표 후 정리 내용 해당 레포 폴더파서 업로드. 발표자 외 사람 중 공유하고 싶은 사람은 issues에 남기거나 file upload 에 마찬가지로 링크 추가 가능(자율)
    • 코드뷰 설명: 해당 논문 발표자는 다음주차에 코드뷰 설명(e.g, 어떤 라이브러리로 쉽게 쓸 수 있는지 usage 설명, 알고리즘이 복잡한 경우 코드뷰로 어떻게 구현되었는지 설명 등 본인 기호에 맞게)

참여자

강나경, 김소연, 김현지, 박기범, 임동진, 임수정, 정기원, 한나연 , 김은기

참고 링크

논문을 정리하는 틀과 issues를 통한 discussion이 좋았던 깃헙 레포 참고

리딩 리스트를 참고한 NLP Must Read paper 정리된 깃헙 레포 참고

국내 NLP 리뷰 모임 참고 (season1의 beginners에 중복되는 논문들 있어요!)

Owner
Soyeon Kim
Soyeon Kim
Cancer metastasis detection with neural conditional random field (NCRF)

NCRF Prerequisites Data Whole slide images Annotations Patch images Model Training Testing Tissue mask Probability map Tumor localization FROC evaluat

Baidu Research 731 Jan 01, 2023
3rd Place Solution of the Traffic4Cast Core Challenge @ NeurIPS 2021

3rd Place Solution of Traffic4Cast 2021 Core Challenge This is the code for our solution to the NeurIPS 2021 Traffic4Cast Core Challenge. Paper Our so

7 Jul 25, 2022
D-NeRF: Neural Radiance Fields for Dynamic Scenes

D-NeRF: Neural Radiance Fields for Dynamic Scenes [Project] [Paper] D-NeRF is a method for synthesizing novel views, at an arbitrary point in time, of

Albert Pumarola 291 Jan 02, 2023
Source code, data, and evaluation details for “Cross-Lingual Citations in English Papers: A Large-Scale Analysis of Prevalence, Formation, and Ramifications”

Analysis of cross-lingual citations in English papers Contents initial_analysis Source code, data, and evaluation details as published at ICADL2020 ci

Tarek Saier 1 Oct 27, 2022
Instance-based label smoothing for improving deep neural networks generalization and calibration

Instance-based Label Smoothing for Neural Networks Pytorch Implementation of the algorithm. This repository includes a new proposed method for instanc

Mohamed Maher 1 Aug 13, 2022
Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Jinsung Yoon 532 Dec 31, 2022
A really easy-to-use and powerful sudoku solver.

SodukuSolver This is a really useful sudoku solver with a Qt gui. USAGE Enter the numbers in and click "RUN"! If you don't want to wait, simply press

Ujhhgtg Teams 11 Jun 02, 2022
Code and data for "TURL: Table Understanding through Representation Learning"

TURL This Repo contains code and data for "TURL: Table Understanding through Representation Learning". Environment and Setup Data Pretraining Finetuni

SunLab-OSU 63 Nov 23, 2022
Semi-Supervised Learning, Object Detection, ICCV2021

End-to-End Semi-Supervised Object Detection with Soft Teacher By Mengde Xu*, Zheng Zhang*, Han Hu, Jianfeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai,

Microsoft 789 Dec 27, 2022
BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)

BARF 🤮 : Bundle-Adjusting Neural Radiance Fields Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey IEEE International Conference on Comp

Chen-Hsuan Lin 539 Dec 28, 2022
ReferFormer - Official Implementation of ReferFormer

The official implementation of the paper: Language as Queries for Referring Video Object Segmentation Language as Queries for Referring Video Object S

Jonas Wu 232 Dec 29, 2022
HyDiff: Hybrid Differential Software Analysis

HyDiff: Hybrid Differential Software Analysis This repository provides the tool and the evaluation subjects for the paper HyDiff: Hybrid Differential

Yannic Noller 22 Oct 20, 2022
Implementing DropPath/StochasticDepth in PyTorch

%load_ext memory_profiler Implementing Stochastic Depth/Drop Path In PyTorch DropPath is available on glasses my computer vision library! Introduction

Francesco Saverio Zuppichini 13 Jan 05, 2023
Real-time ground filtering algorithm of cloud points acquired using Terrestrial Laser Scanner (TLS)

This repository contains tools to simulate the ground filtering process of a registered point cloud. The repository contains two filtering methods. The first method uses a normal vector, and fit to p

5 Aug 25, 2022
GAN encoders in PyTorch that could match PGGAN, StyleGAN v1/v2, and BigGAN. Code also integrates the implementation of these GANs.

MTV-TSA: Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions. This is the official code release fo

owl 37 Dec 24, 2022
“英特尔创新大师杯”深度学习挑战赛 赛道3:CCKS2021中文NLP地址相关性任务

基于 bert4keras 的一个baseline 不作任何 数据trick 单模 线上 最高可到 0.7891 # 基础 版 train.py 0.7769 # transformer 各层 cls concat 明神的trick https://xv44586.git

孙永松 7 Dec 28, 2021
Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices Abstract For practical deep neural network design on mobile devices, it is e

11 Dec 30, 2022
Security evaluation module with onnx, pytorch, and SecML.

🚀 🐼 🔥 PandaVision Integrate and automate security evaluations with onnx, pytorch, and SecML! Installation Starting the server without Docker If you

Maura Pintor 11 Apr 12, 2022
BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.

Overview BisQue is a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for up to 5D

Vision Research Lab @ UCSB 26 Nov 29, 2022
Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation

Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation The code of: Context Decoupling Augmentation for Weakly Supervised Semanti

54 Dec 12, 2022