Korean Sentence Embedding Repository

Overview

Korean-Sentence-Embedding

๐Ÿญ Korean sentence embedding repository. You can download the pre-trained models and inference right away, also it provides environments where individuals can train models.

Baseline Models

Baseline models used for Korean sentence embedding: KLUE PLMs

| Model | Embedding size | Hidden size | # Layers | # Heads |
|---|---|---|---|---|
| KLUE-BERT-base | 768 | 768 | 12 | 12 |
| KLUE-RoBERTa-base | 768 | 768 | 12 | 12 |

NOTE: All the pretrained models are uploaded to the Hugging Face Model Hub; see https://huggingface.co/klue.
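
The baseline encoders load directly with the transformers library; as a quick, illustrative sanity check (the input sentence here is arbitrary):

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('klue/bert-base')
model = AutoModel.from_pretrained('klue/bert-base')

inputs = tokenizer('한국어 문장 임베딩 예시입니다.', return_tensors='pt')
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, sequence_length, 768])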

How to start

  • Get the datasets for training or testing:
bash get_model_dataset.sh
  • If you want to run inference quickly, download the pre-trained models and then start one of the downstream tasks:
bash get_model_checkpoint.sh
cd KoSBERT/
python SemanticSearch.py

Available Models

  1. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks [SBERT]-[EMNLP 2019]
  2. SimCSE: Simple Contrastive Learning of Sentence Embeddings [SimCSE]-[EMNLP 2021]

KoSentenceBERT

  • 🤗 Model Training
  • Dataset
    • Train: snli_1.0_train.ko.tsv (first phase: NLI training), sts-train.tsv (second phase: continued training on STS)
    • Valid: sts-dev.tsv
    • Test: sts-test.tsv
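
The two-phase recipe can be sketched with the sentence-transformers training API. This is a minimal illustration rather than the repository's training script: the losses follow the SBERT paper (softmax classification on NLI, then cosine-similarity regression on STS), and the InputExample rows are placeholders for data read from the TSV files above.

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('klue/bert-base')  # mean pooling is attached automatically

# Phase 1: NLI with a 3-way softmax classification loss (entailment/neutral/contradiction)
nli_data = [InputExample(texts=['premise', 'hypothesis'], label=0)]  # placeholder for snli_1.0_train.ko.tsv rows
nli_loader = DataLoader(nli_data, shuffle=True, batch_size=16)
nli_loss = losses.SoftmaxLoss(model, model.get_sentence_embedding_dimension(), num_labels=3)
model.fit(train_objectives=[(nli_loader, nli_loss)], epochs=1)

# Phase 2: continued training on STS with cosine-similarity regression (gold scores rescaled to [0, 1])
sts_data = [InputExample(texts=['sentence 1', 'sentence 2'], label=0.8)]  # placeholder for sts-train.tsv rows
sts_loader = DataLoader(sts_data, shuffle=True, batch_size=16)
sts_loss = losses.CosineSimilarityLoss(model)
model.fit(train_objectives=[(sts_loader, sts_loss)], epochs=4)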

KoSimCSE

  • 🤗 Model Training
  • Dataset
    • Train: snli_1.0_train.ko.tsv + multinli.train.ko.tsv
    • Valid: sts-dev.tsv
    • Test: sts-test.tsv
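
Supervised SimCSE treats the entailment hypothesis of each premise as its positive and the contradiction hypothesis as a hard negative, scoring them against all in-batch candidates with an InfoNCE loss. A minimal sketch of that objective (the function name and temperature are illustrative, not the repository's code):

import torch
import torch.nn.functional as F

def simcse_sup_loss(anchor, positive, negative, temperature=0.05):
    # anchor, positive, negative: (batch, hidden) sentence embeddings
    pos_sim = F.cosine_similarity(anchor.unsqueeze(1), positive.unsqueeze(0), dim=-1)  # (B, B)
    neg_sim = F.cosine_similarity(anchor.unsqueeze(1), negative.unsqueeze(0), dim=-1)  # (B, B)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature  # (B, 2B)
    labels = torch.arange(anchor.size(0), device=anchor.device)  # matching positives sit on the diagonal
    return F.cross_entropy(logits, labels)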

Performance

  • Semantic Textual Similarity test set results
| Model | Cosine Pearson | Cosine Spearman | Euclidean Pearson | Euclidean Spearman | Manhattan Pearson | Manhattan Spearman | Dot Pearson | Dot Spearman |
|---|---|---|---|---|---|---|---|---|
| KoSBERT†SKT | 78.81 | 78.47 | 77.68 | 77.78 | 77.71 | 77.83 | 75.75 | 75.22 |
| KoSBERT-base | 82.13 | 82.25 | 80.67 | 80.75 | 80.69 | 80.78 | 77.96 | 77.90 |
| KoSRoBERTa-base | 80.70 | 81.03 | 80.97 | 81.06 | 80.84 | 80.97 | 79.20 | 78.93 |
| KoSimCSE-BERT†SKT | 82.12 | 82.56 | 81.84 | 81.63 | 81.99 | 81.74 | 79.55 | 79.19 |
| KoSimCSE-BERT-base | 82.73 | 83.51 | 82.32 | 82.78 | 82.43 | 82.88 | 77.86 | 76.70 |
| KoSimCSE-RoBERTa-base | 83.64 | 84.05 | 83.32 | 83.84 | 83.33 | 83.79 | 80.92 | 79.84 |
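
Each column pair reports Pearson and Spearman correlation (×100) between gold STS scores and one similarity function over the embedding pairs (cosine similarity, negative Euclidean and Manhattan distance, dot product). A small sketch for the cosine columns, assuming emb1, emb2, and gold_scores are already computed:

import numpy as np
from scipy.stats import pearsonr, spearmanr

def cosine_sts_eval(emb1, emb2, gold_scores):
    # emb1, emb2: (N, dim) embedding arrays for the two sides of each test pair
    cos = np.sum(emb1 * emb2, axis=1) / (np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1))
    return 100 * pearsonr(gold_scores, cos)[0], 100 * spearmanr(gold_scores, cos)[0]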

Downstream Tasks

  • KoSBERT: Semantic Search, Clustering
python SemanticSearch.py
python Clustering.py
  • KoSimCSE: Semantic Search
python SemanticSearch.py

Semantic Search (KoSBERT)

from sentence_transformers import SentenceTransformer, util
import numpy as np

model_path = '../Checkpoint/KoSBERT/kosbert-klue-bert-base'

embedder = SentenceTransformer(model_path)

# Corpus with example sentences
corpus = ['한 남자가 음식을 먹는다.',
          '한 남자가 빵 한 조각을 먹는다.',
          '그 여자가 아이를 돌본다.',
          '한 남자가 말을 탄다.',
          '한 여자가 바이올린을 연주한다.',
          '두 남자가 수레를 숲 속으로 밀었다.',
          '한 남자가 담으로 싸인 땅에서 백마를 타고 있다.',
          '원숭이 한 마리가 드럼을 연주한다.',
          '치타 한 마리가 먹이 뒤에서 달리고 있다.']

corpus_embeddings = embedder.encode(corpus, convert_to_tensor=True)

# Query sentences:
queries = ['한 남자가 파스타를 먹는다.',
           '고릴라 의상을 입은 누군가가 드럼을 연주하고 있다.',
           '치타가 들판을 가로 질러 먹이를 쫓는다.']

# Find the closest 5 sentences of the corpus for each query sentence based on cosine similarity
top_k = 5
for query in queries:
    query_embedding = embedder.encode(query, convert_to_tensor=True)
    cos_scores = util.pytorch_cos_sim(query_embedding, corpus_embeddings)[0]
    cos_scores = cos_scores.cpu()

    # We use np.argpartition to only partially sort the top_k results
    top_results = np.argpartition(-cos_scores, range(top_k))[0:top_k]

    print("\n\n======================\n\n")
    print("Query:", query)
    print("\nTop 5 most similar sentences in corpus:")

    for idx in top_results[0:top_k]:
        print(corpus[idx].strip(), "(Score: %.4f)" % (cos_scores[idx]))
  • Results are as follows:

Query: 한 남자가 파스타를 먹는다.

Top 5 most similar sentences in corpus:
한 남자가 음식을 먹는다. (Score: 0.6141)
한 남자가 빵 한 조각을 먹는다. (Score: 0.5952)
한 남자가 말을 탄다. (Score: 0.1231)
한 남자가 담으로 싸인 땅에서 백마를 타고 있다. (Score: 0.0752)
두 남자가 수레를 숲 속으로 밀었다. (Score: 0.0486)


======================


Query: 고릴라 의상을 입은 누군가가 드럼을 연주하고 있다.

Top 5 most similar sentences in corpus:
원숭이 한 마리가 드럼을 연주한다. (Score: 0.6656)
치타 한 마리가 먹이 뒤에서 달리고 있다. (Score: 0.2988)
한 여자가 바이올린을 연주한다. (Score: 0.1566)
한 남자가 말을 탄다. (Score: 0.1112)
한 남자가 담으로 싸인 땅에서 백마를 타고 있다. (Score: 0.0262)


======================


Query: 치타가 들판을 가로 질러 먹이를 쫓는다.

Top 5 most similar sentences in corpus:
치타 한 마리가 먹이 뒤에서 달리고 있다. (Score: 0.7570)
두 남자가 수레를 숲 속으로 밀었다. (Score: 0.3658)
원숭이 한 마리가 드럼을 연주한다. (Score: 0.3583)
한 남자가 말을 탄다. (Score: 0.0505)
그 여자가 아이를 돌본다. (Score: -0.0087)

Clustering (KoSBERT)

from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model_path = '../Checkpoint/KoSBERT/kosbert-klue-bert-base'

embedder = SentenceTransformer(model_path)

# Corpus with example sentences
corpus = ['한 남자가 음식을 먹는다.',
          '한 남자가 빵 한 조각을 먹는다.',
          '그 여자가 아이를 돌본다.',
          '한 남자가 말을 탄다.',
          '한 여자가 바이올린을 연주한다.',
          '두 남자가 수레를 숲 속으로 밀었다.',
          '한 남자가 담으로 싸인 땅에서 백마를 타고 있다.',
          '원숭이 한 마리가 드럼을 연주한다.',
          '치타 한 마리가 먹이 뒤에서 달리고 있다.',
          '한 남자가 파스타를 먹는다.',
          '고릴라 의상을 입은 누군가가 드럼을 연주하고 있다.',
          '치타가 들판을 가로 질러 먹이를 쫓는다.']

corpus_embeddings = embedder.encode(corpus)

# Perform k-means clustering with sklearn
num_clusters = 5
clustering_model = KMeans(n_clusters=num_clusters)
clustering_model.fit(corpus_embeddings)
cluster_assignment = clustering_model.labels_

# Group sentences by their assigned cluster
clustered_sentences = [[] for _ in range(num_clusters)]
for sentence_id, cluster_id in enumerate(cluster_assignment):
    clustered_sentences[cluster_id].append(corpus[sentence_id])

for i, cluster in enumerate(clustered_sentences):
    print("Cluster ", i+1)
    print(cluster)
    print("")
  • Results are as follows:
Cluster  1
['한 남자가 음식을 먹는다.', '한 남자가 빵 한 조각을 먹는다.', '한 남자가 파스타를 먹는다.']

Cluster  2
['원숭이 한 마리가 드럼을 연주한다.', '고릴라 의상을 입은 누군가가 드럼을 연주하고 있다.']

Cluster  3
['한 남자가 말을 탄다.', '두 남자가 수레를 숲 속으로 밀었다.', '한 남자가 담으로 싸인 땅에서 백마를 타고 있다.']

Cluster  4
['치타 한 마리가 먹이 뒤에서 달리고 있다.', '치타가 들판을 가로 질러 먹이를 쫓는다.']

Cluster  5
['그 여자가 아이를 돌본다.', '한 여자가 바이올린을 연주한다.']
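
KMeans requires choosing num_clusters up front; as an alternative sketch, scikit-learn's AgglomerativeClustering can derive the number of clusters from a distance threshold on the same embeddings (the threshold below is an assumption you would tune):

from sklearn.cluster import AgglomerativeClustering

clustering_model = AgglomerativeClustering(n_clusters=None, distance_threshold=1.5)
clustering_model.fit(corpus_embeddings)
print("Found", clustering_model.n_clusters_, "clusters")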

References

@misc{park2021klue,
  title={KLUE: Korean Language Understanding Evaluation},
  author={Sungjoon Park and Jihyung Moon and Sungdong Kim and Won Ik Cho and Jiyoon Han and Jangwon Park and Chisung Song and Junseong Kim and Yongsook Song and Taehwan Oh and Joohong Lee and Juhyun Oh and Sungwon Lyu and Younghoon Jeong and Inkwon Lee and Sangwoo Seo and Dongjun Lee and Hyunwoo Kim and Myeonghwa Lee and Seongbo Jang and Seungwon Do and Sunkyoung Kim and Kyungtae Lim and Jongwon Lee and Kyumin Park and Jamin Shin and Seonghyun Kim and Lucy Park and Alice Oh and Jung-Woo Ha and Kyunghyun Cho},
  year={2021},
  eprint={2105.09680},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@inproceedings{gao2021simcse,
  title={{SimCSE}: Simple Contrastive Learning of Sentence Embeddings},
  author={Gao, Tianyu and Yao, Xingcheng and Chen, Danqi},
  booktitle={Empirical Methods in Natural Language Processing (EMNLP)},
  year={2021}
}

@article{ham2020kornli,
  title={KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding},
  author={Ham, Jiyeon and Choe, Yo Joong and Park, Kyubyong and Choi, Ilji and Soh, Hyungjoon},
  journal={arXiv preprint arXiv:2004.03289},
  year={2020}
}

@inproceedings{reimers-2019-sentence-bert,
  title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks},
  author={Reimers, Nils and Gurevych, Iryna},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
  month={11},
  year={2019},
  publisher={Association for Computational Linguistics},
  url={http://arxiv.org/abs/1908.10084}
}