CCF BDCI BERT System Tuning Challenge Baseline (PyTorch Version)

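This version is implemented with Hugging Face transformers on the PyTorch backend. Because the implementation reads data through OneFlow's dataloader, OneFlow must also be installed. For data loading in other frameworks, refer to the implementation of the OneflowDataloaderToPytorchDataset class.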
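The adapter idea mentioned above can be illustrated with a minimal sketch. This is not the repository's OneflowDataloaderToPytorchDataset; the class name BatchSourceToIterableDataset and its parameters (next_batch, num_batches, convert) are hypothetical, and the batch source is assumed to be a callable that returns a dict of named arrays on each call.

```python
try:
    from torch.utils.data import IterableDataset
except ImportError:  # fallback so the sketch still runs where PyTorch is absent
    IterableDataset = object


class BatchSourceToIterableDataset(IterableDataset):
    """Hypothetical adapter: exposes a batch-producing callable as an iterable dataset."""

    def __init__(self, next_batch, num_batches, convert=lambda value: value):
        # next_batch: callable returning one batch as a dict {name: array-like}
        # num_batches: how many batches make up one pass over the data
        # convert: maps each value to the target framework's tensor type
        self.next_batch = next_batch
        self.num_batches = num_batches
        self.convert = convert

    def __len__(self):
        return self.num_batches

    def __iter__(self):
        for _ in range(self.num_batches):
            batch = self.next_batch()
            # Convert every field of the batch before handing it to the trainer.
            yield {name: self.convert(value) for name, value in batch.items()}
```

Since each item is already a full batch, such a dataset would typically be wrapped as DataLoader(dataset, batch_size=None) so that PyTorch does not batch it again. The convert function is where framework-specific conversion would go, for example going through NumPy arrays; that detail is an assumption here and may differ from what the repository does.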

Usage

Install the dependencies (prerequisite: PyTorch and OneFlow are already installed in the environment)

pip install transformers pandas
git clone https://github.com/tea321000/hugging_face_competition
cd hugging_face_competition

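Run train_BERT_base.sh and train_BERT_large.sh to start the single-machine, single-GPU baselines. Keeping the other parameters unchanged, adjust the hidden_size parameter in the shell scripts to see how GPU memory usage changes with hidden_size (watch -n 0.1 nvidia-smi gives a live view).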

# --hidden_size is the parameter to adjust
python train.py \
--ofrecord_path sample_seq_len_512_example \
--lr 1e-4 --epochs 10 \
--train_batch_size 2 \
--seq_length=512 \
--max_predictions_per_seq=80 \
--num_hidden_layers=24 \
--num_attention_heads=16 \
--hidden_size=1024 \
--vocab_size=30522
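To get a rough feel for why hidden_size has such a strong effect on memory, the sketch below counts the parameters of a standard BERT encoder from the flags used above. It makes several assumptions that are not stated in this README: the feed-forward width is 4 * hidden_size, there are 512 position embeddings and 2 token types, the pre-training heads are ignored, and training uses fp32 with an Adam-style optimizer. The function names are illustrative only. Activation memory, which also grows with hidden_size, batch size and sequence length, is not included, so the numbers are a lower bound on what nvidia-smi will report.

```python
def bert_encoder_param_count(hidden_size, num_hidden_layers, vocab_size,
                             max_position_embeddings=512, type_vocab_size=2,
                             intermediate_size=None):
    """Parameter count of a standard BERT encoder (embeddings + layers + pooler)."""
    if intermediate_size is None:
        intermediate_size = 4 * hidden_size  # assumption: the usual 4x ratio
    h, i = hidden_size, intermediate_size
    # Word, position and token-type embeddings, plus one LayerNorm (weight + bias).
    embeddings = (vocab_size + max_position_embeddings + type_vocab_size) * h + 2 * h
    # Q, K, V and output projections (weights + biases), plus one LayerNorm.
    attention = 4 * (h * h + h) + 2 * h
    # Two dense layers of the feed-forward block, plus one LayerNorm.
    feed_forward = (h * i + i) + (i * h + h) + 2 * h
    pooler = h * h + h
    return embeddings + num_hidden_layers * (attention + feed_forward) + pooler


def static_training_bytes(param_count, bytes_per_value=4):
    """Weights + gradients + two Adam moment buffers (assumed), excluding activations."""
    return 4 * param_count * bytes_per_value


if __name__ == "__main__":
    for hidden_size in (512, 768, 1024):
        n = bert_encoder_param_count(hidden_size, num_hidden_layers=24, vocab_size=30522)
        gib = static_training_bytes(n) / 2**30
        print(f"hidden_size={hidden_size}: {n:,} parameters, about {gib:.2f} GiB static")
```

Note that num_attention_heads does not appear in the count: the heads split the hidden dimension, so changing their number leaves the number of parameters unchanged. The per-layer terms grow with the square of hidden_size, which is why the change in memory is much larger than the change in the flag value.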

Owner

Ziqi Zhou
