This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.

Overview

2022_the_annotated_transformer

Goal

This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.

Key points

  • We have re-factored Harvard NLP's Annotated Trasformer into a shell script version.

  • Dataset utilized Multi30K. (The dataset is small, so you can see the results quickly even on computers with low specifications.)

  • We provide the Colab version along with the shell script version, making it easy to modify the model and test the method.

    https://colab.research.google.com/drive/1SrRmC_Ti8IepeHFNBZBjNxl_wkTSJReC?usp=sharing

  • Loss Graph can be drawn.

  • BLEU Score can be measured.

file structure

├── models
│   ├── __init__.py
│   ├── blocks
│   │   ├── __init__.py
│   │   ├── decoder_layer.py
│   │   ├── encoder_layer.py
│   ├── embedding
│   │   ├── __init__.py
│   │   ├── positional_encoding.py
│   │   └── token_embedding.py
│   ├── layers
│   │   ├── __init__.py
│   │   ├── layer_norm.py
│   │   ├── multi_headed_attention.py
│   │   ├── position_wise_feed_forward.py
│   │   └── sublayer_connection.py
│   ├── model
│   │   ├── __init__.py
│   │   ├── decoder.py
│   │   ├── encoder_decoder.py
│   │   ├── encoder.py
│   │   ├── generator.py
│   └── util.py
├── result
│   ├── loss_graph.png
│   ├── train_loss.txt
│   └── valid_loss.txt
├── saved
├── utils
    ├── __init__.py
    ├── batch.py
    ├── batch_size_fn.py
    ├── bleu.py
    ├── data_loader.py
    ├── epoch_time.py
    ├── greedy_decode.py
    ├── label_smoothing.py
    ├── make_model.py
    ├── NoamOpt.py
    ├── run_epoch.py
    ├── simple_loss_compute.py
    └── tokenizer.py
├── README.md
├── test.py
├── train.py
├── config.py
├── data.py
└── graph.py

Training Result

Train Validation loss graph

image

Test set(unseen data) Translation Example

image

Test set(unseen data) BLEU Score Average: 35.870847920953594

Reference

https://nlp.seas.harvard.edu/2018/04/03/attention.html

https://jalammar.github.io/illustrated-transformer/

https://www.facebook.com/groups/TensorFlowKR/permalink/1618169785190740/

https://github.com/hyunwoongko/transformer

Owner
신재욱
신재욱
The next level Python obfuscator, nearly impossible to deobfuscate.

🐸 Kramer 🐸 Kramer is a next level obfuscation tool written in Python3 allowing you to obfuscate your Python3 code easily and securely. It uses Berse

Billy 114 Dec 26, 2022
A simple linux keylogger project.

The project This project is a simple linux keylogger. When activated, it registers all the actions made with the keyboard. The log files are registere

1 Oct 24, 2021
A python script to turn Ubuntu Desktop in a one stop security platform. The InfoSec Fortress installs the packages,tools, and resources to make Ubuntu 20.04 capable of both offensive and defensive security work.

infosec-fortress A python script to turn Ubuntu Desktop into a strong DFIR/RE System with some teeth (Purple Team Ops)! This is intended to create a s

James 41 Dec 30, 2022
A kAFL based hypervisor fuzzer which fully supports nested VMs

hAFL2 hAFL2 is a kAFL-based hypervisor fuzzer. It is the first open-source fuzzer which is able to target hypervisors natively (including Hyper-V), as

SafeBreach Labs 115 Dec 07, 2022
Anti Supercookie - Confusing the ISP & Escaping the Supercookie

Confusing the ISP & Escaping the Supercookie

Baris Dincer 2 Nov 22, 2022
Auto Tor Ip Changer

AutoTor Auto Tor Ip Changer for Linux! git clone https://github.com/Arest7/AutoTor cd AutoTor pip install -r requirements.txt python3 AutoTor.py follo

Ken Ryuguji 3 Jan 23, 2022
Security-TXT is a python package for retrieving, parsing and manipulating security.txt files.

Security-TXT is a python package for retrieving, parsing and manipulating security.txt files.

Frank 3 Feb 07, 2022
MassStringer, CTF Flag Finder

massStringer MassStringer, CTF Flag Finder Usage: python3 massStringer.py Enter absolute path of the directory to scan for flags Edit "flag = re.searc

SuperTsumu 4 Sep 06, 2022
Ensure secure infrastructure and consistency with the firewall rules

Python Port Scanner This script tries to check if it's possible to make a connection with the specific endpoint port. This is very useful to ensure se

Allan Avelar 7 Feb 26, 2022
ProxyLogon Full Exploit Chain PoC (CVE-2021–26855, CVE-2021–26857, CVE-2021–26858, CVE-2021–27065)

ExProlog ProxyLogon Full Exploit Chain PoC (CVE-2021–26855, CVE-2021–26857, CVE-2021–26858, CVE-2021–27065) Usage: exprolog.py [OPTIONS] ExProlog -

Herwono W. Wijaya 130 Dec 15, 2022
client attack remotely , this script was written for educational purposes only

client attack remotely , this script was written for educational purposes only, do not use against to any victim, which you do not have permission for it

9 Jun 05, 2022
The RDT protocol (RDT3.0,GBN,SR) implementation and performance evaluation code using socket

소켓을 이용한 RDT protocols (RDT3.0,GBN,SR) 구현 및 성능 평가 코드 입니다. 코드를 실행할때 리시버를 먼저 실행하세요. 성능 평가 코드는 패킷 전송 과정을 제외하고 시간당 전송률을 출력합니다. RDT3.0 GBN SR(버그 발견으로 구현중 입니

kimtaeyong98 0 Dec 20, 2021
MozDef: Mozilla Enterprise Defense Platform

MozDef: Documentation: https://mozdef.readthedocs.org/en/latest/ Give MozDef a Try in AWS: The following button will launch the Mozilla Enterprise Def

Mozilla 2.2k Jan 08, 2023
Rapidly enumerate subdomains and domains using rapiddns.io.

Description Simple python module (unofficial) allowing you to access data from rapiddns.io. You can also use it as a module. As mentioned on the rapid

27 Dec 31, 2022
Windows Stack Based Auto Buffer Overflow Exploiter

Autoflow - Windows Stack Based Auto Buffer Overflow Exploiter Autoflow is a tool that exploits windows stack based buffer overflow automatically.

Himanshu Shukla 19 Dec 22, 2022
POC for detecting the Log4Shell (Log4J RCE) vulnerability.

log4shell-poc-py POC for detecting the Log4Shell (Log4J RCE) vulnerability. Run on a system with python3 python3 log4shell-poc.py pathToTargetFile

BCC Risk Advisory 2 Dec 22, 2021
Looks at Python code to search for things which look "dodgy" such as passwords or diffs

dodgy Dodgy is a very basic tool to run against your codebase to search for "dodgy" looking values. It is a series of simple regular expressions desig

Landscape 112 Nov 25, 2022
CVE 2020-14871 Solaris exploit

CVE 2020-14871 Solaris exploit This is a basic ROP based exploit for CVE 2020-14871. CVE 2020-14871 is a vulnerability in Sun Solaris systems. The act

Robin Massink 2 Oct 25, 2022
OMIGOD! OM I GOOD? A free scanner to detect VMs vulnerable to one of the

omigood (OM I GOOD?) This repository contains a free scanner to detect VMs vulnerable to one of the "OMIGOD" vulnerabilities discovered by Wiz's threa

Marco Simioni 13 Jul 13, 2022
PreviewGram is for users that wants get a more private experience with the Telegram's Channel.

PreviewGram is for users that wants get a more private experience with the Telegram's Channel.

1 Sep 25, 2022