API for the GPT-J language model 🦜. Including a FastAPI backend and a streamlit frontend

Overview

gpt-j-api 🦜

GitHub release (latest by date) Python version API up

An API to interact with the GPT-J language model. You can use and test the model in two different ways:

Using the API

  • Python:
import requests
context = "In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English."
payload = {
    "context": context,
    "token_max_length": 512,
    "temperature": 1.0,
    "top_p": 0.9,
}
response = requests.post("http://api.vicgalle.net:5000/generate", params=payload).json()
print(response)
  • Bash:
curl -X 'POST' \
  'http://api.vicgalle.net:5000/generate?context=In%20a%20shocking%20finding%2C%20scientists%20discovered%20a%20herd%20of%20unicorns%20living%20in%20a%20remote%2C%20previously%20unexplored%20valley%2C%20in%20the%20Andes%20Mountains.%20Even%20more%20surprising%20to%20the%20researchers%20was%20the%20fact%20that%20the%20unicorns%20spoke%20perfect%20English.&token_max_length=512&temperature=1&top_p=0.9' \
  -H 'accept: application/json' \
  -d ''

Deployment of the API server

Just ssh into a TPU VM. This code was only tested on the v3-8 variants.

First, install the requirements and get the weigts:

python3 -m pip install -r requirements.txt
wget https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd
sudo apt install zstd
tar -I zstd -xf step_383500_slim.tar.zstd

And just run

python3 serve.py

Then, you can go to http://localhost:5000/docs and use the API!

Deploy the streamlit dashboard

Just run

python3 -m streamlit run streamlit_app.py --server.port 8000

Acknowledgements

Thanks to the support of the TPU Research Cloud, https://sites.research.google/trc/

Comments
  • I've made an extensions using this api

    I've made an extensions using this api

    https://chrome.google.com/webstore/detail/type-j/femdhcgkiiagklmickakfoogeehbjnbh

    You can check it out here

    First i was very hyped up and it felt fun, like I was talking to a machine, but then I lost my enthusiasm and now I feel like it's totally useless xD

    I'm just leaving a link here for you to appreciate you, it became real thanks for you posting this api

    feel free to delete the issue as it's out of scope

    if you got ideas on how to make it commercially succesful - i'll be happy to partner up

    peace

    opened by oogxdd 5
  • Illegal Instruction

    Illegal Instruction

    When installing like described in the readme (fresh conda env,python=3.8, ubuntu) I'll get a illegal instruction immediately after running python serve.py

    (gpt-j-api) […]@[…]:/opt/GPT/gpt-j-api$ python -q -X faulthandler serve.py
    Fatal Python error: Illegal instruction
    
    Current thread 0x00007f358d7861c0 (most recent call first):
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 1166 in create_module
      File "<frozen importlib._bootstrap>", line 556 in module_from_spec
      File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jaxlib/xla_client.py", lin
    e 31 in <module>
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/lib/__init__.py", line 58 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/config.py", line 26 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/__init__.py", line 33 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "serve.py", line 3 in <module>
    Illegal instruction (core dumped)
    

    EDIT

    running this on CPU only, I tried installing jax[CPU] - same resut

    opened by chris-aeviator 4
  • api seems offline

    api seems offline

    When I try to access the API I get the following error: ERR_CONNECTION_TIMED_OUT. But when I try to connect to it using a different IP address it does work. Am I IP banned?

    opened by KoenTech 4
  • Usage

    Usage

    I'm using this to host a Discord chatbot, and though I have slowmode on the channel there's still a lot of usage, and often the API is being used as fast as it can generate completions. Will this harm the experience for others? Should I limit it more? (thanks for making this free but I don't want to take advantage of that too much if it's bad for others)

    opened by Heath123 3
  • Alternative to Google TPU VM?

    Alternative to Google TPU VM?

    Hello,

    I would like to run a local instance of GPT-J, but avoid using Google.

    I have little to no experience in machine learning and its requirements, are there other solutions I could use? (What are the requirements for a machine in order to run GPT-J?)

    Thank you very much!

    opened by birkenbaum 3
  • Is there a way to speed up inference?

    Is there a way to speed up inference?

    Hello, I am currently working on a project where I need quick inference. It needn't be real-time, but something around 7-10 sec would be great. Is there a way to speed up the inference using the API?

    The model does not seem to be a problem as compute_time is around 8sec, but by the time the request arrives it takes around 20 seconds (over 30 on some occasions). Is there a way to make the request a bit faster?

    Thanks,

    opened by Aryagm 2
  • Errno 111

    Errno 111

    Could anyone please fix the following error? Thanks a lot.

    "ConnectionError: ...Failed to establish a new connection: [Errno 111] Connection refused"

    opened by Mather10 1
  • How to make the api public?

    How to make the api public?

    Hey, I was able to get serve.py running with the instructions you gave. But now I want to make the api public and connect it to a domain name so it can be publicly accessed (without needing a connection to the vm). How can I achieve this?

    I want to do the same thing you did with "http://api.vicgalle.net:5000/generate" and "http://api.vicgalle.net:5000/docs".

    Thanks,

    opened by Aryagm 1
  • API VM?

    API VM?

    Hi I wanted to host my own version of the api, where is the public one hosted? is it on a google cloud TPU VM? The ones ive seen here https://cloud.google.com/tpu/pricing are very expensive :D Is a TPU VM needed and the model won't be able to run on a normal GPU VM?

    Thanks!

    opened by jryebread 1
  • Raw text...

    Raw text...

    This is probably a very stupid question but whenever I run GPT-J I always get the full output:

    {'model': 'GPT-J-6B', 'compute_time': 1.2492187023162842, 'text': ' \n(and you\'ll be a slave)\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n', 'prompt': 'AI will take over the world ', 'token_max_length': 50, 'temperature': 0.09, 'top_p': 0.9, 'stop_sequence': None}

    What parameter do I need to change so it only outputs the generated text?

    (and you'll be a slave) I'm not a robot, I'm a human being. I'm not a robot, I'm a human being.

    opened by Vilagamer999 1
  • Latency with TPU VM

    Latency with TPU VM

    Got things running on Google Clouds, really happy :). Was hoping for a little but of a speed increase, but computation time is the same and latency on the request seems to be the main delay. Did you experiment with firewalls and ports to improve things?

    opened by Ontopic 1
  • Version support for Huggingface GPT-J 6B

    Version support for Huggingface GPT-J 6B

    GPT-J Huggingface and streamlit style like by project-code py

    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

    opened by ghost 0
Releases(v0.3)
Owner
Víctor Gallego
Data scientist & predoc researcher
Víctor Gallego
Korean Sentence Embedding Repository

Korean-Sentence-Embedding 🍭 Korean sentence embedding repository. You can download the pre-trained models and inference right away, also it provides

80 Jan 02, 2023
PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"

Non-Autoregressive Transformer Code release for Non-Autoregressive Neural Machine Translation by Jiatao Gu, James Bradbury, Caiming Xiong, Victor O.K.

Salesforce 261 Nov 12, 2022
ByT5: Towards a token-free future with pre-trained byte-to-byte models

ByT5: Towards a token-free future with pre-trained byte-to-byte models ByT5 is a tokenizer-free extension of the mT5 model. Instead of using a subword

Google Research 409 Jan 06, 2023
Cải thiện Elasticsearch trong bài toán semantic search sử dụng phương pháp Sentence Embeddings

Cải thiện Elasticsearch trong bài toán semantic search sử dụng phương pháp Sentence Embeddings Trong bài viết này mình sẽ sử dụng pretrain model SimCS

Vo Van Phuc 18 Nov 25, 2022
Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022)

SyntaxGen Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022) In this repo, we upload all the scripts for this work. Due to siz

Zhuosheng Zhang 3 Jun 13, 2022
A method for cleaning and classifying text using transformers.

NLP Translation and Classification The repository contains a method for classifying and cleaning text using NLP transformers. Overview The input data

Ray Chamidullin 0 Nov 15, 2022
A collection of Classical Chinese natural language processing models, including Classical Chinese related models and resources on the Internet.

GuwenModels: 古文自然语言处理模型合集, 收录互联网上的古文相关模型及资源. A collection of Classical Chinese natural language processing models, including Classical Chinese related models and resources on the Internet.

Ethan 66 Dec 26, 2022
☀️ Measuring the accuracy of BBC weather forecasts in Honolulu, USA

Accuracy of BBC Weather forecasts for Honolulu This repository records the forecasts made by BBC Weather for the city of Honolulu, USA. Essentially, t

Max Halford 12 Oct 15, 2022
PyTranslator é simultaneamente um editor e tradutor de texto com diversos recursos e interface feito com coração e 100% em Python

PyTranslator O Que é e para que serve o PyTranslator? PyTranslator é simultaneamente um editor e tradutor de texto em com interface gráfica que usa a

Elizeu Barbosa Abreu 1 May 12, 2022
AEC_DeepModel - Deep learning based acoustic echo cancellation baseline code

AEC_DeepModel - Deep learning based acoustic echo cancellation baseline code

凌逆战 75 Dec 05, 2022
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt

Life4 3k Jan 06, 2023
Host your own GPT-3 Discord bot

GPT3 Discord Bot Host your own GPT-3 Discord bot i'd host and make the bot invitable myself, however GPT3 terms of service prohibit public use of GPT3

[something hillarious here] 8 Jan 07, 2023
Open Source Neural Machine Translation in PyTorch

OpenNMT-py: Open-Source Neural Machine Translation OpenNMT-py is the PyTorch version of the OpenNMT project, an open-source (MIT) neural machine trans

OpenNMT 5.8k Jan 04, 2023
glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end.

Glow-Speak glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end. Installation git clone https://g

Rhasspy 8 Dec 25, 2022
XLNet: Generalized Autoregressive Pretraining for Language Understanding

Introduction XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective.

Zihang Dai 6k Jan 07, 2023
Telegram bot to auto post messages of one channel in another channel as soon as it is posted, without the forwarded tag.

Channel Auto-Post Bot This bot can send all new messages from one channel, directly to another channel (or group, just in case), without the forwarded

Aditya 128 Dec 29, 2022
CCQA A New Web-Scale Question Answering Dataset for Model Pre-Training

CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training This is the official repository for the code and models of the paper CCQA: A N

Meta Research 29 Nov 30, 2022
Almost State-of-the-art Text Generation library

Ps: we are adding transformer model soon Text Gen 🐐 Almost State-of-the-art Text Generation library Text gen is a python library that allow you build

Emeka boris ama 63 Jun 24, 2022
A demo for end-to-end English and Chinese text spotting using ABCNet.

ABCNet_Chinese A demo for end-to-end English and Chinese text spotting using ABCNet. This is an old model that was trained a long ago, which serves as

Yuliang Liu 45 Oct 04, 2022
TPlinker for NER 中文/英文命名实体识别

本项目是参考 TPLinker 中HandshakingTagging思想,将TPLinker由原来的关系抽取(RE)模型修改为命名实体识别(NER)模型。

GodK 113 Dec 28, 2022