API for the GPT-J language model 🦜, including a FastAPI backend and a Streamlit frontend.

Overview

gpt-j-api 🦜

An API to interact with the GPT-J language model. You can use and test the model in two different ways:

Using the API

  • Python:
import requests
context = "In a shocking finding, scientists discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English."
payload = {
    "context": context,
    "token_max_length": 512,
    "temperature": 1.0,
    "top_p": 0.9,
}
response = requests.post("http://api.vicgalle.net:5000/generate", params=payload).json()
print(response)
  • Bash:
curl -X 'POST' \
  'http://api.vicgalle.net:5000/generate?context=In%20a%20shocking%20finding%2C%20scientists%20discovered%20a%20herd%20of%20unicorns%20living%20in%20a%20remote%2C%20previously%20unexplored%20valley%2C%20in%20the%20Andes%20Mountains.%20Even%20more%20surprising%20to%20the%20researchers%20was%20the%20fact%20that%20the%20unicorns%20spoke%20perfect%20English.&token_max_length=512&temperature=1&top_p=0.9' \
  -H 'accept: application/json' \
  -d ''
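
To get only the generated text rather than the full JSON, read the text field of the response (the field name comes from the example response quoted in the comments further down). The following is a minimal sketch using only the Python standard library; the helper names build_url and generate_text are hypothetical and not part of this repository.

```python
import json
import urllib.parse
import urllib.request

# Public endpoint taken from the examples above.
API_URL = "http://api.vicgalle.net:5000/generate"


def build_url(context, token_max_length=512, temperature=1.0, top_p=0.9, base_url=API_URL):
    """Encode the generation parameters as a query string, as the curl example does."""
    query = urllib.parse.urlencode(
        {
            "context": context,
            "token_max_length": token_max_length,
            "temperature": temperature,
            "top_p": top_p,
        },
        quote_via=urllib.parse.quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base_url}?{query}"


def generate_text(context, timeout=60, **params):
    """POST to the API and return only the generated text (hypothetical helper)."""
    request = urllib.request.Request(build_url(context, **params), data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # The response also carries metadata such as 'model' and 'compute_time'.
    return body["text"]
```

Calling generate_text("AI will take over the world ", token_max_length=50) would then return just the completion string, provided the public endpoint is reachable.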

Deployment of the API server

SSH into a TPU VM. This code has only been tested on the v3-8 variant.

First, install the requirements and get the weights:

python3 -m pip install -r requirements.txt
wget https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd
sudo apt install zstd
tar -I zstd -xf step_383500_slim.tar.zstd

And just run

python3 serve.py

Then, you can go to http://localhost:5000/docs and use the API!
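
Once the server is up, it can help to see how much of a request's wall-clock time is model compute and how much is everything else (network, queuing). The sketch below assumes the local server exposes the same /generate route and compute_time response field as the public API; split_latency and timed_generate are hypothetical helper names.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: the local server mirrors the public API's /generate route.
LOCAL_URL = "http://localhost:5000/generate"


def split_latency(elapsed, compute_time):
    """Split wall-clock seconds into model compute and remaining overhead."""
    overhead = max(elapsed - compute_time, 0.0)
    return {"compute": compute_time, "overhead": overhead}


def timed_generate(context, url=LOCAL_URL, token_max_length=32, timeout=120):
    """Send one request and report where the time went (hypothetical helper)."""
    query = urllib.parse.urlencode(
        {
            "context": context,
            "token_max_length": token_max_length,
            "temperature": 1.0,
            "top_p": 0.9,
        }
    )
    request = urllib.request.Request(f"{url}?{query}", data=b"", method="POST")
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    elapsed = time.perf_counter() - start
    return body["text"], split_latency(elapsed, body["compute_time"])
```

A large overhead relative to compute suggests the delay lies outside the model, for example in the network path or in requests waiting for earlier ones to finish.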

Deploy the streamlit dashboard

Just run

python3 -m streamlit run streamlit_app.py --server.port 8000

Acknowledgements

Thanks to the support of the TPU Research Cloud, https://sites.research.google/trc/

Comments
  • I've made an extension using this API

    You can check it out here: https://chrome.google.com/webstore/detail/type-j/femdhcgkiiagklmickakfoogeehbjnbh

    At first I was very hyped up and it felt fun, like I was talking to a machine, but then I lost my enthusiasm and now I feel like it's totally useless xD

    I'm just leaving a link here to show my appreciation; it became real thanks to you posting this API.

    Feel free to delete the issue, as it's out of scope.

    If you have ideas on how to make it commercially successful, I'll be happy to partner up.

    peace

    opened by oogxdd 5
  • Illegal Instruction

    When installing as described in the README (fresh conda env, python=3.8, Ubuntu), I get an illegal instruction error immediately after running python serve.py

    (gpt-j-api) […]@[…]:/opt/GPT/gpt-j-api$ python -q -X faulthandler serve.py
    Fatal Python error: Illegal instruction
    
    Current thread 0x00007f358d7861c0 (most recent call first):
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 1166 in create_module
      File "<frozen importlib._bootstrap>", line 556 in module_from_spec
      File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jaxlib/xla_client.py", line 31 in <module>
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/lib/__init__.py", line 58 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/config.py", line 26 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/__init__.py", line 33 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "serve.py", line 3 in <module>
    Illegal instruction (core dumped)
    

    EDIT

    I'm running this on CPU only. I tried installing jax[CPU], with the same result.

    opened by chris-aeviator 4
  • api seems offline

    When I try to access the API I get the following error: ERR_CONNECTION_TIMED_OUT. But when I try to connect to it using a different IP address it does work. Am I IP banned?

    opened by KoenTech 4
  • Usage

    I'm using this to host a Discord chatbot, and though I have slowmode on the channel there's still a lot of usage, and often the API is being used as fast as it can generate completions. Will this harm the experience for others? Should I limit it more? (thanks for making this free but I don't want to take advantage of that too much if it's bad for others)

    opened by Heath123 3
  • Alternative to Google TPU VM?

    Hello,

    I would like to run a local instance of GPT-J, but avoid using Google.

    I have little to no experience in machine learning and its requirements. Are there other solutions I could use? (What are the requirements for a machine in order to run GPT-J?)

    Thank you very much!

    opened by birkenbaum 3
  • Is there a way to speed up inference?

    Hello, I am currently working on a project where I need quick inference. It needn't be real-time, but something around 7-10 sec would be great. Is there a way to speed up the inference using the API?

    The model does not seem to be a problem as compute_time is around 8sec, but by the time the request arrives it takes around 20 seconds (over 30 on some occasions). Is there a way to make the request a bit faster?

    Thanks,

    opened by Aryagm 2
  • Errno 111

    Could anyone please fix the following error? Thanks a lot.

    "ConnectionError: ...Failed to establish a new connection: [Errno 111] Connection refused"

    opened by Mather10 1
  • How to make the api public?

    Hey, I was able to get serve.py running with the instructions you gave. But now I want to make the api public and connect it to a domain name so it can be publicly accessed (without needing a connection to the vm). How can I achieve this?

    I want to do the same thing you did with "http://api.vicgalle.net:5000/generate" and "http://api.vicgalle.net:5000/docs".

    Thanks,

    opened by Aryagm 1
  • API VM?

    Hi, I wanted to host my own version of the API. Where is the public one hosted? Is it on a Google Cloud TPU VM? The ones I've seen here https://cloud.google.com/tpu/pricing are very expensive :D Is a TPU VM needed, or can the model run on a normal GPU VM?

    Thanks!

    opened by jryebread 1
  • Raw text...

    This is probably a very stupid question but whenever I run GPT-J I always get the full output:

    {'model': 'GPT-J-6B', 'compute_time': 1.2492187023162842, 'text': ' \n(and you\'ll be a slave)\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n', 'prompt': 'AI will take over the world ', 'token_max_length': 50, 'temperature': 0.09, 'top_p': 0.9, 'stop_sequence': None}

    What parameter do I need to change so it only outputs the generated text?

    (and you'll be a slave) I'm not a robot, I'm a human being. I'm not a robot, I'm a human being.

    opened by Vilagamer999 1
  • Latency with TPU VM

    Got things running on Google Cloud, really happy :). Was hoping for a little bit of a speed increase, but computation time is the same and latency on the request seems to be the main delay. Did you experiment with firewalls and ports to improve things?

    opened by Ontopic 1
  • Version support for Huggingface GPT-J 6B

    GPT-J Huggingface and streamlit style like by project-code py

    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

    opened by ghost 0
Releases(v0.3)
Owner
Víctor Gallego
Data scientist & predoc researcher