FairyTailor: Multimodal Generative Framework for Storytelling

Last update: Dec 30, 2022

Overview

Human-in-the-loop visual story co-creation.

Users create a cohesive children's story by weaving generated text and retrieved images together with their own input. In co-creation, writers contribute their creative thinking, while generative models support a continuous workflow. FairyTailor adds a second modality, images, and modifies the text generation process to help produce a coherent and creative story.

Set-up (development)

After cloning the repository:

Client (Vue 2.6)

Install and check that the client compiles:

cd client
npm i
npm run build

Backend (FastAPI)

Create and activate the provided conda environment:

conda env create -f environment.yml
conda activate MultiModalStory

Install the package in editable mode from the repository root, then install CLIP:

pip install -e .
pip install git+https://github.com/openai/CLIP.git

After installation, download the spaCy model:

python -m spacy download en_core_web_sm

In a Python terminal:

import nltk
nltk.download('wordnet')
nltk.download('sentiwordnet')
nltk.download('averaged_perceptron_tagger')
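
A quick way to confirm that the backend dependencies installed above are importable is sketched below. It uses only the standard library. The helper name `missing_modules` is hypothetical and not part of the repository, and the module names are assumptions drawn from the install commands (`clip` for the OpenAI CLIP package, `en_core_web_sm` for the spaCy model).

```python
from importlib.util import find_spec

# Module names assumed from the install steps above.
REQUIRED_MODULES = ["spacy", "en_core_web_sm", "nltk", "clip"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All backend modules are importable.")
```

Run it inside the activated conda environment; anything it lists probably needs one of the install steps repeated.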

Large Data Management (dvc)

Large data files are stored on IBM Cloud Object Storage. Pulling them requires a special, read-only .dvc/config file. With that file in place, run:

dvc pull -f

This pulls:

backend/outputs (five preset stories)
backend/story_generator/downloaded (transformers)
client/public/unsplash25k (styled images)
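
To check that the pull populated the three directories listed above, a minimal sketch such as the following may help. The function name `missing_data_dirs` is hypothetical, and treating an empty directory as "missing" is an assumption about what a failed pull looks like.

```python
from pathlib import Path

# Paths listed in this README as pulled by `dvc pull -f`.
EXPECTED_DATA_DIRS = [
    "backend/outputs",
    "backend/story_generator/downloaded",
    "client/public/unsplash25k",
]


def missing_data_dirs(repo_root):
    """Return expected data directories that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DATA_DIRS:
        path = root / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    for rel in missing_data_dirs("."):
        print("Missing or empty:", rel)
```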

Running the framework during development

Client:

cd client
npm run devw

Backend (with server auto reload):

uvicorn backend.server:app --reload --reload-dir backend

Open localhost:8000, where uvicorn serves the app, in your web browser.

Modification ideas:

New Hugging Face transformer

Place the transformer in backend/story_generator/downloaded directory.
Update the current model path by changing the constant FINETUNED_GPT2_PATH in backend/story_generator/constants.py.
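
As a sanity check before restarting the backend, the sketch below tests whether a directory looks like a model saved with the Hugging Face `save_pretrained()` method, which writes a `config.json` file. The path value and the helper `looks_like_saved_model` are hypothetical; only the constant name `FINETUNED_GPT2_PATH` comes from the repository.

```python
from pathlib import Path

# Hypothetical value for illustration; the real constant lives in
# backend/story_generator/constants.py.
FINETUNED_GPT2_PATH = "backend/story_generator/downloaded/my_finetuned_gpt2"


def looks_like_saved_model(model_dir):
    """Return True if the directory contains a config.json file."""
    return (Path(model_dir) / "config.json").is_file()


if __name__ == "__main__":
    if looks_like_saved_model(FINETUNED_GPT2_PATH):
        print("Found a model config at", FINETUNED_GPT2_PATH)
    else:
        print("No config.json under", FINETUNED_GPT2_PATH)
```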

New images folder

Replace the folder client/public/unsplash25k/sketch_images1024 with yours.
Update the current path by changing the constant IMAGE_PATH in client/src/components/Constants.js.

API functionalities

Add functions to the backend endpoint at backend/server/main.py.
Update client/src/js/api/mainApi.js to call the backend endpoint from the client.
Update the corresponding user components in client/src/components.
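
Before wiring a new endpoint into the client, it can be convenient to exercise it from Python. The sketch below builds and sends a JSON POST request with the standard library only. The route `/api/my_new_endpoint`, the payload shape, and both helper names are hypothetical; the repository's actual routes and request format may differ.

```python
import json
from urllib.request import Request, urlopen

# Address of the uvicorn server started in the steps above.
BASE_URL = "http://localhost:8000"


def build_json_post(route, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for a backend route."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        base_url + route,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_backend(route, payload, timeout=30):
    """Send the request and decode a JSON response (needs a running backend)."""
    with urlopen(build_json_post(route, payload), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call, with a hypothetical route and payload:
# call_backend("/api/my_new_endpoint", {"text": "Once upon a time"})
```

Keeping request construction separate from sending makes it possible to inspect the URL and body without a server running.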

Owner

Eden Bens
