Pixel art search engine for opengameart

Overview

Pixel Art Reverse Image Search for OpenGameArt

What does the final search look like?

The final search with an example can be found here.

It looks like this: OpenGameArt Search

Why did I want a reverse image search for OpenGameArt?

I wanted to build a reverse image search for OpenGameArt as Google Image Search and TinEye don't give good results for it. I had previously generated a huge tile map to give an overview of similar images on OpenGameArt, but it wasn't very resource friendly on the web or image browser and had to be split into smaller files, plus it's not searchable in any way, just scrollable. So I wanted a way for people to explore what kind of art is available on OpenGameArt, and landed on using similarity search to browse the image space.

How did I do the crawling?

The first thing I had to do was retrieve the search results for the query I was interested in on OpenGameArt, mostly the 2D art. Then I had to retrieve each HTML page which was in the search results index and parse the HTML for links to files. OpenGameArt contains a lot of archive files like zip and rar files, so I then had to unpack them to get to the images.

For example here is a snippet showing how to parse the content page and get file links:

responseBody = await Common.ReadURIOrCache(blob, Common.BaseURI + page, client);

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(responseBody);
var htmlBody = htmlDoc.DocumentNode.SelectSingleNode("//body");

foreach (var nNode in htmlBody.Descendants("a"))
{
    if (nNode.NodeType == HtmlNodeType.Element &&
        nNode.Attributes["href"] != null &&
        nNode.Attributes["href"].Value.Contains("/default/files/"))
    {
        msg.Add(HttpUtility.HtmlDecode(nNode.Attributes["href"].Value.Replace(Common.FileURI, "")));
    }
}

Which technology did I use for the crawling and how much did it cost?

I used Azure Functions to do the crawling steps, with some back and forth manual intervention to correct things as needed. Each step had its own queue and then put the job for the next step on the next queue. In the end the invocations cost around 50 USD on Azure, for let's say 10-20 million Function invocations if I remember correctly.

Which alternatives did I investigate?

I tried to use the open source Milvus database, but it crashed on my DigitalOcean server because I didn't have enough memory on it. Then I accidentally and luckily discovered the link to Pinecone in a Hacker News comment section and decided to use that instead, as the trial was free and I didn't have to expand my server memory to use Milvus. In the end I expanded my server anyway, but I didn't try Milvus again (at least not yet).

What data do you need on each image to create a reverse image search?

I used VGG16 feature extraction in my script for this. See the article for more information, but in essence it's 4096 32-bit floating point numbers for each image, which describe various features of the image, say for instance in a very simplified way how many stripes or squares it has or how green it is. But these features are based on neurons in the neural network for VGG16 (which is usually used for image classification), so the features could be more complicated than what is described by simple feature tags. And the reason we need these vectors is that it's easy to use Euclidean distance or cosine similarity or another measure on two vectors to see if they are similar, and then consequently the images are similar. Furthermore there is search technology on these vectors that enable quick search on a large amount of them.

Here's a simplified python script to show how to do the feature extraction:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# vim: ft=python ts=4 sw=4 sts=4 et fenc=utf-8

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import decode_predictions, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.compiler import xla
import numpy as np
import time
import os
import sys
import PIL
import json
import math
import multiprocessing
from glob import glob
from PIL import Image
from io import BytesIO

model = VGG16(weights='imagenet', include_top=True)
feat_extractor = Model(inputs=model.input, outputs=model.get_layer("fc2").output)

def prepImage(img):
    x = np.array(img.resize((224, 224)).convert('RGB'))
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    return x

def main():
    'entry point'
    fname = 'demo.jpg'
    dt = Image.open(fname)
    pimg = prepImage(dt)

    print("Computing feature vector", fname)
    features = feat_extractor.predict(pimg)
    print(features)

if __name__ == '__main__':
    main()

Here's the output of the script:

[email protected] ~/public_html/ogasearch 0% ./test.py                                                                                                                                                                                                                                                                                                                         (git)-[gh-pages]
2021-04-07 18:48:03.158023: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-07 18:48:03.158082: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-04-07 18:48:07.783109: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-07 18:48:07.783485: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-04-07 18:48:07.783530: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-04-07 18:48:07.783580: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (frostpunk): /proc/driver/nvidia/version does not exist
2021-04-07 18:48:07.784058: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-07 18:48:07.784513: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-07 18:48:08.599925: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 411041792 exceeds 10% of free system memory.
2021-04-07 18:48:09.194634: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 411041792 exceeds 10% of free system memory.
2021-04-07 18:48:09.385612: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 411041792 exceeds 10% of free system memory.
2021-04-07 18:48:13.033066: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 411041792 exceeds 10% of free system memory.
Computing feature vector demo.jpg
2021-04-07 18:48:13.706621: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-04-07 18:48:13.717564: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2199995000 Hz
[[0.        3.1128967 1.5611947 ... 1.2625191 0.7709812 0.       ]]
./test.py  12.20s user 4.66s system 132% cpu 12.731 total

How did I maintain a link between the image URLs and the vector datebase feature vectors?

I also wanted to put all the image URLs into a SQL database in the end, and have a flag for whether I had made the VGG16 feature extraction and whether it was added to the vector database (Milvus or Pinecone. It's essential to be able to map back and forth between an integer primary key, which is used in Pineone, and the URL and perhaps other metadata that belongs to the image, as [Pinecone](https://www.pinecone.io/ doesn't store other metadata than the primary key. In the end I dumbed the SQL database to a tab separated text file and loaded it on query server startup.

How long time did it take?

I think I spent a week in total to run all the code to finish, each step taking on the order of a day or two, crawl, computing feature vectors. I don't remember how much time it took to insert the vectors into the Pinecone database, but I think it was not the most time-consuming step.

Two ways of searching: Words and images

  • There are two ways of searching, either you can put a keyword, which just plainly (and a bit slowly at O(n)) iterates linearly through the URLs looking for a string match. I stuck with linear search since it's simple to implement and all the URLs are kept in memory anyway so it's not that slow. I dumped all the URLs to a text file and loading it to memory on query server load instead of querying the SQL server each time.
  • The other way of searching is that you put an image URL, which will run feature extraction on your image (on my server) and then query Pinecone for similar vectors, which will map to primary keys, which in turn I look up in the list of URLs.
  • I also maintain a "reverse database" text file in order to link back to the OpenGameArt site for images found (there are some bugs with this I haven't fixed yet, in which case it just links to OpenGameArt main page). This file is also loaded on query server startup. Finally there is also a link under each image to search for similar images, which implicitly uses the second kind of query by image.

What are some problems I encountered?

At the end I also added a quick fix to remove near-duplicate image results which had an identical score. I ran into some troubles on the search page with "double" URL encoding, because I had stored the files using URL encoding in the file system, but I worked around it with some detection code on the frontend for when the browser double-encoded the URL-encoded file names. I recommend storing the crawled files without URL encoding. I regret that my scripts are not so high quality or polished, for example there are multiple steps in scripts and I change things by editing the script instead of taking command line arguments. I don't feel like posting snippets from the scripts and explaining as they are a bit messy. Additionally I moved the files from Azure storage to my DigitalOcean server mid-way, before processing the feature extraction, so there's some inconsistent data location handling.

What are the final takeaways?

  • I recommend doing the crawl perhaps on a cheaper substrate than Azure Functions and Azure Storage to save some money, for example your own server or a fixed price Cloud server. Well it just cost 50 USD but I could have done it for free on my DigitalOcean server, so that's why.
  • I recommend building a more robust crawler, idempotent and restartable at any point which it may terminate at or require some manual intervention (for example I exceeded the Azure Function maximum run time of 5 minutes on extracting some of the large zip files, so I extracted them running the Functions locally in VS Code).
  • One thing I regret that I didn't get done this time was to extract all the tiles from tile sheets to individual images for searching. That would have made the search even more useful. On the other hand it could have cluttered the similarity search with too many nearly identical images.

Conclusion and final remarks

  • It might also be useful to prototype the system using a bit of content, then run the whole pipeline end to end on all the content once you get it working, instead of completing the first step in crawl, then doing all feature extraction and then doing all database insertion as I did.
  • In conclusion what I made was a bit of a hack, not so robust scripting for updating for new content, but it worked fine as a prototype and gave decent image search results (not always so spot on, but I blame it on that the feature extraction is not really targeted at tiny pixel art (though resized/upscaled before feature extraction)).
  • It could be interesting to see whether Milvus could also deliver similar results, a side by side comparison of some kind, on speed and quality, but I found it much easier to use Pinecone since it's already up and running as a service so that I didn't have to run my own vector database.

Script Locations

An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

OpenCV-ToothPaint3-Advanced-Digital-Image-Editor This application named ‘Tooth Paint’ version TP_2020.3 (64-bit) or version 3 was developed within a w

JunHong 1 Nov 05, 2021
With the virtual keyboard, you can write on the real time images by combining the thumb and index fingers on the letter you want.

Virtual Keyboard With the virtual keyboard, you can write on the real time images by combining the thumb and index fingers on the letter you want. At

Güldeniz Bektaş 5 Jan 23, 2022
Read Japanese manga inside browser with selectable text.

mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer

Maciej Budyś 170 Dec 27, 2022
Autonomous Driving project for Euro Truck Simulator 2

hope-autonomous-driving Autonomous Driving project for Euro Truck Simulator 2 Video: How is it working ? In this video, the program processes the imag

Umut Görkem Kocabaş 36 Nov 06, 2022
A novel region proposal network for more general object detection ( including scene text detection ).

DeRPN: Taking a further step toward more general object detection DeRPN is a novel region proposal network which concentrates on improving the adaptiv

Deep Learning and Vision Computing Lab, SCUT 151 Dec 12, 2022
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021)

QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021) Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, W

Taichi Developers 119 Dec 02, 2022
PianoVisuals - Create background videos synced with piano music using opencv

Steps Record piano video Use Neural Network to do body segmentation (video matti

Solbiati Alessandro 4 Jan 24, 2022
Hiiii this is the Spanish for Linux and win 10 and in the near future the english version of PortScan my new tool on which you can see what ports are Open only with the IP adress.

PortScanner-by-IIT PortScanner es una herramienta programada en Python3. Como su nombre indica esta herramienta escanea los primeros 150 puertos de re

5 Sep 19, 2022
ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data

VistaOCR ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data Publications "How to Efficiently Increase Resolutio

ISI Center for Vision, Image, Speech, and Text Analytics 21 Dec 08, 2021
deployment of a hybrid model for automatic weapon detection/ anomaly detection for surveillance applications

Automatic Weapon Detection Deployment of a hybrid model for automatic weapon detection/ anomaly detection for surveillance applications. Loved the pro

Janhavi 4 Mar 04, 2022
color detection using python

colordetection color detection using python In this color detection Python project, we are going to build an application through which you can automat

Ruchith Kumar 1 Nov 04, 2021
RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection

RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection For more details, please refer to our paper. Citing Please cite the related works

Minghui Liao 102 Jun 29, 2022
This project modify tensorflow object detection api code to predict oriented bounding boxes. It can be used for scene text detection.

This is an oriented object detector based on tensorflow object detection API. Most of the code is not changed except for those related to the need of

Dafang He 30 Oct 22, 2022
Zoom , GoogleMeets에서 Vtuber 데뷔하기

EasyVtuber Facial landmark와 GAN을 이용한 Character Face Generation Google Meets, Zoom 등에서 자신만의 웹툰, 만화 캐릭터로 대화해보세요! 악세사리는 어느정도 추가해도 잘 작동해요! 안타깝게도 RTX 2070

Gunwoo Han 140 Dec 23, 2022
Memory tests solver with using OpenCV

Human Benchmark project This project is OpenCV based programs which are puzzle solvers for 7 different games for https://humanbenchmark.com/. made as

Bahadır Araz 24 Dec 27, 2022
A Python wrapper for the tesseract-ocr API

tesserocr A simple, Pillow-friendly, wrapper around the tesseract-ocr API for Optical Character Recognition (OCR). tesserocr integrates directly with

Fayez 1.7k Dec 31, 2022
This is a real life mario project using python and mediapipe

real-life-mario This is a real life mario project using python and mediapipe How to run to run this just run - realMario.py file requirements This req

Programminghut 42 Dec 22, 2022
Python package for handwriting and sketching in Jupyter cells

ipysketch A Python package for handwriting and sketching in Jupyter notebooks. Usage A movie is worth a thousand pictures is worth a million words...

Matthias Baer 16 Jan 05, 2023
零样本学习测评基准,中文版

ZeroCLUE 零样本学习测评基准,中文版 零样本学习是AI识别方法之一。 简单来说就是识别从未见过的数据类别,即训练的分类器不仅仅能够识别出训练集中已有的数据类别, 还可以对于来自未见过的类别的数据进行区分。 这是一个很有用的功能,使得计算机能够具有知识迁移的能力,并无需任何训练数据, 很符合现

CLUE benchmark 27 Dec 10, 2022