A computer vision pipeline to identify the "icons" in Christian paintings

Overview

Christian-Iconography

Open In Colab Screenshot from 2022-01-08 18-26-30

A computer vision pipeline to identify the "icons" in Christian paintings.

A bit about iconography.

Iconography is related to identifying the subject itself in the image. So, for instance when I say Christian Iconography I would mean that I am trying to identify some objects like crucifix or mainly in this project the saints!

Inspiration

I was looking for some interesting problem to solve and I came across RedHenLab's barnyard of projects and it had some really wonderful ideas there and this particular one intrigued me. On the site they didn't have much progress on it as the datasets were not developed on this subject but after surfing around I found something and just like that I got started!

Dataset used.

The project uses the ArtDL dataset which contains 42,479 images of artworks portraying Christian saints, divided in 10 classes: Saint Dominic (iconclass 11HH(DOMINIC)), Saint Francis of Assisi (iconclass 11H(FRANCIS)), Saint Jerome (iconclass 11H(JEROME)), Saint John the Baptist (iconclass 11H(JOHN THE BAPTIST)), Saint Anthony of Padua (iconclass 11H(ANTONY OF PADUA), Saint Mary Magdalene (iconclass 11HH(MARY MAGDALENE)), Saint Paul (iconclass 11H(PAUL)), Saint Peter (iconclass 11H(PETER)), Saint Sebastian (iconclass 11H(SEBASTIAN)) and Virgin Mary (iconclass 11F). All images are associated with high-level annotations specifying which iconography classes appear in them (from a minimum of 1 class to a maximum of 7 classes).

Sources

Screenshot from 2022-01-08 18-08-56

Preprocessing steps.

All the images were first padded so that the resolution is sort of intact when the image is resized. A dash of normalization and some horizontal flips and the dataset is ready to be eaten/trained on by our model xD.

Architecture used.

As mentioned the ArtDL dataset has around 43k images and hence training it completely wouldn't make sense. Hence a ResNet50 pretrained model was used.

But there is a twist.

Instead of just having the final classifying layer trained we only freeze the initial layer as it has gotten better at recognizing patterns from a lot of images it might have trained on. And then we fine-tune the deeper layers so that it learns the art after the initial abstraction. Another deviation is to replace the final linear layer by 1x1 conv layer to make the classification.

Quantiative Results.

Training

I trained the network for 10 epochs which took around 3 hours and used Stochastic Gradient Descent with LR=0.01 and momentum 0.9. The accuracy I got was 64% on the test set which can be further improved.

Classification Report

Screenshot from 2022-01-10 22-07-52

From the classification report it is clear that Saint MARY has the most number of samples in the training set and the precision for that is high. On the other hand other samples are low in number and hence their scores are low and hence we can't infer much except the fact that we need to oversample some of these classes so that we can gain more meaningful resuls w.r.t accuracy and of course these metrics as well

Qualitative Results

We try an image of Saint Dominic and see what our classifier is really learning.

Screenshot from 2022-01-10 22-10-37

Saliency Map

Screenshot from 2022-01-10 22-12-31

We can notice that regions around are more lighter than elsewhere which could mean that our classifier at least knows where to look :p

Guided-Backpropagation

Screenshot from 2022-01-10 22-14-26

So what really guided backprop does is that it points out the positve influences while classifiying an image. From this result we can see that it is really ignoring the padding applied and focussing more on the body and interesting enough the surroundings as well

Grad-CAM!

Screenshot from 2022-01-10 22-15-27

As expected the Grad-CAM when used shows the hot regions in our images and it is around the face and interesting enough the surrounding so maybe it could be that surroundings do have a role-play in type of saint?

Possible improvements.

  • Finding more datasets
  • Or working on the architecture maybe?
  • Using GANs to generate samples and make classifier stronger

Citations

@misc{milani2020data,
title={A Data Set and a Convolutional Model for Iconography Classification in Paintings},
author={Federico Milani and Piero Fraternali},
eprint={2010.11697},
archivePrefix={arXiv},
primaryClass={cs.CV},
year={2020}
}

RedhenLab's barnyard of projects

Owner
Rishab Mudliar
AKA Start At The Beginning.
Rishab Mudliar
An excellent hash algorithm combining classical sponge structure and RNN.

SHA-RNN Recurrent Neural Network with Chaotic System for Hash Functions Anonymous Authors [摘要] 在这次作业中我们提出了一种新的 Hash Function —— SHA-RNN。其以海绵结构为基础,融合了混

Houde Qian 5 May 15, 2022
A torch implementation of "Pixel-Level Domain Transfer"

Pixel Level Domain Transfer A torch implementation of "Pixel-Level Domain Transfer". based on dcgan.torch. Dataset The dataset used is "LookBook", fro

Fei Xia 260 Sep 02, 2022
Jetson Nano-based smart camera system that measures crowd face mask usage in real-time.

MaskCam MaskCam is a prototype reference design for a Jetson Nano-based smart camera system that measures crowd face mask usage in real-time, with all

BDTI 212 Dec 29, 2022
Multivariate Boosted TRee

Multivariate Boosted TRee What is MBTR MBTR is a python package for multivariate boosted tree regressors trained in parameter space. The package can h

SUPSI-DACD-ISAAC 61 Dec 19, 2022
This is the official implementation for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents" in NeurIPS 2021.

Observe then Incentivize Experiments This is the code used for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents",

Cong Shen Research Group 0 Mar 08, 2022
PED: DETR for Crowd Pedestrian Detection

PED: DETR for Crowd Pedestrian Detection Code for PED: DETR For (Crowd) Pedestrian Detection Paper PED: DETR for Crowd Pedestrian Detection Installati

36 Sep 13, 2022
Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Volumetric TSDF Fusion of RGB-D Images in Python This is a lightweight python script that fuses multiple registered color and depth images into a proj

Andy Zeng 845 Jan 03, 2023
This is a re-implementation of TransGAN: Two Pure Transformers Can Make One Strong GAN (CVPR 2021) in PyTorch.

TransGAN: Two Transformers Can Make One Strong GAN [YouTube Video] Paper Authors: Yifan Jiang, Shiyu Chang, Zhangyang Wang CVPR 2021 This is re-implem

Ahmet Sarigun 79 Jan 05, 2023
A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Orchard Dataset This repository contains the code used for generating the Orchard Dataset, as seen in the Multi-Hierarchical Reasoning in Sequences: S

Bill Pung 1 Jun 05, 2022
Neural Contours: Learning to Draw Lines from 3D Shapes (CVPR2020)

Neural Contours: Learning to Draw Lines from 3D Shapes This repository contains the PyTorch implementation for CVPR 2020 Paper "Neural Contours: Learn

93 Dec 16, 2022
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP Andreas Fürst* 1, Elisabeth Rumetshofer* 1, Viet Tran1, Hubert Ramsauer1, Fei Tang3, Joh

Institute for Machine Learning, Johannes Kepler University Linz 133 Jan 04, 2023
The official implementation of paper "Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks" (IJCV under review).

DGMS This is the code of the paper "Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks". Installation Our code works with Pytho

Runpei Dong 3 Aug 28, 2022
When in Doubt: Improving Classification Performance with Alternating Normalization

When in Doubt: Improving Classification Performance with Alternating Normalization Findings of EMNLP 2021 Menglin Jia, Austin Reiter, Ser-Nam Lim, Yoa

Menglin Jia 13 Nov 06, 2022
A code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Vanderhaeghe, and Yotam Gingold from SIGGRAPH Asia 2020.

A Benchmark for Rough Sketch Cleanup This is the code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Va

33 Dec 18, 2022
Official implementation of "Articulation Aware Canonical Surface Mapping"

Articulation-Aware Canonical Surface Mapping Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani Paper Project Page Requirements Python

Nilesh Kulkarni 56 Dec 16, 2022
The Codebase for Causal Distillation for Language Models.

Causal Distillation for Language Models Zhengxuan Wu*,Atticus Geiger*, Josh Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah D.

Zen 20 Dec 31, 2022
Official Implementation of "Learning Disentangled Behavior Embeddings"

DBE: Disentangled-Behavior-Embedding Official implementation of Learning Disentangled Behavior Embeddings (NeurIPS 2021). Environment requirement The

Mishne Lab 12 Sep 28, 2022
Experiments and examples converting Transformers to ONNX

Experiments and examples converting Transformers to ONNX This repository containes experiments and examples on converting different Transformers to ON

Philipp Schmid 4 Dec 24, 2022
A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network The official code of VisionLAN (ICCV2021). VisionLAN successfully a

81 Dec 12, 2022
Model of an AI powered sign language interpreter.

TEXT AND SPEECH TO SIGN LANGUAGE. A web application which takes in text or live audio speech recording as input, converts and displays the relevant Si

Mark Gatere 4 Mar 30, 2022