This repository contains the code to replicate the analysis from the paper "Moving On - Investigating Inventors' Ethnic Origins Using Supervised Learning"

Last update: Jan 04, 2022

Overview

Replication Code for 'Moving On' - Investigating Inventors' Ethnic Origins Using Supervised Learning

This repository contains the code to replicate the paper "Moving On - Investigating Inventors' Ethnic Origins Using Supervised Learning".

Repository Structure

Datasets created in this analysis can be found in the folder 00_data_and_model. The trained and tuned LSTM classification model used for the analysis in the paper is stored in this folder as well, under 00_data_and_model/model/name_origin_lstm.h5. The folder 01_create_training_dataset contains the replication files used to construct the dataset of labeled names on which the LSTM classification model is trained. The folder 02_model_training contains the code to train the LSTM classifier. Lastly, the code for the descriptive analysis (using a random subsample of the paper's dataset) can be found in the folder 03_inventor_composition_analysis.
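As a minimal sketch, the stored model could be loaded from Python roughly as follows. The helper names model_path and load_name_origin_model are hypothetical and not part of this repository. The sketch assumes the .h5 file loads with the standard Keras loader and needs no custom objects, and it says nothing about how names must be encoded before prediction.

```python
from pathlib import Path

# Location of the trained model relative to the repository root (from the README).
MODEL_RELATIVE_PATH = Path("00_data_and_model") / "model" / "name_origin_lstm.h5"


def model_path(repo_root="."):
    """Return the expected path of the trained LSTM model (hypothetical helper)."""
    return Path(repo_root) / MODEL_RELATIVE_PATH


def load_name_origin_model(repo_root="."):
    """Load the stored classifier with Keras (hypothetical helper).

    Assumption: the file is a regular Keras HDF5 model without custom layers.
    """
    path = model_path(repo_root)
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path}")
    # Import lazily so the path check works even without TensorFlow installed.
    from tensorflow.keras.models import load_model
    return load_model(str(path))
```

Calling load_name_origin_model() from the repository root should return a Keras model object whose summary() method prints the layer structure.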

Dependencies

Python (3.7)

joblib==1.0.1
matplotlib==3.3.1
numpy==1.19.2
pandas==1.1.3
pyreadr==0.3.5
scikit-learn==0.23.2
scipy==1.4.1
tensorflow==2.2.0
xgboost==0.90

Creating a virtual environment from the environment.yml or requirements.txt file is recommended.
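Once an environment is set up, the pinned Python versions listed above can be compared with what is actually installed. The following is only a sketch: the function check_pins is a hypothetical helper, not part of this repository, and on Python 3.7 it assumes the importlib_metadata backport is available.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is installed.
    from importlib_metadata import PackageNotFoundError, version

# Pinned versions copied from the dependency list in this README.
PINNED = {
    "joblib": "1.0.1",
    "matplotlib": "3.3.1",
    "numpy": "1.19.2",
    "pandas": "1.1.3",
    "pyreadr": "0.3.5",
    "scikit-learn": "0.23.2",
    "scipy": "1.4.1",
    "tensorflow": "2.2.0",
    "xgboost": "0.90",
}


def check_pins(pins=PINNED, get_version=version):
    """Map each package to (expected, installed, matches).

    `installed` is None when the package is not found in the environment.
    """
    report = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (expected, installed, installed == expected)
    return report
```

Iterating over check_pins().items() and printing the entries whose last element is False gives a quick list of packages that deviate from the pinned versions.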

R (4.0.1)

tidyverse
data.table
reticulate
tensorflow
keras
stringi
jsonlite
countrycode
viridis

References & Contact

Niggli, M. (2022), 'Moving On' -- Investigating Inventors' Ethnic Origins Using Supervised Learning, arXiv:2201.00578

If you have questions, please contact [email protected].

Owner

Matthias Niggli