ICLR 2022 Paper submission trend analysis

Last update: Dec 06, 2022

Related tags

Data Analysis ICLR2022-OpenReviewData

Overview

Visualize ICLR 2022 OpenReview Data

ICLR 2022 Paper submission analysis from https://openreview.net/group?id=ICLR.cc/2022/Conference

Requirements

pip install wordcloud nltk pandas imageio selenium tqdm

Download the required NLTK packages:

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
nltk.download('stopwords')

If calling webdriver.Edge('msedgedriver.exe') fails, try the following:

Delete msedgedriver.exe, since it may only work on my computer (Windows).
Install Microsoft Edge (Chromium): make sure Microsoft Edge (Chromium) is installed. To confirm, go to edge://settings/help in the browser and verify that the version is 75 or later.
Download Microsoft Edge Driver:
- Go to edge://settings/help to get the version of Edge.
- Navigate to the Microsoft Edge Driver downloads page and download the driver that matches the Edge version number.

From https://stackoverflow.com/questions/63529124/how-to-open-up-microsoft-edge-using-selenium-and-python
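
The steps above can be sketched in Python. This is a minimal sketch, not the code used by crawl_paperlist.py: the helper names find_edge_driver and open_edge are hypothetical, and it assumes a Selenium 4 release, where the driver path is passed through a Service object rather than as a positional argument.

```python
import shutil
from pathlib import Path


def find_edge_driver(local_name="msedgedriver.exe"):
    """Return a usable driver path, or None if no driver can be found.

    Hypothetical helper: prefers a driver file at `local_name`,
    then falls back to an `msedgedriver` executable on PATH.
    """
    local = Path(local_name)
    if local.is_file():
        return str(local.resolve())
    return shutil.which("msedgedriver")


def open_edge(driver_path):
    """Start Edge with an explicit driver path (assumes Selenium 4)."""
    # Imported here so the path lookup above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.edge.service import Service

    return webdriver.Edge(service=Service(driver_path))
```

A typical call would be open_edge(find_edge_driver()) after checking that the returned path is not None. The downloaded driver must still match the installed Edge version, as described above.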

Crawl Data

Run crawl_paperlist.py to crawl the list of papers (~0.5h).

Paper List (3,407 submissions in total)

crawl_paperlist.py crawls only 3,000 papers, although there are 3,407 submissions in total. The full paper list is as follows:

Visualization

Keywords Frequency

The 50 most common keywords (uncased) and their frequencies:
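
A keyword count of this kind can be sketched with the standard library. The function name keyword_frequencies and the input shape (one list of keyword strings per paper) are assumptions made for illustration, as is the choice to count each keyword at most once per paper; the repository's own script may differ.

```python
from collections import Counter


def keyword_frequencies(keyword_lists, top_n=50):
    """Count uncased keywords across papers and return the top_n pairs.

    keyword_lists: iterable of lists, one list of keyword strings per paper
    (an assumed input shape, not the repository's actual data format).
    """
    counts = Counter()
    for keywords in keyword_lists:
        # Lowercase and strip so "Deep Learning " and "deep learning" match;
        # the set ensures a paper contributes at most once per keyword.
        unique = {kw.strip().lower() for kw in keywords if kw.strip()}
        counts.update(unique)
    return counts.most_common(top_n)


papers = [
    ["Deep Learning", "reinforcement learning"],
    ["deep learning ", "Graph Neural Network"],
    ["Reinforcement Learning", "deep learning"],
]
print(keyword_frequencies(papers, top_n=2))
# → [('deep learning', 3), ('reinforcement learning', 2)]
```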

Keywords Cloud

The word cloud built from submission keywords highlights hot topics such as deep learning, reinforcement learning, representation learning, and graph neural networks.
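
A word cloud like this can be rendered from keyword counts with the wordcloud package listed under Requirements. This is a minimal sketch under assumptions: the helper names, image size, background color, and output file name are illustrative choices, not values taken from the repository.

```python
def to_frequency_dict(pairs):
    """Turn (keyword, count) pairs into the dict form that WordCloud accepts."""
    return {keyword: count for keyword, count in pairs if count > 0}


def save_keyword_cloud(pairs, out_path="keywords_cloud.png"):
    """Render (keyword, count) pairs as a word cloud image at out_path.

    out_path is a hypothetical file name chosen for this sketch.
    """
    # Imported lazily so the helper above works without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    # Word size is scaled by frequency; an empty dict raises an error here.
    cloud.generate_from_frequencies(to_frequency_dict(pairs))
    cloud.to_file(out_path)
    return out_path
```

Calling save_keyword_cloud([("deep learning", 3), ("reinforcement learning", 2)]) would write the image to the current directory.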

Title Keywords Frequency

The 50 most common title keywords (uncased) and their frequencies:
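
Title keywords have to be extracted from free text before they can be counted. The sketch below is a simplified, standard-library stand-in: it uses a regular expression and a tiny hand-written stop list instead of the NLTK resources listed under Requirements, and it does no lemmatization, so "graph" and "graphs" are counted separately. The function name and stop list are assumptions for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop list; an assumption, not NLTK's stopwords corpus.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "via", "with"}


def title_keyword_frequencies(titles, top_n=50):
    """Count uncased words in paper titles, skipping stop words."""
    counts = Counter()
    for title in titles:
        # Keep alphabetic tokens, allowing hyphens as in "self-supervised".
        tokens = re.findall(r"[a-z][a-z\-]*", title.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 1)
    return counts.most_common(top_n)


titles = [
    "Learning to Learn with Graphs",
    "Graph Learning for Robots",
    "On the Learning of Policies",
]
print(title_keyword_frequencies(titles, top_n=1))
# → [('learning', 3)]
```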

Title Keywords Cloud

The word clouds formed by keywords of submission titles:

Acknowledgment

Inspired by this repo: https://github.com/evanzd/ICLR2021-OpenReviewData

Owner

Jintang Li
