A repo of open resources and information to help people succeed in a PhD in CS and a career in AI / NLP

Overview

Resources to Help Global Equality for PhDs in NLP / AI

This repo originates from a wish to promote Global Equality for people who want to do a PhD in NLP, following the idea that mentorship programs are an effective way to fight segregation, according to The Human Network (Jackson, 2019). Specifically, we hope that people from all over the world and from all types of backgrounds can share the same source of information, so that success will be a reward for those who are determined and hardworking, regardless of external constraints.

One non-negligible factor in success is access to information, such as (1) knowing what a PhD in NLP is like, (2) knowing what top grad schools look for when reviewing PhD applications, (3) broadening your horizons about what counts as good work, (4) knowing what careers in NLP are like in both academia and industry, and much more.

Contributor: Zhijing Jin (PhD student in NLP at Max Planck Institute, co-organizer of the ACL Year-Round Mentorship Program).

You are welcome to be a collaborator: open an issue or pull request, and I can add you :).

Endorsers of this repo: Prof Rada Mihalcea (University of Michigan). Please add your name here (by a pull request) if you endorse this repo :).

Contents (Actively Updating)

Top Resources

  1. Online ACL Year-Round Mentorship Program: https://acl-mentorship.github.io (You can apply as a mentee, as a mentor, or as a volunteer. As a mentee, you will be able to attend monthly Zoom Q&A sessions hosted by senior researchers in NLP. You will also join a global Slack channel, where you can post your questions at any time, and we will collect answers from senior NLP researchers.)

Stage 1. (Non-PhD -> PhD) How to Apply for a PhD?

  1. (Prof Philip Guo@UCSD) Finding CS Ph.D. programs to apply to. [Video]

  2. (Prof Mor Harchol-Balter@CMU) Applying to Ph.D. Programs in Computer Science (2014). [Guide]

  3. (Prof Jason Eisner@JHU) Advice for Research Students (last updated: 2021). [List of suggestions]

  4. (CS Rankings) Advice on Applying to Grad School in Computer Science. [Pointers]

  5. (Nelson Liu, PhD@Stanford) Student Perspectives on Applying to NLP PhD Programs (2019). [Suggestions Based on Surveys]

  6. A Princeton CS Major's Guide to Applying to Graduate School. [List of suggestions]

  7. (John Hewitt, PhD@Stanford) Undergrad to PhD, or not - advice for undergrads interested in research (2018). [Suggestions]

  8. (Kalpesh Krishna, PhD@UMass Amherst) Grad School Resources (2018). [Article] (This lists lots of useful pointers!)

  9. (Prof Scott E. Fahlman@CMU) Quora answers on the LTI program at CMU (2017). [Article]

  10. (Albert Webson et al., PhD@Brown University) Resources for Underrepresented Groups, including Brown's Own Applicant Mentorship Program (2020, but we will keep updating it throughout the 2021 application season.) [List of Resources]

Specific Suggestions

  1. (Prof Nathan Schneider@Georgetown University) Inside Ph.D. admissions: What readers look for in a Statement of Purpose. [Article]

Improve Your Proficiency with Tools

  1. (MIT 2020) The Missing Semester of Your CS Education (e.g., master the command-line, ssh into remote machines, use fancy features of version control systems).

Stage 2. (Doing a PhD) How to Succeed in a PhD?

  1. (Maxwell Forbes, PhD@UW) Every PhD Is Different. [Suggestions]

  2. (Prof Mark Dredze@JHU, Prof Hanna M. Wallach@UMass Amherst) How to be a successful PhD student (in computer science (in NLP/ML)). [Suggestions]

  3. (Andrej Karpathy) A Survival Guide to a PhD (2016). [Suggestions]

  4. (Prof Kevin Gimpel@TTIC) Kevin Gimpel's Advice to PhD Students. [Suggestions]

  5. (Prof Marie desJardins@Simmons University) How to Succeed in Graduate School: A Guide for Students and Advisors (1994). [Article] [Part II]

  6. (Prof Eric Gilbert@UMich) Syllabus for Eric’s PhD students (incl. the professor's expectations for PhD students). [Syllabus]

  7. (Prof H.T. Kung@Harvard) Useful Thoughts about Research (1987). [Suggestions]

  8. (Prof Phil Agre@UCLA) Networking on the Network: A Guide to Professional Skills for PhD Students (last updated: 2015). [Suggestions]

  9. (Prof Stephen C. Stearns@Yale) Some Modest Advice for Graduate Students. [Article]

  10. (Prof Tao Xie@UIUC) Graduate Student Survival/Success Guide. [Slides]

  11. (Mu Li@Amazon) 博士这五年 (A Chinese article about five years as a PhD student at CMU). [Article]

  12. (Karl Stratos) A Note to a Prospective Student. [Suggestions]

What Are Weekly Meetings with Advisors Like?

  1. (Prof Jason Eisner@JHU) What do PhD students talk about in their once-a-week meetings with their advisers during their first year? (2015). [Article]

  2. (Brown University) Guide to Meetings with Your Advisor. [Suggestions]

Practical Guides

  1. (Prof Srinivasan Keshav@Cambridge) How to Read a Paper (2007). [Suggestions]

  2. (Prof Jason Eisner@JHU) How to Read a Technical Paper (2009). [Suggestions]

  3. (Prof Jason Eisner@JHU) How to write a paper? (2010). [Suggestions]

Memoir-Like Narratives

  1. (Prof Philip Guo@UCSD) The Ph.D. Grind: A Ph.D. Student Memoir (last updated: 2015). [Video] (To find the book itself, you have to dig deeply.)

  2. (Prof Tianqi Chen@CMU) 陈天奇:机器学习科研的十年 (2019) (A Chinese article about ten years of research in ML). [Article]

  3. (Jean Yang) What My PhD Was Like. [Article]

How to Excel in Your Research

  1. The most important step: (Prof Jason Eisner@JHU) How to Find Research Problems (1997). [Suggestions]

Grad School Fellowships

  1. (List compiled by CMU) Graduate Fellowship Opportunities [link]
  2. CYD Fellowship for Grad Students in Switzerland [link]

Other Books

  1. The Craft of Research by Wayne Booth, Greg Colomb and Joseph Williams.

  2. How to Write a Better Thesis by Paul Gruba and David Evans.

  3. Helping Doctoral Students Write by Barbara Kamler and Pat Thomson.

  4. The Unwritten Rules of PhD Research by Marian Petre and Gordon Rugg.

Stage 3. (After PhD -> Industry) What Is Life as an Industry Researcher Like?

  1. (Mu Li@Amazon) 工作五年反思 (A Chinese article reflecting on five years of working in industry). [Article]

Stage 4. (Being a Prof) How to get an academic position? And how to be a good prof?

  1. (Prof Jason Eisner@JHU) How to write an academic research statement (when applying for a faculty job) (2017). [Article]

  2. (Prof Jason Eisner@JHU) How to Give a Talk (2015). [Suggestions]

  3. (Prof Jason Eisner@JHU) Teaching Philosophy. [Article]

Stage 5. (Whole Career Path) How to live out a lifelong career as an NLP researcher?

  1. (Prof Charles Ling@Western University, Prof Qiang Yang@HKUST) Crafting Your Research Future: A Guide to Successful Master's and Ph.D. Degrees in Science & Engineering. [Book]

Further Readings: Technical Materials to Improve Your NLP Research Skills

  1. (Prof Jason Eisner@JHU) Technical Tutorials, Notes, and Suggested Reading (last updated: 2018) [Reading list]

Contributions

All types of contributions to this resource list are welcome. Feel free to open a Pull Request.

Contact: Zhijing Jin, PhD in NLP at Max Planck Institute for Intelligent Systems, working on NLP & Causality.

How to Cite This Repo

@misc{resources2021jin,
  author = {Zhijing Jin},
  title = {Resources to Help Global Equality for PhDs in NLP},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/zhijing-jin/nlp-phd-global-equality}}
}
Owner
PhD in NLP & Causality. Affiliated with Max Planck Institute, Germany & ETH & UMich. Supervised by Bernhard Schoelkopf, Rada Mihalcea, and Mrinmaya Sachan.