Analyzing the most strategic words to guess on Wordle, based on letter frequency distributions

Overview

wordle-analysis

Evaluating different heuristics to determine the most effective solving strategy and building an AI-powered assistant tool to help you win.

Read the article >>>
Play with the AI-based strategic helper tool >>>

The Data

12972 guessable words
2315 mystery words*

* = These words comprise a word bank that is hard-coded into the
Wordle source code and used to randomly pick the daily puzzle each day

Exploratory Analysis

Most common letters

In the words of Pat Sajak, "R, S, T, L, N, E". These are the most frequently appearing letters in the English language and are, as such, used in the Bonus Round of the game Wheel of Fortune. But I wanted to start this project by verifying if they are, in fact, the most frequent letters when we limit our scope to only 5-letter English words.

As it turns out, E, A, R, O, T, L, I, S are the most frequent letters that appear in 5-letter words. Now, quick, think of a 5-letter word using these letters!

Heatmap to analyze letter frequency by positions

Simple Scoring Heuristics

Suppose today's Wordle solution is CRIMP. Let's walk through some example guesses and unpack how to make sense of the resulting colored tiles.

Guess 1: RAISE => 🟨 🟩 => R is present in the word, but not in the right place
Guess 2: MOUNT => 🟨 => M is present in the word, but not in the right place
Guess 3: GRIME => 🟩 🟩 🟩 => R, I, and M are all correct and locked in
Guess 4: CRIMP => 🟩 🟩 🟩 🟩 🟩 => 🎉 yay, you solved the Wordle! 🎉

If you take each guessable word and use it to try to guess each of the 2,315 mystery words, we can get a sense of how much valuable information we obtain using the scoring system above. For each guess, let's count up the number of greens we get, the number of yellows, blacks. Then, using a weighted average to maximize greens and yellows, we can sort our list of guessable words to find the words that yield us, on average, the highest heuristic score. A list of the 5 top words using this approach is provided below! Try starting your Wordle with any one of these words next time and see how you do!

Guess Average Correct 🟩 Average Present 🟨 Average Absent Weighted Average Tile Score 🟩 🟨
SOARE 0.660043 1.107991 3.231965 2.428078
STARE 0.572786 1.192657 3.234557 2.338229
ROATE 0.541685 1.247516 3.210799 2.330886
RAILE 0.544708 1.225054 3.230238 2.314471
AROSE 0.538661 1.229374 3.231965 2.306695

Simulation Results

Approach Best Initial Guess
Max-size Prioritization RAISE
Max-entropy Prioritization SOARE
Max-splits Prioritization TRACE
Owner
Sejal Dua
Data Scientist & Software Engineer
Sejal Dua
Amazon Scraper: A command-line tool for scraping Amazon product data

Amazon Product Scraper: 2021 Description A command-line tool for scraping Amazon product data to CSV or JSON format(s). Requirements Python 3 pip3 Ins

49 Nov 15, 2021
Chat In Terminal - Chat-App in python

Chat In Terminal Hello all. 😉 Sockets and servers are vey important for connection and importantly chatting with others. 😂 😁 I have thought of maki

Shreejan Dolai 5 Nov 17, 2022
Microsoft Azure CLI - Azure Command-Line Interface

A great cloud needs great tools; we're excited to introduce Azure CLI, our next generation multi-platform command line experience for Azure.

Microsoft Azure 3.4k Dec 30, 2022
A fantasy life simulator and role-playing game hybrid distributed as CLI, written in Python 3.

Life is Fantasy Epic (LIFE) A fantasy life simulator and role-playing game hybrid distributed as CLI, written in Python 3. This repository will be pro

Pawitchaya Chaloeijanya 2 Oct 24, 2021
Simple CLI tool to track your cryptocurrency portfolio in real time.

Simple tool to track your crypto portfolio in realtime. It can be used to track any coin on the BNB network, even obscure coins that are not listed or trackable by major portfolio tracking applicatio

Trevor White 69 Oct 24, 2022
Detect secret in source code, scan your repo for leaks. Find secrets with GitGuardian and prevent leaked credentials. GitGuardian is an automated secrets detection & remediation service.

GitGuardian Shield: protect your secrets with GitGuardian GitGuardian shield (ggshield) is a CLI application that runs in your local environment or in

GitGuardian 1.2k Jan 06, 2023
Python wrapper and CLI utility to render LaTeX markup and equations as SVG using dvisvgm and svgo.

latex2svg Python wrapper and CLI utility to render LaTeX markup and equations as SVG using dvisvgm and svgo. Based on the original work by Tino Wagner

Matthias C. Hormann 4 Feb 18, 2022
slipit is a command line utility for creating archives with path traversal elements.

slipit is a command line utility for creating archives with path traversal elements. It is basically a successor of the famous evilarc utility with an extended feature set and improved base functiona

usd AG 35 Dec 23, 2022
🪛 A simple pydantic to Form FastAPI model converter.

pyfa-converter Makes it pretty easy to create a model based on Field [pydantic] and use the model for www-form-data. How to install? pip install pyfa_

20 Dec 22, 2022
Patool is a portable command line archive file manager

Patool Patool is an archive file manager. Various archive formats can be created, extracted, tested, listed, searched, repacked and compared with pato

318 Jan 04, 2023
Output Analyzer for you terminal commands

Output analyzer (OZER) You can specify a few words inside config.yaml file and specify the color you want to be used. installing: Install command usin

Ehsan Shirzadi 1 Oct 21, 2021
A Python-based command prompt concept which includes windows command emulation.

PythonCMD A Python-based command prompt concept which includes windows command emulation. Current features: echo: Input your message and it will be cl

1 Feb 05, 2022
Key-control - A tool for add keys to your Termux app

Key-Control Is a tool for add keys to your Termux app. Cara Penginstalan $ pkg u

Beereva.id 1 Feb 14, 2022
CLI tool to computes CO2 emissions of HPC computations following green-algorithms.org methodology

gqueue gqueue is a CLI (command line interface) tool that computes carbon footprint of HPC computations on clusters running slurm. It follows the meth

4 Dec 10, 2021
Wordle helper: help you print posible 5-character words based on you input

Wordle Helper This program help you print posible 5-character words based on you

Gwan Thanakrit Juthamongkhon 4 Jan 19, 2022
This is my fetch, with ascii arts from neofetch and internet

deadfetch This is my fetch, with ascii arts from neofetch and internet Installation First what you need its python Fedora sudo dnf install python3 sud

DedSec 8 Jan 20, 2022
Faza - Faza terminal, Faza help to beginners for pen testing

Faza terminal simple tool for pen testers Use small letter only for commands Don't use space after command 'help' for more information Installation gi

Ag3ntQ 5 Feb 20, 2022
CLI tool to fix linked references for dates.

Fix Logseq dates This is a CLI tool to fix the date references following a change in date format since the current version (0.4.4) of Logseq does not

Isaac Dadzie 5 May 18, 2022
Doing set operations on files considered as sets of lines

CLI tool that can be used to do set operations like union on files considering them as a set of lines. Notes It ignores all empty lines with whitespac

Partho 11 Sep 06, 2022
A very simple and lightweight ToDo app using python that can be used from the command line

A very simple and lightweight ToDo app using python that can be used from the command line

Nilesh Sengupta 2 Jul 20, 2022