Introduction
Wordle is a word puzzle game where players have 6 guesses to determine the correct 5-letter word. After each guess, the puzzle tells you if each letter is either:
- Not in the word OR
- In the word but a different location OR
- In the word and in that location
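For the programmers reading along, the three outcomes above can be sketched in JavaScript. This is only an illustration, not the puzzle's actual code: the function name `scoreGuess` and the labels `correct`, `present`, and `absent` are made up here, and the way repeated letters are handled (exact matches are claimed first, then the leftover letters are matched left to right) is my assumption about how the puzzle behaves.

```javascript
// Hypothetical helper: scores a guess against the answer.
// Returns one label per letter: 'correct', 'present', or 'absent'.
function scoreGuess(guess, answer) {
  const result = new Array(guess.length).fill('absent');
  const unmatched = {}; // answer letters not claimed by an exact match

  // First pass: letters that are in the word and in that location.
  for (let i = 0; i < guess.length; i++) {
    if (guess[i] === answer[i]) {
      result[i] = 'correct';
    } else {
      unmatched[answer[i]] = (unmatched[answer[i]] || 0) + 1;
    }
  }

  // Second pass: letters that are in the word but in a different location.
  // Assumption: each leftover answer letter can be matched only once.
  for (let i = 0; i < guess.length; i++) {
    if (result[i] !== 'correct' && unmatched[guess[i]] > 0) {
      result[i] = 'present';
      unmatched[guess[i]] -= 1;
    }
  }
  return result;
}

console.log(scoreGuess('arose', 'react'));
// → [ 'present', 'present', 'absent', 'absent', 'present' ]
```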
This was recently introduced to me and its simplicity and logic are beautiful. The NY Times Wordle and Wordle Unlimited are my favorites.
Since programming is fun, I thought why not create a Wordle Helper to assist in guessing the word and share it with the world.
Usage
The Wordle Helper lets you enter the excluded letters, the included letters, and whether each included letter is or is not in a specific location. Based on that input, it tells you the remaining eligible words and suggests words to use for the next guess. Some other Wordle helpers I’ve tried don’t let you specify that the same letter is not in multiple locations, which would narrow the candidate words further.
Try my Wordle Helper and leave your comments below. Read on for additional info.
Technical Details
The Wordle helper is hosted on a public page on a Salesforce community in my developer edition org. When it first loads, it retrieves all the 5-letter English words that are stored in two Apex classes. Those words were parsed out from this Source Dictionary at this GitHub Repo. A single Lightning Web Component (LWC) does all the processing in JavaScript. The component has 15,918 5-letter words. This seems to be more than the NY Times Wordle and Wordle Unlimited use.
The words were embedded in Apex classes so the source dictionary wouldn’t have to be parsed on each load. Originally, the candidate word filtering was done in Apex but it was too slow because Apex doesn’t let one manipulate strings as easily as JavaScript does. So, the code was migrated to the LWC.
After the first release, the Kabble English Word Frequency data set was found. That data was embedded in the component, and the Source Dictionary words are cross-referenced against it to sort the suggested and candidate words from most-to-least common. Since the Wordle puzzles tend to use more common words, this increases the likelihood of getting the word correct.
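Here is a minimal sketch of that kind of frequency sort. It is not the component's actual code: the `wordFrequency` map, its made-up counts, and the `sortByFrequency` name are all hypothetical, and treating words missing from the frequency data as frequency 0 is my assumption.

```javascript
// Hypothetical frequency data: word -> occurrence count (made-up values).
const wordFrequency = new Map([
  ['about', 1000],
  ['arose', 50],
  ['aeons', 2],
]);

// Sorts words from most-to-least common without changing the input array.
// Assumption: words missing from the frequency data count as 0 and sort last.
function sortByFrequency(words, frequency) {
  return [...words].sort(
    (a, b) => (frequency.get(b) || 0) - (frequency.get(a) || 0)
  );
}

console.log(sortByFrequency(['aeons', 'zzzzz', 'about', 'arose'], wordFrequency));
// → [ 'about', 'arose', 'aeons', 'zzzzz' ]
```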
Next Suggested Words Algorithm
Instead of only showing all the candidate words, I wanted to suggest words for the next guess that would allow the player to remove as many candidate words as possible. To do that, the remaining candidate words are determined first. Those candidate words are then analyzed to count how many words contain each letter. Next, the code checks whether the top 5 letters make any words and, if so, shows them. If the top 5 letters make no words, the top 4 letters are tried in the same way. If those make no words either, any words that have the top 3 letters in them are shown. Otherwise, nothing is shown.
Next, sort the words from most-to-least commonly used. The Wordle puzzles tend to use more common words so it’s more likely the more common ones will help more.
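The fallback from 5 to 4 to 3 letters can be sketched roughly as follows. This is an illustration rather than the code in the repo: the function names are hypothetical, and I am assuming that "make a word" means "the word contains every one of those letters" and that the words are drawn from a `dictionary` list passed in by the caller. The frequency sort is left out to keep the sketch short.

```javascript
// Counts, for each letter, how many of the words contain it at least once.
function countWordsPerLetter(words) {
  const counts = new Map();
  for (const word of words) {
    for (const letter of new Set(word)) {
      counts.set(letter, (counts.get(letter) || 0) + 1);
    }
  }
  return counts;
}

// Hypothetical helper: tries the top 5, then 4, then 3 letters and returns
// the dictionary words containing all of them, or [] if nothing matches.
function suggestWords(candidates, dictionary) {
  const topLetters = [...countWordsPerLetter(candidates).entries()]
    .sort((a, b) => b[1] - a[1]) // most-used letters first
    .map(([letter]) => letter);

  for (let n = 5; n >= 3; n--) {
    if (topLetters.length < n) continue; // not enough distinct letters
    const letters = topLetters.slice(0, n);
    const matches = dictionary.filter((word) =>
      letters.every((letter) => word.includes(letter))
    );
    if (matches.length > 0) return matches;
  }
  return [];
}

const candidates = ['crane', 'crate', 'grace', 'trace'];
console.log(suggestWords(candidates, ['arose', 'crate', 'react', 'trace', 'slate']));
// → [ 'crate', 'react', 'trace' ]
```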
Get Eligible Words Algorithm
Each time Get Eligible Words is clicked, the code starts with the 15,918 5-letter words available and:
- Removes the words that contain any of the “Excluded Letters”, if any.
- Filters the remaining words to only those that match all the “Included Character Rules”, such as “The letter R is not the second letter” or “The letter E is the last letter”.
- Sorts the words from most-to-least commonly used. The Wordle puzzles tend to use more common words so it’s more likely the more common ones will help more.
- Shows the count of the candidate words.
- Shows the candidate words.
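The first two steps above can be sketched like this. It is a simplified illustration, not the component's real code: the function name and the rule shape `{ letter, position, isAt }` are hypothetical, and positions are 0-based here. The count shown to the player would simply be the length of the returned array.

```javascript
// Hypothetical rule shape: { letter: 'r', position: 1, isAt: false }
// means "the letter R is not the second letter" (positions are 0-based).
function getEligibleWords(words, excludedLetters, rules) {
  return words
    // Step 1: drop words containing any excluded letter.
    .filter((word) => ![...excludedLetters].some((l) => word.includes(l)))
    // Step 2: keep only the words that satisfy every included-letter rule.
    .filter((word) =>
      rules.every(({ letter, position, isAt }) =>
        isAt
          ? word[position] === letter
          : word.includes(letter) && word[position] !== letter
      )
    );
}

const words = ['arose', 'crane', 'ridge', 'tired', 'purge'];
const rules = [
  { letter: 'r', position: 1, isAt: false }, // R is not the second letter
  { letter: 'e', position: 4, isAt: true },  // E is the last letter
];
const eligible = getEligibleWords(words, 'os', rules);
console.log(eligible.length, eligible);
// → 2 [ 'ridge', 'purge' ]
```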
GitHub Repo
The code is available at my GitHub WordleHelper Repo.
Analysis
Using this Source Dictionary at this GitHub Repo, I wanted to determine how many 5-letter English words there are. Surprisingly, it’s hard to come across a complete list of all English words because different sources have different lists. Supposedly, there are 1 million English words but only roughly 170,000 are in common use according to here. That Source Dictionary has approximately 370,000 words so that is good enough for this fellow.
The analysis started with counting how many words there are for each given word length. This was mostly curiosity.
Word Length | Count of Words of This Length |
1 | 26 |
2 | Have to Calculate correctly still |
3 | 2,130 |
4 | 7,186 |
5 | 15,918 |
6 | 29,874 |
7 | 41,998 |
8 | 51,627 |
9 | 53,402 |
10 | 45,872 |
11 | 37,539 |
12 | 29,125 |
13 | 20,944 |
14 | 14,149 |
15 | 8,846 |
16 | 5,182 |
17 | 8,846 |
18 | 1,471 |
19 | 760 |
20 | 359 |
21 | 168 |
22 | 74 |
23 | 31 |
24 | 12 |
25 | 8 |
26 | 0 |
27 | 3 |
28 | 2 |
29 | 2 |
30 | 0 |
31 | 1 |
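For the curious, a count like the one in the table can be produced with a few lines of JavaScript. This is a minimal sketch with a tiny made-up word list, not the script I actually ran; the `countByLength` name is hypothetical. Note that lengths with no words simply don't appear in the result instead of showing as 0.

```javascript
// Counts how many words there are for each word length.
function countByLength(words) {
  const counts = new Map();
  for (const word of words) {
    counts.set(word.length, (counts.get(word.length) || 0) + 1);
  }
  // Sort by length so the result reads like the table above.
  return new Map([...counts.entries()].sort((a, b) => a[0] - b[0]));
}

// Made-up sample; the real input would be the full source dictionary.
const sample = ['arose', 'a', 'cat', 'an', 'crane', 'at'];
for (const [length, count] of countByLength(sample)) {
  console.log(`${length} | ${count} |`);
}
```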
Next, my goal was to determine a good starting word. For this, each letter was checked against all 15,918 5-letter words to count how many words contain it. The following table shows the most-to-least used letters and their word count. The letter A is the most used letter in 5-letter words, which is surprising. I would’ve thought E or S.
The top 5 used letters, A E S O R, spell the word “arose” so that’s my starting word.
Letter | Word Count |
a | 8,392 |
e | 7,800 |
s | 6,537 |
o | 5,219 |
r | 5,143 |
i | 5,067 |
l | 4,246 |
t | 4,189 |
n | 4,043 |
u | 3,361 |
d | 2,811 |
c | 2,744 |
y | 2,521 |
m | 2,494 |
p | 2,299 |
h | 2,284 |
b | 2,089 |
g | 1,971 |
k | 1,743 |
f | 1,238 |
w | 1,171 |
v | 878 |
z | 474 |
j | 376 |
x | 361 |
q | 139 |
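Here is a rough sketch of how a table like this and the starting word could be worked out. The function names are hypothetical and the word list is a tiny made-up sample, so its top letters differ from the real table. I am assuming a letter is counted once per word even if it appears in that word twice.

```javascript
// Returns [letter, wordCount] pairs sorted from most-to-least used.
// Assumption: a letter counts once per word, even if it repeats in the word.
function letterWordCounts(words) {
  const counts = new Map();
  for (const word of words) {
    for (const letter of new Set(word)) {
      counts.set(letter, (counts.get(letter) || 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Returns the words that use each of the given letters exactly once.
function wordsSpelledBy(letters, words) {
  const key = [...letters].sort().join('');
  return words.filter((word) => [...word].sort().join('') === key);
}

// Made-up sample; the real input would be all 15,918 5-letter words.
const sample = ['arose', 'raise', 'stare', 'eerie'];
const topFive = letterWordCounts(sample).slice(0, 5).map(([letter]) => letter);
console.log(topFive, wordsSpelledBy(topFive, sample));
// → [ 'r', 'e', 'a', 's', 'i' ] [ 'raise' ]

// With the top 5 letters from the table above:
console.log(wordsSpelledBy('aesor', ['arose', 'crane']));
// → [ 'arose' ]
```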
The 5-letter word frequency data from Kabble has 39,932 5-letter words, so let’s see what its word count for each letter is:
Letter | Word Count |
a | 21,942 |
e | 18,907 |
o | 14,626 |
i | 13,749 |
s | 13,683 |
r | 12,185 |
n | 11,446 |
l | 10,375 |
t | 9,855 |
c | 7,421 |
d | 6,970 |
m | 6,957 |
u | 6,805 |
h | 5,633 |
g | 5,584 |
p | 5,511 |
b | 5,003 |
k | 4,966 |
y | 4,617 |
f | 3,208 |
v | 2,597 |
w | 2,391 |
z | 1,868 |
j | 1,485 |
x | 1,456 |
q | 420 |
Tips
- Use “Arose” as the starting word because those letters are the most used letters in the 5-letter words.
- When using a suggested word for the next guess, choose one that you know. This seems to produce better results because the puzzles tend to use more “common” words.
- Use words without duplicate letters to help eliminate additional letters if possible.
- If no suggested words are found, choose a word you know from the candidate words. This tends to happen when there are few remaining candidate words.
Limitations
- The 5-letter word dictionary used appears to be a superset of the words used in the Wordle puzzles online, so a suggested or candidate word shown may not be considered a word by the puzzle.
- This tool doesn’t guarantee the word will be guessed right.
To Dos
- Trim the 5-letter dictionary to only the “common” words used. One could code a little script to look up each word at dictionary.com to see if it “exists” and exclude it if not. If you have this already, please share!
- Get more 5-letter metadata. It would be very useful to rank the suggested and candidate words based on their “usage frequency” so that the more commonly used ones are shown first. Done: was able to get the list of English words and their frequency in literature from Kabble. Using that data, the suggested and candidate words are now sorted from most-to-least common.
- Refactor the LWC JavaScript code into service classes so it’s organized better.
- Genericize the Wordle helper to assist with Wordle puzzles of any length.
This was a lot of fun to produce and hopefully it helped you out too!