Recently we’ve discussed optimizing SQL queries in FileMaker, and had some fun with various SQL experiments. Today we’re going to explore some ways FileMaker can use ExecuteSQL and the While function to perform letter frequency and text pattern analysis on candidate words for the popular Wordle game.
The list of words comes from https://github.com/tabatkins/wordle-list and purports to include the actual answer words, as well as all allowable guess words. I don’t know how valid this list of words actually is, or, assuming it is currently valid, whether it is carved in stone or will change at some point in the future.
Demo file: SQL-Multi-Table-Experimentation-Wordle.zip
Some Notes
- Today’s file is functionally identical to the one from last time; if you already have it, there’s no need to download this one.
- No attempt is made to differentiate between daily Wordle words that have already appeared vs. those that have yet to appear.
- SQL is case-sensitive in the WHERE clause; all our examples today use lower case letters so we may safely ignore the issue for the duration of this article.
- For a general-purpose introduction to SQL in FileMaker, see Beverly Voth’s Missing FM 12 ExecuteSQL Reference.
About the Demo
Today we will be looking at the “words” table, which has 12,947 entries — each of which is a five letter word — comprising all eligible Wordle guess words, with the actual answer words indicated by a 1 in the “wordle” field.
All of today’s examples will involve typing or pasting queries into the data viewer, and just to get our feet wet, we can generate a list of all words, and a count of all words, via these simple statements.
And we can narrow things down to just the Wordle answer words by adding a WHERE clause.
Example 1: Overall Letter Frequency
If we take 2,309 as our total number of answer words, and multiply by five, that gives us 11,545 letters in “Wordlespace”. To determine how many times each letter appears, we can perform a rudimentary frequency analysis like so…
While ( [ letterList = "a¶b¶c¶d¶e¶f¶g¶h¶i¶j¶k¶l¶m¶n¶o¶p¶q¶r¶s¶t¶u¶v¶w¶x¶y¶z" ; wordleList = ExecuteSQL ( " SELECT word FROM words WHERE wordle = 1 " ; "" ; "" ) ; vc = ValueCount ( letterList ) ; counter = 1 ; result = "" ] ; counter <= vc ; [ letter = GetValue ( letterList ; counter ) ; frequency = PatternCount ( wordleList ; letter ) ; result = List ( result ; letter & " • " & frequency ) ; counter = counter + 1 ] ; result )
…producing an alphabetically sorted list that looks like this.
Explanation:
- Generate a list of the 26 letters in the English alphabet
- Generate a list of all words with 1 in the wordle field
- For each letter calculate how many times it appears in the word list
Note: if you aren’t familiar with the While function, or could use a refresher, see
Can we sort the results by frequency, rather than alphabetically? Yes, by wrapping the result from the above example in SortValues and using the “-2” (numeric, descending) argument.
Viewed this way, one gets the idea that “orate” might be a good starting word. We’ll return to frequency analysis in our final example.
Example 2: String Matching
Suppose you’re curious re: how many Wordle words contain a certain substring; you don’t care where in the word the string appears, but you do want the characters to appear sequentially… for example the three-character substring “tor”.
ExecuteSQL ( " SELECT word FROM words WHERE wordle = 1 AND word LIKE ? " ; "" ; "" ; "%tor%" )
Explanation: The “%” wildcard indicates zero or more characters. By placing the % on either side of the substring, we are indicating that the substring may occur anywhere within the word.
Example 3: Filling In Blanks
Let’s say you’ve guessed B, A and E, as, respectively, the first, third and fifth letters, but don’t know the second and fourth ones. You can use the LIKE operator and the single-character wildcard “_” to display matching candidates.
ExecuteSQL ( " SELECT word FROM words WHERE wordle = 1 AND word LIKE ? " ; "" ; "" ; "b_a_e" )
Explanation: The “_” wildcard represents a single character.
Example 4: Words Containing Four Vowels (A, E, I, O or U)
Five letter words containing four vowels are a rare occurrence in the English language, so let’s initially do this query without restricting the results to Wordle words.
While ( [ wordlist = ExecuteSQL ( "SELECT word FROM words" ; "" ; "" ) ; result = "" ; vc = ValueCount ( wordlist ) ; counter = 1 ] ; counter <= vc ; [ item = GetValue ( wordlist ; counter ) ; vowels = Filter ( item ; "aeiou" ) ; match = If ( Length ( vowels ) >= 4 ; item ; "" ) ; result = List ( result ; match ) ; counter = counter + 1 ] ; result )
Explanation:
- Generate a list of all words
- Iterate through the word list
- Populate “match” if a given word contains 4 or more vowels
- Build up the result list with non-empty match words
And if we want to restrict results to just Wordle words…
Example 5: Words Without A, E, I, O or U
What if we want to turn things around, and list Wordle words that don’t contain an A, E, I, O or U?
While ( [ theList = ExecuteSQL ( " SELECT word FROM words WHERE wordle = 1 " ; "" ; "" ) ; vc = ValueCount ( theList ) ; counter = 1 ; result = "" ] ; counter <= vc ; [ itemRaw = GetValue ( theList ; counter ) ; item = If ( isEmpty ( Filter ( itemRaw ; "aeiou" ) ) ; itemRaw ) ; result = List ( result ; item ) ; counter = counter + 1 ] ; result )
Explanation:
- Generate a list of all Wordle words
- Iterate through the word list
- Populate “item” if a given word contains no standard vowels
- Build up the result list with non-empty item words
Example 6: Position-Specific Letter Frequency
Returning to the topic of frequency analysis, in example 1 we looked at overall letter frequency. This time we’d like to generate a position-specific list, and once again, a combination of While and ExecuteSQL can get the job done.
While ( [ letterList = "a¶b¶c¶d¶e¶f¶g¶h¶i¶j¶k¶l¶m¶n¶o¶p¶q¶r¶s¶t¶u¶v¶w¶x¶y¶z" ; vc = ValueCount ( letterList ) ; counter = 1 ; result = "" ] ; counter <= vc ; [ letter = GetValue ( letterList ; counter ) ; f1 = ExecuteSQL ( "SELECT COUNT ( * ) FROM words WHERE SUBSTRING ( word, 1, 1 ) = ? AND wordle = 1 " ; "" ; "" ; letter ) ; f2 = ExecuteSQL ( "SELECT COUNT ( * ) FROM words WHERE SUBSTRING ( word, 2, 1 ) = ? AND wordle = 1 " ; "" ; "" ; letter ) ; f3 = ExecuteSQL ( "SELECT COUNT ( * ) FROM words WHERE SUBSTRING ( word, 3, 1 ) = ? AND wordle = 1 " ; "" ; "" ; letter ) ; f4 = ExecuteSQL ( "SELECT COUNT ( * ) FROM words WHERE SUBSTRING ( word, 4, 1 ) = ? AND wordle = 1 " ; "" ; "" ; letter ) ; f5 = ExecuteSQL ( "SELECT COUNT ( * ) FROM words WHERE SUBSTRING ( word, 5, 1 ) = ? AND wordle = 1 " ; "" ; "" ; letter ) ; result = List ( result ; letter & " • " & f1 & " " & f2 & " " & f3 & " " & f4 & " " & f5 ) ; counter = counter + 1 ] ; result )
Explanation:
- Generate a list of the 26 letters in the English alphabet
- Iterate through the letter list
- For each letter issue five SQL statements to calculate the frequency of that letter for each position
- Build up the result list accordingly
Some interesting takeaways…
- “s” is the most common initial letter
- “e” and then “y” are the most common ending letters
- “a” is most common letter at positions 2 and 3
- “e” is the most common letter at position 4
- no words begin with “x”
- no words end in “j”, “q” or “v”
- no word has “q” in the fourth position
- only one word ends in “u”
Closing Remarks
A good resource for all things Wordle is this community on Reddit.
Initially, there was a list of ~2300 words for possible answers that the creator’s girlfriend set up. I don’t know if this carried over to the NYT version. Here is an interesting take on how to solve it.
Hi Blair, completely agree. A lot of food for thought in there (and not just about Wordle).