Unique Records Revisited, part 2

When you consider how easily most common reporting tasks are accomplished in FileMaker Pro, the lack of a built in, clear cut method to count unique values within a given found set seems a bit surprising. But where there’s a will, there’s a way, and in part 1 we looked at a method that required the found set be sorted. Well I have some good news: today’s demo (Count Unique – Three Variations, 6.4 Mb) has no sorting requirement.

Here we have a flat file (single table) of contact information. For any given found set, the primary key (ID) will of course be unique, but each of the other fields will potentially contain duplicates, and we want to be able to quickly count the unique values without having to first sort the found set.

The basic approach can be summarized in a few words:

  1. Build a multiline key (MLK) of IDs for all records in the found set.
  2. Relate the MLK to the ID field.
  3. Create a conditional value list based on this relationship.
  4. Count the values in the value list.

Step 1 requires a bit of elaboration, so I’d like to pretend for a moment that step 1 has already happened, and we have a return-delimited stack of IDs sitting in a global text field, g_MLK.

To get a sense of the big picture, we’ll look at steps 2, 3 and 4 now, and then circle back to step 1 for an in-depth exploration.

In step 2 we relate g_MLK to ID (the primary key field in Contacts), like so:

Since at this point the ID stack in g_MLK has a one-to-one correspondence with the records in the found set, if we are sitting on a layout based on the Contacts table occurrence, and we look across the relationship, we will see all the records in the current found set. We can easily confirm this by comparing Get(FoundCount) with Count(Contacts Filter::ID), or by comparing the contents of g_MLK with List(Contacts Filter::ID).

If you aren’t familiar with multiline keys, they can seem a bit like magic at first. Normal key fields will only contain one value. But if you place a return delimited list of keys into a text field, and use it as a relational predicate, any entry in that list can produce a valid relational match. I like to think of it as the relational equivalent of an “or” find.

Now comes step 3, and we actually need to define seven value lists, because there are seven fields we want to count unique values for. Since field-based value lists only show unique values, they are ideally suited for our needs (the reason they only show unique values is they derive their values from field indexes, rather than field contents — which explains why you cannot base a value list on an unindexed field).

Each value list will be based on a different field… here’s the setup for “Found Set Cities”. We want the value list to be filtered by the IDs sitting in g_MLK, so we need  make sure that we have:

  1. Specified the related table occurrence (Found Set Filter)
  2. Specified the correct field (in this case, City)
  3. Chosen “Include only related starting from: Contacts”

This type of value list is variously known as a “filtered”, “conditional” or (less commonly) a “related” value list.

And in step 4, we count the values in each value list, using the ValueCount and ValueListItems functions (each “Unique_” field is a global number field) :

…with a result such as the one you see at the left. But what about step 1? Well, enough procrastination, let’s get to it. Building a multiline key of values from the current found set it really a topic in itself, with numerous methods to consider, which is why I saved it for last.

As its name implies, today’s demo (Count Unique – Three Variations, 6.4 Mb) uses three different techniques to build the MLK. At the risk of stating the obvious: the ideal approach would be the top speed performer in all hosting scenarios (stand-alone, LAN, or WAN), regardless of found set size or platform.

Of the three approaches in this demo, the “Copy All Records” method has been around since the 1990s, and of course the AppleScript method is Mac-only, but certainly worth taking a look at. The AppleScript and Export/Import techniques come from a demo file (FastVariables) put together by Ralph Learmont, and modified by Bruce Robertson, specifically to demonstrate the various methods of loading the IDs for the current found set into a global field. I strongly recommend downloading FastVariables and experimenting with the various methods… if possible, on a LAN and WAN, not just locally.

And now, ladies and gentlemen, after hours of testing, and discussion with a number of colleagues, it appears that there is actually a clear winner under all circumstances: the unambiguous performance winner is the Export/Import technique from Ralph Learmont. If this were a chess tournament, I would nominate this technique for a “brilliancy prize”, because it is a fine example of outside-the-box thinking, taking advantage of several obscure FileMaker behaviors. Let’s take a look at this portion of the “count unique” script:

It starts out innocuously enough, by exporting the IDs for the found set as a text file, Exp.txt, and placing it into FileMaker’s temp folder.

Next a $folderpath variable is assigned, and one might wonder, “What the heck is that  about? We already specified the complete path for the export file. And why does the Import Records step refer to $folderpath instead of $filepath?”

Well, that’s what is so clever about Ralph’s approach. If he were simply doing a standard “file” import, FileMaker would attempt to insert one line from the text file into each record. Instead, Ralph is utilizing FileMaker’s ability to import a whole folder of either text or media files. Observant readers may be saying, “But we don’t have a whole folder full of text files; only Exp.txt which we just exported.” Guess what? That’s exactly what we want.

We aren’t trying to import multiple files, we’re trying to ensure that the contents of Exp.txt becomes a return-delimited list in g_MLK, and that’s precisely what happens when we use this approach.

Incidentally, since we have multiple records in our found set, but a single text file to import, FileMaker would like to display a dialog to warn us that it did not have enough text files to populate all the records (at that moment, it doesn’t know or care that we’re importing into a global field). We don’t want our users to see that error, hence the use of Set Error Capture in the script.

Note: I do not recommend that the Export/Import method be used in FM 9 or earlier. The reason this article is tagged as “version FM 10 or later” is because of a quiet but very welcome behavior change introduced in that version: formerly FileMaker stored its temp files in the generic temp folder designated by the OS — now FileMaker creates its own temp folder (typically named S10)  inside the main temp folder. You can use the Get(TemporaryPath) function to see exactly where this folder is located. Since this folder is created for FileMaker’s exclusive use, a) FileMaker will purge its contents automatically on application exit, and b) it is highly unlikely there will be a text file sitting in that folder unless a FM script has placed one there.

A final comment about the demo file. FileMaker will cache the IDs when you click any of the buttons, so it’s imperative to close the file between each test if you want accurate timings. Once caching has taken place, all the methods will run substantially faster.

5 thoughts on “Unique Records Revisited, part 2

  1. John Galt

    How can I get Filemaker to perform a find that only returns unique records? This is a common and simple task in MySQL but I cannot find an easy way to do this in Filemaker.

    For example, let say I poll 500 people and I ask them their favorite ice cream flavor. I want to perform a find in the ice cream flavor field and see a list of all of the flavors…but I don’t want to see 200 records for chocolate, 50 for vanilla, etc. I just want to see a found set that single record for each flavor that was submitted.

    I realize that I can generate a report that displays this information but I want to do this with a standard find command. Is this possible?

    Reply
    1. Kevin Frank Post author

      Hi John, not directly with a single find command. You’d need to do some additional scripting to collapse the found set to just unique entries.

      Reply
    2. Kevin Frank Post author

      Or perhaps my previous response took your question (can you get the answer using a simple find request) too literally. What you care about is the list of unique values, so why not just view the index of the Flavor field? That would show you the unique values. And if you base a value list on the contents of the Flavor field, you can use the ValueListItems() function to make those values available to the calc engine.

      Reply
  2. Arild Schonberg

    Thank you Kevin!
    After many years of programming, this “out-of-the-box” thinking gives us new tools to look at the data.
    With an SQL-plugin or FM12, we can use SELECT DISTINCT to set the g_MLK with unique values.
    Does FM12’s implementation of SQL support this?

    Reply
    1. Kevin Frank Post author

      Hello Arild,

      Glad to hear you found the article useful. With regards to your question…

      The issue is that SQL (via plug-in or ExecuteSQL) has no concept of a FileMaker found set. I proposed a work around using a custom function called FoundSetToSQL which you can read about here:

      https://filemakerhacks.com/?p=5013

      Hope this helps,
      Kevin

      Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s