Fast Summaries Re-revisited

My recent “Virtual List on Steroids” presentation at DIG-FM, and the preparation of a revised version to give at dotFMP next week, has prompted me to reconsider certain assumptions re: optimizing performance both locally, and across a LAN and/or WAN.

2018-05-29_094509

Specifically, there are two things we’re going to look at today. First, a way to dramatically speed up sorting on related data. Then, having incorporated that trick into the Fast Summary approach, we will compare Fast Summaries vs. Multi-Finds under various scenarios.

Demo files (structurally identical):

Sorting On A Related Field

In the smaller of the two demo files, we have 80,000 “order” (purchase order) records and we want to sort them alphabetically by vendor name, which lives in a related “vendor” table containing 1,500 records… and as you might surmise, these two tables are linked via a standard foreign-key-to-primary-key relationship, i.e.,

2018-05-29_174800.png

Here are field defs for the two tables:

2018-05-29_20-23-13

2018-05-29_20-25-15

To avoid skewed results due to caching, we’re going to close and then re-open the file between each sort. First let’s do the obvious, and sort our orders by the related vendor name like so:

2018-05-29_175660.png

It takes a glacial 12+ seconds on my aging Windows 7 laptop, and this is with the file running locally, not across a LAN or WAN… but that’s not particularly surprising since we’re asking FileMaker to sort 80,000 records on a related field (as opposed to sorting on a field in the same table).

Okay, let’s exit and restart the demo. Now what about the fast sort?

2018-05-29_190847.png

Don’t blink or you’ll miss it. The massive speed improvement is due to sorting on a local rather than on a related field… however… taking another look at field defs…

2018-05-29_20-23-13

…we are not redundantly storing the vendor name, or any other de-normalized data, in the order table. So what the heck is going on?

Here’s the trick: sort on the foreign key (id_vendor) in the order table, but instead of doing a standard ascending sort, apply a custom order based on a value list called “vendor id and name”…

2018-05-29_230209.png

…which is defined like so:

2018-05-29_190530

Bottom line: this simple trick enables “related data” sorts to run 10x faster.

Fast Summaries vs. Multi-Finds

2018-05-27_174520

Mikhail Edoshin’s “Fast Summary” technique has previously been the subject of two articles here on FileMaker Hacks, first in 2011 and then again in 2016…

Briefly, the two approaches work as follows:

Fast Summary (a.k.a. FS)

  • Locate records in the Order table
  • Sort by vendor id in the Order table (using the trick mentioned above)
  • Loop through the found set stopping once on each unique vendor record, push summarized values into variables (using GetSummary)
  • Aggregate the variables into a global text field

Multi-Find (a.k.a MF)

  • Grab a list of vendor ids (sorted in vendor name order)
  • Loop through this list, do a find in the Order table on each vendor id, and push summarized values into variables (values come directly from summary fields)
  • Aggregate the variables into a global text field

As mentioned at the outset, the Fast Summary technique in today’s demos takes advantage of the custom value list sort trick, and is competitive with Multi-Find under the following test scenarios. Here are results from tests performed on a brand spanking new iMac, as well as on my aging Windows 7 laptop.

2018-05-31_121409

Note that on both platforms, Fast Summaries are a winner with server-side scripts. And if you decide to host the demo files on either FM Server or FM Cloud, you will be given the option to perform the script client side or server side.

2018-05-30_004840.png

(Unless you’re feeling masochistic, I don’t recommend running the reports client side when server side is an option.)

Looking at the above test results for FM Cloud, given that the reports are generated server side, you might wonder why they take so much longer than the local equivalents. The explanation is that there’s no free lunch when it comes to passing a large script result from the server back to the client across a WAN… the data is aggregated and assembled very quickly, but we have to wait for the script result before we can display it.

A Few Closing Thoughts

One thing I particularly like about Fast Summaries is that they can be used with any found set as a starting point (see Fast Summaries Revisited for more on this), whereas Multi-Finds require that you script your search criteria in advance. Fast Summaries are more flexible, as well as a bit less work to set up, and they perform better server-side if you can get away with sorting on same-table fields only.

Both of these techniques can be used to aggregate data for display using Virtual List, as well as a variety of other purposes (e.g., as part of a routine to generate JSON), and I recommend testing to see which is better suited to your current requirements.

 

9 thoughts on “Fast Summaries Re-revisited

  1. Daniel Wood

    I must have been late to the party, but I’ve just read about your sorting trick for sorting with related data and am blown away, that is very very VERY awesome!

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.