N + 1 query pattern

What is an N + 1 query?

An N + 1 query is a pattern where one query runs inside the results of another: one query fetches N rows (get the orders), then one additional query runs per row (get each order's lines), for N + 1 round trips in total. It's fairly common in legacy F&O code and is now generally considered bad practice. But recognizing it versus using it can be somewhat confusing. First, let's review.

Classic (or legacy): selects in loops (the N + 1 query pattern)

Below is an example of what this may look like:

while select salesLine
    where salesLine.SalesId == _salesId
{
    // one extra round trip to the database per sales line
    select firstOnly inventSum
        where inventSum.ItemId      == salesLine.ItemId
           && inventSum.InventDimId == salesLine.InventDimId;

    // do something
}

We're querying data just to get more data. This can, and does, kill performance. Whenever possible, we want to ask the database for all the data we need to act on in a single request. Imagine you're at a restaurant and you're in a hurry. If you ask the waiter for a drink and a food selection from the menu, say you're not having dessert, and ask for the check to be brought with the food, you've front-loaded all your requests so the server won't keep coming back to ask for more instructions. We want to do something similar with the database: "here's what I want and here's how I want it." An example is below:

while select salesLine
    where salesLine.SalesId == _salesId
    join inventSum
        where inventSum.ItemId      == salesLine.ItemId
           && inventSum.InventDimId == salesLine.InventDimId
{
    // do something
}

If that's not an option, you can prefetch the data so you can process or analyze it later. This involves building a map, keyed on the lookup fields, from a single query, then looping over the lines without issuing any additional selects, like so:

internal final class RunnableClass1
{
    public static void main(Args _args)
    {
        SalesTable  salesTable;
        SalesLine   salesLine;
        InventSum   inventSum;

        Map         sumByItemDim = new Map(Types::Container, Types::Real);

        container   key;
        real        availPhys;

        // i.e. focus on one sales order; replace as needed
        select firstOnly salesTable
            where salesTable.SalesId == 'SO-000123';

        if (!salesTable.RecId)
        {
            info("Sales order not found");
            return;
        }

        // a single query builds the map
        while select ItemId, InventDimId, AvailPhysical
            from inventSum
            exists join salesLine
            where salesLine.SalesId      == salesTable.SalesId
               && salesLine.ItemId       == inventSum.ItemId
               && salesLine.InventDimId  == inventSum.InventDimId
        {
            key = [inventSum.ItemId, inventSum.InventDimId];

            if (sumByItemDim.exists(key))
            {
                sumByItemDim.insert(key, sumByItemDim.lookup(key) + inventSum.AvailPhysical);
            }
            else
            {
                sumByItemDim.insert(key, inventSum.AvailPhysical);
            }
        }

        // next, loop over SalesLine with no additional selects
        while select salesLine
            where salesLine.SalesId == salesTable.SalesId
        {
            key = [salesLine.ItemId, salesLine.InventDimId];

            if (sumByItemDim.exists(key))
            {
                availPhys = sumByItemDim.lookup(key);
                info(strFmt("Line %1: %2 / %3 => AvailPhys=%4",
                        salesLine.LineNum,
                        salesLine.ItemId,
                        salesLine.InventDimId,
                        availPhys));
            }
            else
            {
                // not found in the map: treat as 0 / missing
                info(strFmt("Line %1: %2 / %3 => AvailPhys not found (0)",
                        salesLine.LineNum,
                        salesLine.ItemId,
                        salesLine.InventDimId));
            }
        }
    }
}

With this, we're still hitting two data sources, but we're not looping one inside the other. Nesting queries inside a loop is what typically causes the performance issues. Lastly, if you have a unique workload that genuinely requires some unusual interactions with the data, create a batch job for it; F&O offers several batch job frameworks for different types of workloads.
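As a rough illustration of that last point, a heavy prefetch-and-process workload could be wrapped in a batch class. Below is a minimal sketch using the classic RunBaseBatch pattern; the class name, the salesId parameter, and the empty run() body are placeholders I've made up for the example, and newer code would typically use the SysOperation framework instead:

```
internal final class SalesAvailPrefetchBatch extends RunBaseBatch
{
    // hypothetical parameter: the order to process
    SalesId salesId;

    #define.CurrentVersion(1)
    #localmacro.CurrentList
        salesId
    #endmacro

    // serialize the parameters so the batch framework can persist them
    public container pack()
    {
        return [#CurrentVersion, #CurrentList];
    }

    // restore the parameters when the batch task runs
    public boolean unpack(container _packedClass)
    {
        Version version = RunBase::getVersion(_packedClass);

        switch (version)
        {
            case #CurrentVersion:
                [version, #CurrentList] = _packedClass;
                return true;
            default:
                return false;
        }
    }

    public void run()
    {
        // build the map with one query, then loop the lines,
        // exactly as in the example above
    }

    public static SalesAvailPrefetchBatch construct()
    {
        return new SalesAvailPrefetchBatch();
    }

    public static ClassDescription description()
    {
        return "Prefetch sales line availability";
    }

    public static void main(Args _args)
    {
        SalesAvailPrefetchBatch batch = SalesAvailPrefetchBatch::construct();

        if (batch.prompt())
        {
            batch.run();
        }
    }
}
```

The prompt() call shows the standard batch dialog, so the same class can run interactively or be scheduled on the batch server.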