There are several functions in SQL that are very useful such as the Lag() and Lead() functions. The lag function provides the opportunity to obtain a previous record from a specified field. The old way of doing this without the built-in lag function would have been to join a table to itself to retrieve the previous record.
In the following SQL lag example, we can see the use of this function to retrieve the previous invoice date field of a customer.
The function is used in a single field (column) and the format looks like this:
LAG( <field_name>, offset ) OVER( PARTITION BY <grouping_field> ORDER BY <field_name>) AS PreviousValue
Calculating Days Since
This gives us a pretty useful bit of information on a record and tells us if a customer was previous invoiced and how long ago it was. We may even add such a column as “Days Since Last Invoice”.
The resulting output provides us with detailed invoice information on our customers. If a previous date does not exist for a record the resulting output is NULL. This also means that our “days since” field also returns NULL. Some may want to default this value to ‘Unknown’ or the like.
Lag functions are very useful when retrieving dates, but can also be used to retrieve other values of importance. For example we might want to know such things as:
- Previous Salary
- Previous Title
- Previous Company
- Previous Invoice Amount
- Employee Raise Amount
- The list goes on…
As we can see, this function has a multitude of purposes and can provide very useful measures for reporting.
Finding The First Record
One last very useful purpose of the lag function is to find the first record of a set. For example, if we use our previous dataset we can include a WHERE clause looking for only where previous values are NULL.
The output now only returns the first invoice date of our customers since that was what our partition is based on.