VistaDB 6 Introduces Shortcut Evaluation
For more information on what’s new in VistaDB 6 see What’s Coming in VistaDB 6
SQL is a very flexible data formatting language. Representing results could be as simple as:
SELECT Id FROM dbo.User_Profile;
Which just copies the Id value with its existing data type from the table row into the result or could get very creative, like:
SELECT
CASE WHEN Added_Dt < '1/1/2018'
THEN Full_Name
ELSE CAST(First_Name + ' ' + Last_Name as VARCHAR(255))
END AS Full_Name
FROM dbo.User_Profile;
In this case we have to determine which output path to go down based on one column’s value, then possibly concatenate two column together and override their type before returning the value.
VistaDB has to cater for all of the different options that can be represented in T-SQL. To do this, the engine has an object structure set up when preparing the query and evaluated each time the current row is changed.
Lets Cut to the Chase
While carefully profiling the engine’s performance we started finding cases where the simple act of checking whether an option was enabled or not, or just calling into another function, took significant time. This highlights a challenge with VistaDB - because it’s running at such a low level things that wouldn’t normally be significant can have an outsized impact on performance.
As an experiment, we modified the engine to have a bypass for the simplest case - if a column is being taken as is from the table and presented back, we grab a delegate to the lowest method that gets the raw value and just execute that directly.
The performance improvement was surprising - around 25% in the case of selecting all of the columns and rows from a table with 36 columns. This result is all the more encouraging because it applies in many common cases: Filling data tables and most ORM queries.
Release the Hounds
We’re building on this initial success by introducing a strategy that looks for any opportunity to skip levels of the object hierarchy even in cases where the cost for each level is small because the cumulative effect can be substantial. The net result is that simple queries will benefit the most whereas the most complicated column operation will perform as it did before (because it needs every level of the original approach).
For VistaDB 6 we’re targeting the most common scenarios, but we’ll be looking in future minor releases for additional places to apply this and gain even more performance.