Unexpected behavior of binary conversions (COALESCE vs. ISNULL)
Can you comment on what approach shown below is preferable? I hope the question will not be blocked as "opinionated". I would like to believe there is an explanation that makes that clear.
Context: I have a code for mirroring 3rd party table contents to my own table (optimization). It worked some time flawlessly until the size/modification of the database reached some threshold.
The optimization is based on row version values of more tables, and remembering the maximum of the values from the source tables. This way I am able to update my local table incrementally, much faster than rebuilding it from time to time from scratch.
The problem started to appear when the row-version value exceeded the 4byte value. After some effort, I have spotted that the upper 4 bytes of the
binary(8) value were set to 0. Later, the suspect was found to have a form
The COALESCE was used to cover the case when the local table is fresh, containing now data — for comparing the
MAX(row_version) of source tables with something meaningful.
The examples to show the bug: To simulate the last mentioned situation, I want to convert the NULL value of the
binary(8) column to 1. I am adding also the
ISNULL usage that was added later. The original code contained the COALESCE only.
DECLARE @bin8null binary(8) = NULL SELECT 'bin NULL' AS the_variable, @bin8null AS value SELECT 'coalesce 1' AS op, COALESCE(@bin8null, 1) AS good_value SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8null, 1)) AS good_value SELECT 'isnull 1' AS op, ISNULL(@bin8null, 1) AS good_value SELECT 'isnull 0x1' AS op, ISNULL(@bin8null, 0x1) AS bad_value
(There is a bug in the image
coalesce 0x1 + convert fixed later in the code to
coalesce 1 + convert, but not fixed in the image.)
The application bug appeared when the binary value was bigger than the part that could be stored in 4 bytes. Here the
0xAAAAAAAA was used. (Actually, the
0x00000001 was the case, and it was difficult to spot that the single 1 was changed to 0.)
DECLARE @bin8 binary(8) = 0xAAAAAAAA01BB3A35 SELECT 'bin' AS the_variable, @bin8 AS value SELECT 'coalesce 1' AS op, COALESCE(@bin8, 1) AS bad_value SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8, 1)) AS bad_value SELECT 'coalesce 0x1 + convert ' AS op, CONVERT(binary(8), COALESCE(@bin8, 0x1)) AS good_value SELECT 'isnull 1' AS op, ISNULL(@bin8, 1) AS good_value SELECT 'isnull 0x1' AS op, ISNULL(@bin8, 0x1) AS good_value
Description — my understanding: The
COALESCE() seems to derive the type of the result from the type of the last processed argument. This way, the non-NULL
binary(8) was converted to
int, and that lead to the loss of upper 4 bytes. (See the 2nd and 3rd red
bad_value on the picture. The difference between the two cases is only in decimal/hexadecimal form of display.)
On the other hand, the
ISNULL() seems to preserve the type of the first argument, and converts the second value to that type. One should be careful to understand that
binary(8) is more like a series of bytes. The interpretation as one large integer is only the interpretation. Hence, the
0x1 as the default value does not expand as 8bytes integer and produces bad value.
My solution: So, I have fixed the bug using
ISNULL(MAX(row_version), 1). Is that correct?
This is not a bug. They’re documented to handle data type precedence differently.
COALESCE determines the data type of the output based on examining all of the arguments, while ISNULL has a more simplistic approach of inspecting only the first argument. (Both still need to contain values which are all compatible, meaning they are all possible to convert to the determined output type.)
Returns the data type of expression with the highest data type precedence.
ISNULL topic does not make this distinction in the same way, but implicitly states that the first expression determines the type:
replacement_value must be of a type that is implicitly convertible to the type of check_expression.
I have a similar example (and describe several other differences between
ISNULL) here. Basically:
DECLARE @int int, @datetime datetime; SELECT COALESCE(@int, CURRENT_TIMESTAMP);
— works because datetime has a higher precedence than the chosen output type, int
GO DECLARE @int int, @datetime datetime; SELECT ISNULL(@int, CURRENT_TIMESTAMP);
— fails because int, the first (and chosen) output type, has a lower precedence than datetime
Msg 257, Level 16, State 3
Implicit conversion from data type datetime to int is not allowed. Use the CONVERT function to run this query.
Let me start of by saying:
This is not a "bug".
COALESCE are not the same function, and operate quite differently.
ISNULL takes 2 parameters, and returns the second parameter if the first has a value
NULL. If the 2 parameters are different datatypes, then the dataype of the first datatype is returned (implicitly casting the second value).
COALESCE takes 2+ parameters, and returns the first non-
COALESCE is a short hand
CASE expression, and uses Data Type Precendence to determine the returned data type.
As a result, this is why
ISNULL returns what you expect, there is no implicit conversion in your query for the non-
COALESCE there is implicit conversion.
binary has the lowest precedence of all the data types, with a rank of 30 (at time of writing). The value
1 is an int, and has a precedence of 16; far higher than 30.
As a result
COALESCE(@bin8, 1) will implicitly convert the value
0xAAAAAAAA01BB3A35 to an
int and then return that value. You see this as
SELECT CONVERT(int,0xAAAAAAAA01BB3A35) returns
29047349, which your first "bad" value; it’s not "bad", it’s correct for what you wrote.
Then for the latter "bad" value, we can convert that
int value (
29047349) back to a
binary, which results in
0x0000000001BB3A35, which is, again the result you get.
TL;DR: checking return types of functions is important.
ISNULL returns the data type of first parameter and will implicitly convert the second if needed. For
COALESCE it uses Data Type Precedence, and will implicitly convert the returned value to the data type of with the highest precedence of all the possible return values.