Earlier, I blogged on updating table with random numbers in SQL Server, later one of my friend asked me how guarantee that NEWID() can NOT generate duplicate value in SQL Server? The question seems to be a very valid and tried to find an answer for the same and here is the excerpt & explanation of my analysis and findings.
The NEWID function in Microsoft SQL Server Transact-SQL returns standard random version-4 UUIDs, while the NEWSEQUENTIALID function returns 128-bit identifiers similar to UUIDs which are committed to ascend in sequence until the next system reboot. A version 4 UUID is randomly generated. As in other UUIDs, 4 bits are used to indicate version 4, and 2 or 3 bits to indicate the variant (102 or 1102 for variants 1 and 2 respectively). Thus, for variant 1 (that is, most UUIDs) a random version-4 UUID will have 6 predetermined variant and version bits, leaving 122 bits for the randomly generated part, for a total of 2122, or 5.3×1036 (5.3 undecillion) possible version-4 variant-1 UUIDs.
The important excerpt from the screenshot is – “the probability to find a duplicate within 103 trillion version-4 UUIDs is one in a billion.”
Just a bit on algorithm used for version 4 as below. Algorithms for Creating a UUID from Truly Random or Pseudo-Random Numbers. The version 4 UUID is meant for generating UUIDs from truly-random or pseudo-random numbers.
The algorithm is as follows:
o Set the two most significant bits (bits 6 and 7) of the clock_seq_hi_and_reserved to zero and one, respectively.
o Set the four most significant bits (bits 12 through 15) of the time_hi_and_version field to the 4-bit version number from Section 4.1.3.
o Set all the other bits to randomly (or pseudo-randomly) chosen values.
Long story in short: NEWID() can generate duplicate value, however, the probability is one in a billion which is very negligible.
I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!