Windows Phone 7 Developer Launch - Learn More
kick it on DotNetKicks.com   Shout it  

Sequential Guid Algorithm – Implementing Combs in .NET

Today Ayende released his sequential guid answer, which is something I had been working on as well. I’ll be honest and say the current implementation takes from his recent posts (previous to today’s). If it weren’t for him, I would not have known about the Sql Server sorting idiosyncrasies – at least not until much later when we were experiencing serious index fragmentation.

You can see my implementation below. Realize that this implementation was written only with Sql Server in mind, so I don’t use any other byte order than will work with Sql Server as his does. Also, I am not using any kind of incrementing counter as it is not worth the effort in my case. Ayende’s context is that of a single system – he isn’t looking for something that would be sequential globally. His guids will always be sequential, even under high load (more than 1,000/second). However, they are not guaranteed to be sequential across system boundries. Neither is mine, but since my implementation will is not intended to guarantee sequentialness (yeah I made word up) for guids generated for the same value of DateTime.UtcNow.Ticks there is no reason (at the moment) for me to add the overhead.

I made the choice because my Comb type is meant to be used at the client and sent to the server for eventually consistent scenarios. And currently the load on our servers doesn’t justify the effort of trying to guarantee a sequence across systems. I thought about using something like the MAC address as suggested by Ayende, but that still doesn’t guarantee that guids generated on different system will be sequential, just grouped. So for the time I am willing to allow a small amount of fragmentation (20% by my own tests) in high-throughput scenarios because those will be rare for the system I am developing.  

Also, because my Comb type is meant to be used by the client and passed to the service, I wanted to be able to verify that the value sent by the client was actually generated by Comb, I set the first “nibble” (half of the byte) of the 6th byte to 1100. This borrows from the method used by Microsoft, as mentioned in Jimmy Nilsson’s original article, of setting the first nibble of the 7th byte (the one I am copying to the 6th byte) to 0100. This could end in false positives if someone were to use some other method besides Guid.NewGuid() to gernate their guids, but since it won’t happen in the system this was developed for I don’t need the overhead of additional checks, yet.

Update 6/15/2011

While researching some details for providing a Java implementation of my Comb class, I read on Wikipedia that the bytes I am referring to in the previous paragraph are actually standard bytes used by all Guid/Uuid implementations to set the version. This indicates which version was used to generate the value. So on one hand, I am using these bits as they were intended. However, on the other hand there is the possibility of running into version number conflicts down the road if there is a version which clashes with mine which will result in false positives. At the moment the highest version number is 5, 0100 refers to version 4 and my version is declared as 12. So I‘m safe for at least 7 more versions :).

I am curious however to hear from Ayende about why his is reversing his byte arrays. Let me know what you think of my version vs his. (I consider him to be beyond me in skill so I’m not asking for whose is better, I just would like some feedback in comparison).

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
 
namespace Developmentalmadness.Core.Types
{
    /// <summary>
    /// A class for generating globally sequential Guids
    /// </summary>
    public class Comb
    {
        /// <summary>
        /// Creates a new sequential guid (aka comb) <see cref="http://www.informit.com/articles/article.aspx?p=25862&seqNum=7"/>.
        /// </summary>
        /// <remarks>A comb provides the benefits of a standard Guid w/o the database performance problems.</remarks>
        /// <returns>A new sequential guid (comb).</returns>
        public static Guid NewComb()
        {
            byte[] uid = Guid.NewGuid().ToByteArray();
            byte[] binDate = BitConverter.GetBytes(DateTime.UtcNow.Ticks); // use UTC now to prevent conflicts w/ date time savings
 
            // create comb in SQL Server sort order
            byte[] comb = new byte[uid.Length];
 
            // the first 7 bytes are random - if two combs
            // are generated at the same point in time
            // they are not guaranteed to be sequential.
            // But for every DateTime.Tick there are
            // 72,057,594,037,927,935 unique possibilities so
            // there shouldn't be any collisions
            comb[3] = uid[0];
            comb[2] = uid[1];
            comb[1] = uid[2];
            comb[0] = uid[3];
            comb[5] = uid[4];
            comb[4] = uid[5];
            comb[7] = uid[6];
 
            // set the first 'nibble of the 7th byte to '1100' so 
            // later we can validate it was generated by us
            comb[6] = (byte)(0xc0 | (0xf & uid[7]));
 
            // the last 8 bytes are sequential,
            // these will reduce index fragmentation
            // to a degree as long as there are not a large
            // number of Combs generated per millisecond
            comb[9] = binDate[0];
            comb[8] = binDate[1];
            comb[15] = binDate[2];
            comb[14] = binDate[3];
            comb[13] = binDate[4];
            comb[12] = binDate[5];
            comb[11] = binDate[6];
            comb[10] = binDate[7];
 
            return new Guid(comb);
        }
 
        /// <summary>
        /// Validates if comb was generated by this class
        /// </summary>
        /// <remarks>
        /// Guids generated by Guid.NewGuid() have a value of 
        /// 0100 for the first 4 bits of the 7th byte. Ours will
        /// have a value of 1100 for the 6th byte. We're checking that here.
        /// 
        /// We could do additional validation by verifying that
        /// the value of a new Guid is greater than the
        /// one being validated (or that the last 6 bytes
        /// resolve to a valid DateTime), but this should
        /// be enough for now.
        /// </remarks>
        public static bool IsComb(Guid value)
        {
            // get the 7th byte
            byte b = value.ToByteArray()[6];
 
            // make sure the first 'nibble' == 1100
            return (0xc0 & b) == 0xc0;
        }
 
        /// <summary>
        /// Validates Guid to determine the supplied
        /// value was generated by Comb.NewComb. If
        /// invalid an ArgumentException is thrown.
        /// </summary>
        /// <param name="value"></param>
        public static void ValidateComb(Guid value)
        {
            if (!Comb.IsComb(value))
                throw new ArgumentException("The supplied Id value was not generated by Comb.NewComb.");
        }
    }
}

 

tags: , , , ,

kick it on DotNetKicks.com   Shout it  

Feedback

# Sequential Guid Algorithm

Gravatar Sequential Guid Algorithm 10/13/2010 2:40 PM | DevelopMENTAL Madness

Comments have been closed on this topic.
 

 

Copyright © Mark J. Miller