GUID
Saturday, August 16, 2008 1:47:06 PM
Is my math off? Let's consider two scenarios. In all cases, we'll assume a GUID takes 100 cycles to compute. It may be more. It may be less. But even if I'm off by a factor of 100, you only need 100 times as many machines or you'll need machines that are 100 times faster which will happen in about 10 years.
Before getting to the two scenarios, we have to consider what it takes to create a collision. Most would think that since a GUID is 128 bits, the odds of a collision would be 1/(2128). Not so. That's the probability of a collision on ONE GUID. The probability that you'll produce an identical GUID as another that is already out there goes up dramatically. So much so that in order to achieve odds of 50% of producing a collision, you actually need sqrt(2128) = 264 GUID's. The address space of GUID's has significantly diminished. To understand this, one must know about the birthday paradox.
Our objective is now to produce 264 GUID's.
Scenario #1
Year: 1995
CPU: Intel Pentium
Speed: 100 Mhz
100,000,000/100 = 1,000,000 GUID's per second.
Time required for 264 GUIDs
Seconds: 18446744073709.55
Minutes: 307445734561.82
Hours: 5124095576.03
Days: 213503982.33
Years: 584942.42
We're safe for 585,000 years. Sounds good, eh? Well, you only need 585,000 Pentium machines from 1995 to get a collision within a year. Doesn't sound so good to me though it's certainly nothing to fret over.
Scenario #2
Year: 2005
CPU: Intel Pentium 4
Speed: 3.8Ghz
3,800,000,000/100 = 38,000,000 GUID's per second.
Time required for 264 GUIDs
Seconds: 485440633518.67
Minutes: 8090677225.31
Hours: 134844620.42
Days: 5618525.85
Years: 15393.22
15,000 years! Still great, no?
In ten years, we went from requiring 585,000 years to 15,000 years. How did that happen? That's 39 times shorter than in 1995. Let's see what will happen in 2015. We know processor speed doubles every 18 months. And GUID creation is especially conducive to multicore processing. So in ten years, we'll get 7 doublings in speed. 4Ghz*27 = 512Ghz. Will we have 128 or 256 core computers in 2015? Probably not. But it won't be far off. Maybe quantum computing will take off? What we know is that we'll get this kind of speed sooner or later. So let's look at this future scenario.
Scenario #3
Year: 2015
CPU: Maas Biolabs
Speed: 512Ghz
512,000,000,000/100 = 5,120,000,000 GUID's per second.
Time required for 264 GUIDs
Seconds: 3602879701.90
Minutes: 60047995.03
Hours: 1000799.92
Days: 41700.00
Years: 114.25
114 years! Still sounds good? Uh, NO! It means you need 114 machines running for a year to get 50% chance of collision. Not only that, but once you cross that barrier, there is no going back. That's the thing to remember with GUID's. Once they are put in use, the odds of a collision get higher and higher. These next 10 years brought 135 times the progress of the last 10 as far as GUID production. It'll get even worse after that.
There's another issue to take into account. Right now, there aren't many reasons to share GUID's all over the world between different applications. If you use a GUID, it's usually within one application. Once distributed computing takes off, pieces of data and code objects will have to be identified. These can be created in a nanosecond. You'll have billions and billions of these being produced per second all over the world. Under such a scenario, you'll get a guaranteed collision in under a second.
The only two things keeping GUID's in use today is the fact that most of their uses are in different applications and that machines and their software aren't fully distributed on a massive scale yet.
Are my numbers off? Why do the 'best experts' in the field still say that GUID's are great and there's no need to check for collisions or even worry about it. The odds they mention also seem very weird to me. I don't understand where they get their numbers. I often see static odds. But collisions become more likely the more GUID's you have. The odds cannot possibly be static. So either I'm missing a big piece of the puzzle or these experts are seriously misinformed.


Unregistered user # Saturday, August 16, 2008 7:09:20 PM
Unregistered user # Saturday, August 16, 2008 8:27:36 PM
Unregistered user # Saturday, August 8, 2009 2:33:41 AM
Vorlath # Saturday, August 8, 2009 3:23:27 AM
The speed increase wasn't the point. It was an analogy as to how much the industry would start needing exponentially more ID's as time goes on. If someone decides to use GUID's to ID everything from albums to cooking recipes, we'll run out of GUID's pretty quickly.
Unregistered user # Monday, August 17, 2009 2:55:28 PM