UUID
returns a universal unique identifier (hopefuly also unique if imported to another DB as well).
To quote from MySQL doc (emphasis mine):
A UUID is designed as a number that is globally unique in space and
time. Two calls to UUID() are expected to generate two different
values, even if these calls are performed on two separate computers
that are not connected to each other.
On the other hand a simply INT
primary id key (e.g. AUTO_INCREMENT) will return a unique integer for the specific DB and DB table, but which is not universally unique (so if imported to another DB chances are there will be primary key conflicts).
In terms of performance, there shouldn’t be any noticeable difference using auto-increment
over UUID
. Most posts (including some by the authors of this site), state as such. Of course UUID
may take a little more time (and space), but this is not a performance bottleneck for most (if not all) cases. Having a column as Primary Key
should make both choices equal wrt to performance. See references below:
- To
UUID
or not toUUID
? - Myths,
GUID
vsAutoincrement
- Performance:
UUID
vsauto-increment
in cakephp-mysql UUID
performance in MySQL?- Primary Keys:
ID
s versusGUID
s (coding horror)
(UUID
vs auto-increment
performance results, adapted from Myths, GUID
vs Autoincrement
)
UUID
pros / cons (adapted from Primary Keys: ID
s versus GUID
s)
GUID
Pros
- Unique across every table, every database, every server
- Allows easy merging of records from different databases
- Allows easy distribution of databases across multiple servers
- You can generate
ID
s anywhere, instead of having to roundtrip to the database- Most replication scenarios require
GUID
columns anyway
GUID
Cons
- It is a whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if
you’re not careful- Cumbersome to debug (
where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}'
)- The generated
GUID
s should be partially sequential for best performance (eg,newsequentialid()
on SQL 2005) and to enable use of
clustered indexes.
Note
I would read carefully the mentioned references and decide whether to use UUID
or not depending on my use case. That said, in many cases UUID
s would be indeed preferable. For example one can generate UUID
s without using/accessing the database at all, or even use UUID
s which have been pre-computed and/or stored somewhere else. Plus you can easily generalise/update your database schema and/or clustering scheme without having to worry about ID
s breaking and causing conflicts.
In terms of possible collisions, for example using v4 UUIDS (random), the probability to find a duplicate within 103 trillion version-4 UUIDs is one in a billion.