---
obj: concept
repo: https://github.com/ulid/spec
aliases: ["Universally Unique Lexicographically Sortable Identifier"]
---

# ULID (Universally Unique Lexicographically Sortable Identifier)

UUID can be suboptimal for many use-cases because:

- It isn't the most character efficient way of encoding 128 bits of randomness
- UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
- UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
- UUID v4 provides no other information than randomness which can cause fragmentation in many data structures

Instead, herein is proposed ULID:

```javascript
ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV
```

- 128-bit compatibility with UUID
- 1.21e+24 unique ULIDs per millisecond
- Lexicographically sortable!
- Canonically encoded as a 26 character string, as opposed to the 36 character UUID
- Uses Crockford's base32 for better efficiency and readability (5 bits per character)
- Case insensitive
- No special characters (URL safe)
- Monotonic sort order (correctly detects and handles the same millisecond)

## Specification

Below is the current specification of ULID as implemented in [ulid/javascript](https://github.com/ulid/javascript).

*Note: the binary format has not been implemented in JavaScript as of yet.*

```
 01AN4Z07BY      79KA1307SR9X4MV3

|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits
```

### Components

**Timestamp**

- 48 bit integer
- UNIX-time in milliseconds
- Won't run out of space 'til the year 10889 AD.

**Randomness**

- 80 bits
- Cryptographically secure source of randomness, if possible

### Sorting

The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used.
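
The ordering rule above can be sketched as a plain code-unit comparison. This is a minimal illustration, not part of any ULID library: `compareUlids` is a hypothetical helper, and upper-casing the input before comparing is an assumption made so that mixed-case strings do not disturb the ASCII order.

```javascript
// Hypothetical helper: compare two ULID strings by ASCII code units.
// Upper-casing first is an assumption, so that mixed-case input
// sorts the same way as canonical upper-case input.
function compareUlids(a, b) {
  const x = a.toUpperCase();
  const y = b.toUpperCase();
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

// Example ULIDs taken from elsewhere in this document.
const ids = [
  "01BX5ZZKBKACTAV9WEVGEMMVRZ",
  "01AN4Z07BY79KA1307SR9X4MV3",
  "01ARZ3NDEKTSV4RRFFQ69G5FAV",
];

// Copy before sorting so the input array is left untouched.
const sorted = [...ids].sort(compareUlids);
// sorted: 01AN4Z..., 01ARZ3..., 01BX5Z...
```

The comparison deliberately uses `<` and `>` rather than `localeCompare`, since locale-aware collation is not guaranteed to match ASCII order.
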
Within the same millisecond, sort order is not guaranteed.

### Canonical String Representation

```
ttttttttttrrrrrrrrrrrrrrrr

where
t is Timestamp (10 characters)
r is Randomness (16 characters)
```

#### Encoding

Crockford's Base32 is used as shown. This alphabet excludes the letters I, L, O, and U to avoid confusion and abuse.

```
0123456789ABCDEFGHJKMNPQRSTVWXYZ
```

### Monotonicity

When generating a ULID within the same millisecond, we can provide some guarantees regarding sort order. Namely, if the same millisecond is detected, the `random` component is incremented by 1 bit in the least significant bit position (with carrying). For example:

```javascript
import { monotonicFactory } from 'ulid'

const ulid = monotonicFactory()

// Assume that these calls occur within the same millisecond
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRZ
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS0
```

If, in the extremely unlikely event, you manage to generate more than $2^{80}$ ULIDs within the same millisecond, or cause the random component to overflow with fewer, the generation will fail.

```javascript
import { monotonicFactory } from 'ulid'

const ulid = monotonicFactory()

// Assume that these calls occur within the same millisecond
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRY
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRZ
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS0
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS1
...
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZX
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZY
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZZ
ulid() // throw new Error()!
```

#### Overflow Errors when Parsing Base32 Strings

Technically, a 26-character Base32 encoded string can contain 130 bits of information, whereas a ULID must only contain 128 bits. Therefore, the largest valid ULID encoded in Base32 is `7ZZZZZZZZZZZZZZZZZZZZZZZZZ`, which corresponds to an epoch time of `281474976710655` or $2^{48}-1$.

Any attempt to decode or encode a ULID larger than this should be rejected by all implementations, to prevent overflow bugs.
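
The rejection rule above might be sketched as follows. This is an illustration under stated assumptions rather than the reference implementation: `parseUlidTime`, `ENCODING`, and `MAX_TIME` are hypothetical names, input is upper-cased before lookup, and any character outside the alphabet is simply rejected.

```javascript
// Crockford's Base32 alphabet, as listed in the Encoding section.
const ENCODING = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";
// Largest timestamp that fits in 48 bits: 281474976710655.
const MAX_TIME = 2 ** 48 - 1;

// Hypothetical helper: validate a ULID string and return its timestamp
// in milliseconds. Throws if the string is malformed or overflows 128 bits.
function parseUlidTime(id) {
  if (typeof id !== "string" || id.length !== 26) {
    throw new TypeError("a ULID must be a 26-character string");
  }
  const upper = id.toUpperCase(); // assumption: accept lower-case input
  let time = 0;
  for (let i = 0; i < 26; i++) {
    const value = ENCODING.indexOf(upper[i]);
    if (value === -1) {
      throw new RangeError(`invalid character at index ${i}`);
    }
    // Only the first 10 characters carry the timestamp (5 bits each).
    // 50 bits fit safely in a JavaScript Number.
    if (i < 10) time = time * 32 + value;
  }
  // The 10 time characters can hold 50 bits; anything above 48 is overflow.
  if (time > MAX_TIME) {
    throw new RangeError("ULID exceeds 128 bits");
  }
  return time;
}

// The largest valid ULID decodes to the maximum timestamp.
const largest = parseUlidTime("7" + "Z".repeat(25));
```

Because only the leading character can push the value past 48 bits, checking that the first character is at most `7` would be an equivalent test; the numeric comparison is used here to mirror the wording of the rule.
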

### Binary Layout and Byte Order

The components are encoded as 16 octets. Each component is encoded with the Most Significant Byte first (network byte order).

```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     32_bit_uint_time_high                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |      16_bit_uint_random       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
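
As a rough sketch of the layout above, the 16 octets could be assembled like this. `packUlid` is a hypothetical helper and not part of ulid/javascript (which, as noted earlier, does not implement the binary format); the caller is assumed to supply the 10 random bytes.

```javascript
// Hypothetical helper: pack a 48-bit millisecond timestamp and 10 bytes of
// randomness into 16 octets, most significant byte first.
function packUlid(timeMs, randomBytes) {
  if (!Number.isInteger(timeMs) || timeMs < 0 || timeMs > 2 ** 48 - 1) {
    throw new RangeError("timestamp must be an integer that fits in 48 bits");
  }
  if (randomBytes.length !== 10) {
    throw new RangeError("randomness must be exactly 10 bytes (80 bits)");
  }
  const bytes = new Uint8Array(16);
  const view = new DataView(bytes.buffer);
  // DataView setters write big-endian when the littleEndian flag is omitted.
  view.setUint32(0, Math.floor(timeMs / 2 ** 16)); // 32_bit_uint_time_high
  view.setUint16(4, timeMs % 2 ** 16);             // 16_bit_uint_time_low
  bytes.set(randomBytes, 6);                       // remaining 80 random bits
  return bytes;
}

// Example with a fixed timestamp and fixed "random" bytes for readability.
const packed = packUlid(
  0x010203040506,
  Uint8Array.from([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
);
// packed: 1, 2, 3, 4, 5, 6, 10, 11, ..., 19
```

The timestamp is split with division and remainder rather than bitwise operators, because JavaScript's bitwise operators truncate their operands to 32 bits. In environments that provide the Web Crypto API, the random bytes could come from `crypto.getRandomValues(new Uint8Array(10))`.
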