What’s the difference between an “encoding,” a “character set,” and a “code page”?

A ‘character set’ is just what it says: a properly specified list of distinct characters. An ‘encoding’ is a mapping between a character set (typically Unicode today) and a (usually byte-based) technical representation of those characters. UTF-8 is an encoding, not a character set: it is one encoding of the Unicode character set(*). The confusion …
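A minimal Python sketch of that distinction, assuming the sample character U+00E9 (chosen here purely for illustration): the code point identifies the character within the Unicode character set, while each encoding maps that same character to a different byte sequence.

```python
# One abstract character from the Unicode character set (U+00E9, "e" with acute).
ch = "\u00e9"
code_point = ord(ch)  # the character's number in the set, independent of any encoding

# Three different encodings map the same character to different bytes.
utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")
latin1 = ch.encode("latin-1")

print(f"U+{code_point:04X}", utf8.hex(), utf16.hex(), latin1.hex())
# prints: U+00E9 c3a9 00e9 e9

# Decoding each byte sequence with its own encoding recovers the same character.
assert utf8.decode("utf-8") == ch
assert utf16.decode("utf-16-be") == ch
assert latin1.decode("latin-1") == ch
```

The bytes only mean something once you know which encoding produced them; decoding `utf8` as Latin-1, for example, would yield two unrelated characters instead of one.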


Is a base64 encoded string unique?

Two years late, but here we go: The short answer is yes, distinct binary/hex values will always encode to distinct base64 strings. BUT, more than one base64 string may decode to the same binary/hex value. This is because byte boundaries do not line up with base64 ‘digits’. A single hex byte is represented by 8 bits …
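Both halves of that answer can be checked with a short Python sketch. The sample bytes are illustrative, and the second half relies on the default, lenient behaviour of `base64.b64decode` in CPython, which discards leftover bits; a stricter decoder may reject the non-canonical string instead.

```python
import base64

# Encoding is one-to-one: different inputs give different base64 strings.
enc_a = base64.b64encode(b"A")  # b"QQ=="
enc_b = base64.b64encode(b"B")  # b"Qg=="
assert enc_a != enc_b

# One byte is 8 bits, but two base64 digits carry 12 bits, so the low
# 4 bits of the second digit are unused. "QQ==" leaves them as zero;
# "QR==" sets one of them. A lenient decoder ignores those bits.
canonical = base64.b64decode("QQ==")
variant = base64.b64decode("QR==")

print(canonical, variant)
assert canonical == variant == b"A"
```

If you need a unique string per value, compare the decoded bytes, or re-encode them and compare the canonical output, rather than comparing base64 strings received from elsewhere.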
