Implementation details compact-string

==The encoding==
Data.CompactString encodes characters using a variable-length encoding: a single character is stored as 1, 2 or 3 bytes. The bytes are stored in a ByteString, so:

> newtype CompactString = CS { unCS :: ByteString }

The encoding used is not UTF-8, because UTF-8 is rather inefficient: it needs up to 4 bytes per character, while this encoding never needs more than 3. Unicode code points contain at most 21 bits, and these are encoded as:
{|
! Range !! Character !! Encoded
|-
| U+0000 - U+007F || 0zzzzzzz || 0zzzzzzz
|-
| U+0080 - U+3FFF || 00yyyyyyyzzzzzzz || 1yyyyyyy0zzzzzzz
|-
| U+4000 - U+10FFFF || 000xxxxxxxyyyyyyyzzzzzzz || 1xxxxxxx0yyyyyyy1zzzzzzz
|}
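The table can be read as a small encoding function. What follows is only a minimal sketch under stated assumptions: it works on plain lists of Word8 rather than on a ByteString, and the names encodeChar and encodedLength are hypothetical, chosen for illustration. The real implementation lives in Data.CompactString.Base and may well differ.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Hypothetical helper (not the library's API): encode one character
-- into the 1, 2 or 3 bytes listed in the table.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80   = [z]                          -- 0zzzzzzz
  | n < 0x4000 = [0x80 .|. y, z]              -- 1yyyyyyy 0zzzzzzz
  | otherwise  = [0x80 .|. x, y, 0x80 .|. z]  -- 1xxxxxxx 0yyyyyyy 1zzzzzzz
  where
    n = ord c
    z = septet 0   -- lowest 7 bits of the code point
    y = septet 7   -- next 7 bits
    x = septet 14  -- top 7 bits
    -- Extract the 7-bit group that starts at bit position s.
    septet :: Int -> Word8
    septet s = fromIntegral ((n `shiftR` s) .&. 0x7F)

-- | Hypothetical helper: number of bytes a character occupies.
encodedLength :: Char -> Int
encodedLength = length . encodeChar
```

Under these assumptions, encodeChar 'A' gives the single byte 0x41, and encodeChar '\x20AC' (the euro sign) gives the two bytes 0xC1 0x2C, one byte fewer than UTF-8 needs for the same character.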
The tagging bits might seem strange at first. They are chosen so that the character can be read both front-to-back and back-to-front. See Data.CompactString.Base for details of the encoding.

==More to come?==
''todo''