Implementation details compact-string

==The encoding==

Data.CompactString uses a variable-length encoding: each character is stored as 1, 2 or 3 bytes. The bytes are stored in a ByteString, so:

> newtype CompactString = CS { unCS :: ByteString }

The encoding used is not UTF-8, because UTF-8 is less space-efficient: it can need up to 4 bytes per character, whereas this encoding never needs more than 3. Unicode code points contain at most 21 bits, and these are encoded as:

| Range | Character | Encoded |
|---|---|---|
| U+0000 - U+007F | 0zzzzzzz | 0zzzzzzz |
| U+0080 - U+3FFF | 00yyyyyy yzzzzzzz | 1yyyyyyy 0zzzzzzz |
| U+4000 - U+10FFFF | 000xxxxx xxyyyyyy yzzzzzzz | 1xxxxxxx 0yyyyyyy 1zzzzzzz |
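
The ranges in the table can be turned into a small sketch that computes how many bytes a character occupies and which 7-bit groups (the x, y and z bits) it is split into. The names `encodedLength`, `payloadGroups` and `utf8Length` are hypothetical helpers for illustration, not part of Data.CompactString. The sketch assumes that U+4000 itself needs three bytes, since 14 payload bits only reach U+3FFF, and it deliberately leaves out the marker bit at the top of each encoded byte.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)

-- Hypothetical helper: number of bytes a character occupies,
-- read off the ranges in the table.
encodedLength :: Char -> Int
encodedLength c
  | n < 0x80   = 1   -- fits in 7 bits
  | n < 0x4000 = 2   -- fits in 14 bits
  | otherwise  = 3   -- up to 21 bits
  where n = ord c

-- Hypothetical helper: the 7-bit payload groups, most significant first.
-- Only the x/y/z bits are produced; marker bits are not modelled.
payloadGroups :: Char -> [Int]
payloadGroups c =
  [ (n `shiftR` (7 * i)) .&. 0x7F | i <- [len - 1, len - 2 .. 0] ]
  where
    n   = ord c
    len = encodedLength c

-- For comparison: the number of bytes UTF-8 needs for the same character.
utf8Length :: Char -> Int
utf8Length c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where n = ord c
```

For example, `encodedLength '\8364'` (the euro sign, U+20AC) is 2 while `utf8Length '\8364'` is 3, and `payloadGroups '\8364'` is `[65,44]`. A character outside the Basic Multilingual Plane such as U+1F600 takes 3 bytes here against 4 in UTF-8.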