Unicode Surrogate Pair Characters, including Emojis

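JADE supports Unicode characters whose values are greater than 0xFFFF, which include emojis, mathematical symbols, and others. Because such a value does not fit in a single 16‑bit character, it is stored as a surrogate pair of two 16‑bit characters.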

The term emoji in JADE product information includes any type of surrogate pair character. Each such character is stored as a pair of 16‑bit characters, as follows.

The value 0x10000 is subtracted from the original Unicode value, and the resulting 20‑bit value is split into two 10‑bit halves. Each half is stored in the lower part of one of the two 16‑bit characters, with 0xD800 added to the first 16‑bit character and 0xDC00 added to the second. For example, the Unicode value 0x1F783 that represents an emoji is stored as 0xD83D and 0xDF83.
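
The split into two 16‑bit characters can be checked by hand. The following is a minimal sketch in Python rather than JADE, for illustration only; the function names to_surrogate_pair and from_surrogate_pair are hypothetical and are not part of any JADE interface. It assumes the standard UTF‑16 rule, in which 0x10000 is subtracted from the value before the remaining 20 bits are divided between the two characters.

```python
def to_surrogate_pair(code_point):
    """Split a Unicode value above 0xFFFF into its two 16-bit surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("value does not need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value to be split
    high = 0xD800 + (offset >> 10)       # upper 10 bits go in the first character
    low = 0xDC00 + (offset & 0x3FF)      # lower 10 bits go in the second character
    return high, low


def from_surrogate_pair(high, low):
    """Recombine two surrogates into the original Unicode value."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F783)
print(f"0x{high:04X} 0x{low:04X}")       # prints 0xD83D 0xDF83
```

Recombining the pair with from_surrogate_pair(0xD83D, 0xDF83) returns 0x1F783, which is a quick way to confirm that a pair of 16‑bit characters matches the value you expect.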

JADE handles surrogate pair characters, in particular when converting between the String and StringUtf8 primitive types. As a result, emoji characters can be included in text in a Unicode JADE system, with some limitations.

The following code fragment is an example of Unicode surrogate pair character handling.

strUtf8 := #[f0 9f 8e 85].Binary.StringUtf8; // Father Christmas emoji
str := strUtf8.String;
write str;
write strUtf8;

A Unicode UTF‑8 encoding table and emoji characters can be found at https://www.utf8-chartable.de/unicode-utf8-table.pl?start=127872.
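
To see how the four UTF‑8 bytes in the code fragment relate to the surrogate pair held in a 16‑bit Unicode string, the conversion can be reproduced outside JADE. The following is a sketch in Python using only its standard codecs; it illustrates the encodings themselves and makes no claim about how JADE performs the conversion internally.

```python
# The same four bytes used in the JADE fragment (Father Christmas emoji).
utf8_bytes = bytes.fromhex("f09f8e85")

# Decode the UTF-8 bytes to a single Unicode character.
text = utf8_bytes.decode("utf-8")
code_point = ord(text)

# Re-encode as big-endian UTF-16 to expose the two 16-bit surrogate characters.
utf16_units = text.encode("utf-16-be")

print(f"U+{code_point:04X}")       # prints U+1F385
print(utf16_units.hex(" ", 2))     # prints d83c df85
```

The result shows that the one character occupies four bytes in UTF‑8 and two 16‑bit characters (0xD83C and 0xDF85) in UTF‑16, so a length measured in 16‑bit characters counts this emoji as two.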

The following is a list of notes about surrogate pair emoji usage in JADE.

2020.0.02 and higher