Unicode Surrogate Pair Characters, including Emojis
Jade supports Unicode characters whose code points are greater than 0xFFFF, which include emojis, mathematical symbols, and others.
The term emoji in the Jade Platform product information refers to any type of surrogate pair character. Such characters:
- Are stored as surrogate pair characters.
- Require two 16‑bit values to store a single Unicode value. The two values are stored as two encoded characters.
The value of the original Unicode character, less 0x10000, is split into two 10‑bit halves. Each half is stored in the lower part of one of the two characters, with 0xD800 added to the first 16‑bit character (the upper half) and 0xDC00 added to the second (the lower half). For example, the Unicode value 0x1F383 that represents an emoji is stored as 0xD83C and 0xDF83.
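The arithmetic described above can be sketched as follows. This is a minimal illustration in Python rather than Jade, and it assumes standard UTF‑16 encoding, in which 0x10000 is subtracted from the code point before it is split into two 10‑bit halves. The function names to_surrogate_pair and from_surrogate_pair are hypothetical helpers introduced here for illustration.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) 16-bit surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000          # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)         # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # lower 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+1F385 is the Father Christmas emoji.
high, low = to_surrogate_pair(0x1F385)
print(hex(high), hex(low))                  # 0xd83c 0xdf85
print(hex(from_surrogate_pair(high, low)))  # 0x1f385
```

The range checks reject values that fit in a single 16‑bit character, because those values are stored directly and never use a surrogate pair.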
Jade handles surrogate pair characters; in particular, in conversions between the String and StringUtf8 primitive types. As a result, emoji characters can be included in text in a Unicode Jade system, with some limitations.
The following code fragment is an example of Unicode surrogate pair character handling.
strUtf8 := #[f0 9f 8e 85].Binary.StringUtf8;    // Father Christmas emoji
str := strUtf8.String;
write str;
write strUtf8;
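For comparison, the same conversion can be sketched outside Jade. The following Python fragment is an illustration only, not Jade code: it decodes the UTF‑8 bytes used in the fragment above and lists the 16‑bit values that a UTF‑16 string holds for the same character. The helper name utf16_units is hypothetical.

```python
import struct


def utf16_units(text):
    """Return the 16-bit UTF-16 code units that represent text."""
    raw = text.encode("utf-16-le")               # two bytes per code unit
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


utf8_bytes = bytes.fromhex("f09f8e85")           # UTF-8 for U+1F385
text = utf8_bytes.decode("utf-8")                # one code point

print(hex(ord(text)))                            # 0x1f385
print([hex(u) for u in utf16_units(text)])       # ['0xd83c', '0xdf85']
print(text.encode("utf-8") == utf8_bytes)        # True
```

The four UTF‑8 bytes and the two 16‑bit values are different encodings of the same single Unicode value, which is why the conversion can be reversed without loss.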
A Unicode UTF‑8 encoding table and emoji characters can be found at https://www.utf8-chartable.de/unicode-utf8-table.pl?start=127872.
The following is a list of notes about surrogate pair emoji usage in Jade.
- Emojis can be used in String values only in a Unicode Jade system, because they cannot be represented in ANSI.
- Emojis can be represented using StringUtf8 in an ANSI Jade system, but they cannot be converted to a String value because there is no ANSI representation.
- Emojis can be included in any text displayed or printed, except for the rich text restriction later in this list.
- Emojis can be copied and pasted to and from TextBox controls or pasted into the editor pane.
  The emoji selection window (displayed using the Windows key + period (.), or dot, character key combination) can be used to paste a selection.
- Emoji characters can be converted to and from a StringUtf8 primitive type, which means that they can be included in web text data.
- An emoji character cannot be stored in a Character property, because it requires two 16‑bit characters.
- Emojis can be made up of multiple surrogate pair values that are overlaid on each other to produce the final displayed representation. Pressing an arrow key can therefore leave the cursor positioned in the same place, and multiple presses may be required to step over each part of the emoji.
- The Jade length of a string that includes emojis is the number of 16‑bit characters, not the number of displayed characters.
- Locate emoji characters in text by checking whether a character is in the range 0xD800 through 0xDBFF and the character that follows it is in the range 0xDC00 through 0xDFFF.
- The JadeRichText control, which is a Microsoft control, does not support emoji characters.
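Several of the notes in this list (string length, locating surrogate pairs, and emojis built from multiple parts) can be illustrated with a short sketch. It is written in Python rather than Jade and works on a list of 16‑bit UTF‑16 values, which is an assumption about how a Unicode Jade String is laid out, based on the description in this topic. The names utf16_units and find_surrogate_pairs are hypothetical helpers.

```python
import struct


def utf16_units(text):
    """Return the 16-bit UTF-16 code units that represent text."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


def find_surrogate_pairs(units):
    """Return the index of each high surrogate that is followed by a low one."""
    positions = []
    i = 0
    while i < len(units) - 1:
        if 0xD800 <= units[i] <= 0xDBFF and 0xDC00 <= units[i + 1] <= 0xDFFF:
            positions.append(i)
            i += 2                      # skip both halves of the pair
        else:
            i += 1
    return positions


# "Hi", a space, then a waving hand (U+1F44B) with a skin tone modifier (U+1F3FD).
text = "Hi \U0001F44B\U0001F3FD"
units = utf16_units(text)

print(len(units))                       # 7 sixteen-bit values
print(find_surrogate_pairs(units))      # [3, 5]
```

The waving hand and its skin tone modifier are usually drawn as one emoji, yet they occupy two surrogate pairs and four 16‑bit values. This is why the length exceeds the number of displayed characters, and why the cursor can need more than one arrow key press to move past a single emoji.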
2020.0.02 and higher