Unicode Surrogate Pair Character Support (PAR 68066)
Jade now supports Unicode characters whose code points are greater than 0xFFFF, which include emojis, mathematical symbols, and others.
The term emoji in the Jade Platform product information includes any type of surrogate pair character. Such characters:
- Are stored as surrogate pair characters.
- Require the use of two 16‑bit values to store the value. These Unicode values are stored as two encoded characters.
The value of the original Unicode character, less 0x10000, is split into two 10‑bit halves, each stored in the lower part of one of the two characters, with 0xD800 added to the first 16‑bit character and 0xDC00 added to the second. For example, the Unicode value 0x1F383 that represents an emoji is stored as 0xD83C and 0xDF83.
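The arithmetic behind the 0xD83C and 0xDF83 pair can be sketched as follows. This is a minimal illustration in Python of the standard UTF‑16 surrogate calculation, not Jade code; the function names to_surrogate_pair and from_surrogate_pair are hypothetical and are not part of any Jade API.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000       # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)      # upper 10 bits go in the first character
    low = 0xDC00 + (offset & 0x3FF)     # lower 10 bits go in the second character
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F383)
print(f"0x{high:04X} 0x{low:04X}")      # prints: 0xD83C 0xDF83
```

Recombining the two values with from_surrogate_pair(0xD83C, 0xDF83) gives back 0x1F383.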
Jade now handles surrogate pair characters; in particular, in conversions between the String and StringUtf8 primitive types. As a result, emoji characters can be included in text in a Jade system, with some limitations.
The following code fragment is an example of Unicode surrogate pair character handling.
strUtf8 := #[f0 9f 8e 85].Binary.StringUtf8; // Father Christmas emoji
str := strUtf8.String;
write str;
write strUtf8;
A Unicode UTF‑8 encoding table and emoji characters can be found at https://www.utf8-chartable.de/unicode-utf8-table.pl?start=127872.
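To see what the four UTF‑8 bytes in the code fragment correspond to, the following Python sketch (an illustration outside Jade, using only the Python standard library) decodes them to a code point and lists the 16‑bit values that a UTF‑16 string would hold for the same character.

```python
# The four UTF-8 bytes used in the Jade code fragment.
utf8_bytes = bytes.fromhex("f0 9f 8e 85")

# Decode to a single Unicode character and look up its code point.
text = utf8_bytes.decode("utf-8")
code_point = ord(text)

# Re-encode as UTF-16 and split the result into 16-bit values.
utf16_bytes = text.encode("utf-16-be")
units = [int.from_bytes(utf16_bytes[i:i + 2], "big")
         for i in range(0, len(utf16_bytes), 2)]

print(f"U+{code_point:04X}")                # prints: U+1F385
print([f"0x{unit:04X}" for unit in units])  # prints: ['0xD83C', '0xDF85']
```

The single UTF‑8 character becomes two 16‑bit values, which is the surrogate pair that the conversion between StringUtf8 and String has to handle.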
The following is a list of notes about surrogate pair emoji usage in Jade.
- Emojis can be used only in a Unicode Jade system because they cannot be represented in ANSI.
- Emojis can be represented using StringUtf8 in an ANSI Jade system, but they cannot be converted to a String value as there is no ANSI representation.
- Emojis can be included in any text displayed or printed, except for the rich text restriction later in this list.
- Emojis can be copied and pasted to and from TextBox controls. The emoji selection window (displayed using the Windows key + period (.), or dot, character key combination) can be used to paste a selection.
- Emoji characters can be converted to and from a StringUtf8 primitive type, which means that they can be included in web text data.
- An emoji character cannot be stored in a Character property because it requires two characters, so it will not fit.
- Because an emoji occupies two 16‑bit characters, using an arrow key can leave the cursor positioned in the same place, and multiple presses may be required to step over each part of the emoji.
- Emojis can be made up of multiple surrogate pair values that are overlaid on each other to get the final displayed representation.
- The Jade length of a string that includes emojis is the number of 16‑bit characters, not the number of displayed characters.
- Locate emoji characters in text by checking whether the first character is in the range 0xD800 through 0xDBFF and the second character is in the range 0xDC00 through 0xDFFF.
- The JadeRichText control, which is a Microsoft control, does not support emoji characters.
- Use of emoji characters in the Jade editor is not yet recommended. They can be inserted in methods and will display correctly. However, there are problems because the editor counts each surrogate pair as one displayed character, while Jade counts them as two. As a result, the cursor position that is set programmatically (for example, when the compiler shows the text that is in error or the debugger shows the current line) may not be correct.
  In addition, the editor pane does not support selecting an emoji from the emoji selection window (that is, by pressing the Windows and period keys).
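The notes above on string length and on locating emoji characters can be sketched as follows. This is an illustration in Python rather than Jade: the helper names utf16_units and find_surrogate_pairs are hypothetical, indexes are 0‑based as in Python, and the check uses the standard UTF‑16 surrogate ranges (0xD800 through 0xDBFF for the first value, 0xDC00 through 0xDFFF for the second).

```python
def utf16_units(text):
    """Return the 16-bit values that a UTF-16 string would hold for text."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def find_surrogate_pairs(units):
    """Return the 0-based indexes at which a surrogate pair starts."""
    starts = []
    i = 0
    while i < len(units) - 1:
        is_high = 0xD800 <= units[i] <= 0xDBFF
        is_low = 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and is_low:
            starts.append(i)
            i += 2          # step over both halves of the pair
        else:
            i += 1
    return starts


sample = "Hi \U0001F385!"               # text containing one emoji
units = utf16_units(sample)

print(len(units))                       # 16-bit characters: prints 6
print(len(sample))                      # Unicode code points: prints 5
print(find_surrogate_pairs(units))      # prints: [3]
```

The count of 16‑bit values (6) corresponds to the length described in the notes, while the code point count (5) is closer to what is displayed. An emoji built from several overlaid values would still show as one displayed character, so neither count is a reliable measure of displayed width.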