We have presented the design and implementation of the multilingualization (``m17n'') effort for the Squeak programming system.
The newly added character and string representations hold the extended characters. An object in the new representation is implicitly converted to and from the 8-bit character and string representation where possible. The character set for the default 8-bit characters was changed to Latin-1, and the codes for the extended character sets roughly follow the Unicode definition, with encoding tags attached.
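As an illustration of the tagged representation summarized above, the following Python sketch packs an encoding tag and a code point into one integer and falls back to a plain byte string when every character fits in 8 bits. It is not the Squeak implementation: the names `TaggedChar` and `compact`, the 22-bit code field, and the use of tag 0 for Latin-1 are all assumptions made for the example.

```python
from dataclasses import dataclass

CODE_BITS = 22                     # assumed width of the code-point field
CODE_MASK = (1 << CODE_BITS) - 1
LATIN1_TAG = 0                     # assumed tag for the default 8-bit range


@dataclass(frozen=True)
class TaggedChar:
    """Hypothetical extended character: an encoding tag plus a code point."""
    tag: int
    code: int

    def packed(self) -> int:
        # Store the tag in the bits above the code-point field.
        return (self.tag << CODE_BITS) | (self.code & CODE_MASK)

    @staticmethod
    def unpack(value: int) -> "TaggedChar":
        return TaggedChar(value >> CODE_BITS, value & CODE_MASK)

    def fits_in_byte(self) -> bool:
        # Only untagged (Latin-1) characters below 256 fit the 8-bit form.
        return self.tag == LATIN1_TAG and self.code < 256


def compact(chars):
    """Return bytes when all characters fit in 8 bits, else packed integers."""
    if all(c.fits_in_byte() for c in chars):
        return bytes(c.code for c in chars)
    return [c.packed() for c in chars]
```

In this sketch a string keeps the compact 8-bit form until a character with a non-default tag is stored in it, which mirrors the "convert if possible" rule described above.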
The keyboard input is interpreted by the
. If
multiple octets need to be combined to make a character, the sensor
generates the combined multi-octet character accordingly.
The extended text composition routine handles the different composition layout rules. The encoding tag of a character selects which implementation of the scanner method is called. To handle the case where a sequence of characters in a text maps to a single visual form, a separate visual presentation text is created.
The glyphs for the entire code space are broken down into separate
fonts. Each such font covers a script in the Unicode definition. A font
is represented as a
object, and the fonts that
share the same height and family are grouped as an instance of a class
called
. Again, the actual font in a
is selected based on the encoding tag of a character.
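The tag-based font selection can be sketched as follows. This is a minimal Python illustration, not the Squeak classes themselves: the names `Font` and `FontSet`, the integer tags, and the fallback to tag 0 when no font covers a tag are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Font:
    """Hypothetical font covering the script identified by one encoding tag."""
    name: str
    tag: int


@dataclass
class FontSet:
    """Hypothetical group of fonts that share a family and a height."""
    family: str
    height: int
    fonts: Dict[int, Font] = field(default_factory=dict)

    def add(self, font: Font) -> None:
        # Index each member font by the encoding tag it covers.
        self.fonts[font.tag] = font

    def font_for(self, tag: int, fallback_tag: int = 0) -> Optional[Font]:
        # Pick the font by tag; fall back to the default font (an assumption).
        return self.fonts.get(tag, self.fonts.get(fallback_tag))
```

A renderer built this way would read the tag of each character and ask the set for the matching font, so text that mixes scripts is drawn with several fonts of the same height and family.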
The SqueakToys system is now capable of handling the extended character sets. Because the original language-switching mechanism does not provide all the translations that end users need, we modified the menu and string morph creation sites so that their arguments are translated, and we provide about 1500 word-mapping rules.
The resulting software has been used by many users. We believe that we are on the right track toward our goal, which is to provide a collaborative environment for everyone over the network.