17.5. Alien alphabets

As stated in Section 17.1, Lojban's goal of cultural neutrality demands a standard set of lerfu words for the lerfu of as many other writing systems as possible. When we meet these lerfu in written text (particularly, though not exclusively, mathematical text), we need a standard Lojbanic way to pronounce them.

Hundreds of alphabets and other writing systems are in use around the world, and a single system that can express all of them is probably an unachievable goal. If perfection is not demanded, however, a usable system can be created from the raw material which Lojban provides.

One possibility would be to use the lerfu word associated with the language itself, Lojbanized and with bu added. Indeed, an isolated Greek alpha in running Lojban text is probably most easily handled by calling it .alfas. bu. Here the Greek lerfu word has been made into a Lojbanized name by adding s and then into a Lojban lerfu word by adding bu. Note that the pause after .alfas. is still needed.

Likewise, the easiest way to handle the Latin letters h, q, and w, which are not used in Lojban, is with a consonant lerfu word with bu attached. The following assignments have been made:

.y'y.bu    h
ky.bu      q
vy.bu      w

As an example, the English word quack would be spelled in Lojban thus:

Example 17.11. 

ky.bu .ubu .abu cy. ky.
q u a c k

It is irrelevant that the letter c in this word has nothing to do with the sound of the Lojban letter c. We are spelling an English word, so English rules control the choice of letters; but we are speaking Lojban, so Lojban rules control the pronunciation of those letters.
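The spelling procedure of Example 17.11 is mechanical enough to sketch in code. The following Python fragment is an illustration only and not part of the language definition: the names EXTRA_LATIN, lerfu_word, and spell are invented here, and the fragment assumes the regular pattern visible in the example (a vowel takes bu, a consonant takes y) together with the three bu assignments listed above. It ignores case and rejects any letter outside the basic Latin alphabet.

```python
# Latin letters that Lojban itself does not use, with the bu forms assigned above.
EXTRA_LATIN = {"h": ".y'y.bu", "q": "ky.bu", "w": "vy.bu"}

VOWELS = set("aeiouy")
CONSONANTS = set("bcdfgjklmnprstvxz")


def lerfu_word(letter: str) -> str:
    """Return the lerfu word for one lower-case Latin letter."""
    if letter in EXTRA_LATIN:
        return EXTRA_LATIN[letter]
    if letter in VOWELS:
        return f".{letter}bu"   # vowel lerfu words: .abu, .ebu, ...
    if letter in CONSONANTS:
        return f"{letter}y."    # consonant lerfu words: by., cy., ...
    raise ValueError(f"no lerfu word known for {letter!r}")


def spell(word: str) -> str:
    """Spell a word letter by letter, ignoring case (a simplification)."""
    return " ".join(lerfu_word(ch) for ch in word.lower())


print(spell("quack"))  # ky.bu .ubu .abu cy. ky.
```

The choice of letters follows English spelling, while each letter is named by its Lojban lerfu word, which is the division of labor the example relies on.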

A few more possibilities for Latin-alphabet letters used in languages other than English:

ty.bu    þ (thorn)
dy.bu    ð (edh)

However, this system is not ideal for all purposes. For one thing, it is verbose. The native lerfu words are often quite long, and with bu added they become even longer: the worst-case Greek lerfu word would be .Omikron. bu, with four syllables and two mandatory pauses. In addition, alphabets that are used by many languages have separate sets of lerfu words for each language, and which set is Lojban to choose?

The alternative plan, therefore, is to use a shift word similar to those introduced in Section 17.3. After the appearance of such a shift word, the regular lerfu words are re-interpreted to represent the lerfu of the alphabet now in use. After a shift to the Greek alphabet, for example, the lerfu word ty. would represent not Latin t but Greek tau. Why tau? Because it is, in some sense, the closest counterpart of t within the Greek lerfu system. In principle ty. could just as well be mapped to phi or even omega, but such an arbitrary relationship would be extremely hard to remember.

Where no obvious closest counterpart exists, some more or less arbitrary choice must be made. Some alien lerfu may simply not have any shifted equivalent, forcing the speaker to fall back on a bu form. Since a bu form may mean different things in different alphabets, it is safest to employ a shift word even when bu forms are in use.

Shifts for several alphabets have been assigned cmavo of selma'o BY:

lo'a    Latin/Roman/Lojban alphabet
ge'o    Greek alphabet
je'o    Hebrew alphabet
jo'o    Arabic alphabet
ru'o    Cyrillic alphabet
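The effect of these shift words can be pictured as a small state machine: a shift word changes the alphabet in force, and every later lerfu word is read in that alphabet until another shift occurs. The Python sketch below is only an illustration under that reading. The names SHIFTS and interpret are invented here, the input is assumed to be already split into separate words, and no attempt is made to map each lerfu word to an actual foreign letter.

```python
# Alphabet shift cmavo listed above, with a label for the alphabet each selects.
SHIFTS = {
    "lo'a": "Latin",
    "ge'o": "Greek",
    "je'o": "Hebrew",
    "jo'o": "Arabic",
    "ru'o": "Cyrillic",
}


def interpret(words, default="Latin"):
    """Pair each lerfu word with the alphabet in force when it is spoken."""
    alphabet = default
    result = []
    for word in words:
        if word in SHIFTS:
            # A shift word changes the state and names no letter itself.
            alphabet = SHIFTS[word]
        else:
            result.append((alphabet, word))
    return result


print(interpret(["ty.", "ge'o", "ty.", "lo'a", "ty."]))
# [('Latin', 'ty.'), ('Greek', 'ty.'), ('Latin', 'ty.')]
```

The same word ty. comes out first as Latin t, then as its Greek counterpart tau, then as Latin t again, which is why a bu form is safest when accompanied by an explicit shift.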

The cmavo zai (of selma'o LAU) is used to create shift words to still other alphabets. The BY word which must follow any LAU cmavo would typically be a name representing the alphabet with bu suffixed:

Example 17.12. 

zai .devanagar. bu

Devanagari (Hindi) alphabet


Example 17.13. 

zai .katakan. bu

Japanese katakana syllabary


Example 17.14. 

zai .xiragan. bu

Japanese hiragana syllabary


Unlike the cmavo above, these shift words have not been standardized and probably will not be until someone actually has a need for them. (Note the . characters marking leading and following pauses.)

In addition, there may be multiple visible representations within a single alphabet for a given letter: roman vs. italics, handwriting vs. print, Bodoni vs. Helvetica. These traditional font and face distinctions are also represented by shift words, indicated with the cmavo ce'a (of selma'o LAU) and a following BY word:

Example 17.15. 

ce'a .xelveticas. bu

Helvetica font


Example 17.16. 

ce'a .xancisk. bu

handwriting


Example 17.17. 

ce'a .pavrel. bu

12-point font size


The cmavo na'a (of selma'o BY) is a universal shift-word cancel: it returns the interpretation of lerfu words to the default of lower-case Lojban with no specific font. It is more general than lo'a, which changes the alphabet only, potentially leaving font and case shifts in place.
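The difference between na'a and lo'a can be made concrete by tracking three independent settings: alphabet, case, and font. The Python sketch below is a hedged illustration, not a description of any existing parser. The class LerfuState and its method names are invented here, and it assumes that lo'a always leaves case and font untouched, which the text states only as a possibility.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LerfuState:
    alphabet: str = "Latin"       # default alphabet (Latin/Roman/Lojban)
    case: str = "lower"           # default case
    font: Optional[str] = None    # None means no specific font

    def lo_a(self) -> None:
        """lo'a: shift the alphabet only; case and font stay as they are."""
        self.alphabet = "Latin"

    def na_a(self) -> None:
        """na'a: cancel every shift and return to the defaults."""
        self.alphabet = "Latin"
        self.case = "lower"
        self.font = None


state = LerfuState(alphabet="Greek", case="upper", font="Helvetica")
state.lo_a()
after_lo_a = (state.alphabet, state.case, state.font)

state = LerfuState(alphabet="Greek", case="upper", font="Helvetica")
state.na_a()
after_na_a = (state.alphabet, state.case, state.font)

print(after_lo_a)  # ('Latin', 'upper', 'Helvetica')
print(after_na_a)  # ('Latin', 'lower', None)
```

After lo'a the text is back in the Latin alphabet but still upper-case Helvetica; after na'a all three settings are back at their defaults.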

Several sections at the end of this chapter contain tables of proposed lerfu word assignments for various languages.