
<?xml version="1.0" encoding="UTF-8" ?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <meta name="generator" content="HTML Tidy, see www.w3.org" />

    <title>Lojban Reference Grammar: Chapter 4</title>

    <meta name="keywords" content="lojban lojban.org www.lojban.org constructed language linguistics reference grammar" />


    <style type="text/css">
    <!--
	body
	{
	    background-color: #ffffff; color: #000000;
	    font-family: verdana,arial,helvetica,sans-serif;
	}

	h1, h2, h3, h4, h5, h6
	{
	    font-family: verdana,arial,helvetica,sans-serif;
	}

	h1
	{
	    font-size: x-large;
	}

	h2
	{
	    font-size: large;
	}

	h3
	{
	    font-size: medium;
	}

	h4
	{
	    font-size: medium;
	}

	h5
	{
	    font-size: small;
	}

	h6
	{
	    font-size: x-small;
	}

	div.leftbar
	{
	    float: left;
	    width: 219px;
	    height: auto;
	    background: #c0c0c0;
	    color: black;
	    border-style: outset;
	    margin-right: 2%;
	    font-size: small;
	}

	div.rightbar
	{
	    float: right;
	    min-width: 90px;
	    width: 15%;
	    height: auto;
	    background: #c0c0c0;
	    color: black;
	    border-style: outset;
	    font-size: small;
	    top: 1%;
	    padding: 1%;
	}

	div.centered
	{
	    text-align: center;
	}

        div.main
        {
            padding-right: 1%;
	    padding-left: 2%;
        }

	li
	{
	    padding-right: 1%;
	    padding-left: 1%;
	    margin-left: 1%;
	}

	img.noborder
	{
	    text-align: center;
	    border-style: none;
	    border: 0;
	}

	hr
	{
	    width: 60%;
	}
    -->
    </style>

</head>

<body>

    <div class="leftbar">
	<h4>Reference Grammar Chapters</h4>

	<ol>
	    <li>
		<a
		href="chapter1.html">
		Lojban As We Mangle It In Lojbanistan: About This Book</a>
	    </li>
	
	    <li>
		<a
		href="chapter2.html">
		A Quick Tour of Lojban Grammar, With Diagrams</a>
	    </li>
	
	    <li>
		<a
		href="chapter3.html">
		The Hills Are Alive With The Sounds Of Lojban</a>
	    </li>
	
	    <li>
		<a
		href="chapter4.html">
		The Shape Of Words To Come: Lojban Morphology</a>
	    </li>
	
	    <li>
		<a
		href="chapter5.html">
		"Pretty Little Girls' School": The Structure Of Lojban
		selbri</a>
	    </li>
	
	    <li>
		<a
		href="chapter6.html">
		To Speak Of Many Things: The Lojban sumti</a>
	    </li>
	
	    <li>
		<a
		href="chapter7.html">
		Brevity Is The Soul Of Language: Pro-sumti And Pro-bridi</a>
	    </li>
	
	    <li>
		<a
		href="chapter8.html">
		Relative Clauses, Which Make sumti Even More Complicated</a>
	    </li>
	
	    <li>
		<a href="chapter9.html"> To Boston Via The Road
		Go I, With An Excursion Into The Land Of Modals</a>
	    </li>
	
	    <li>
		<a href="chapter10.html"> Imaginary Journeys: The
		Lojban Space/Time Tense System</a>
	    </li>
	
	    <li>
		<a href="chapter11.html"> Events, Qualities,
		Quantities, And Other Vague Words: On Lojban Abstraction</a>
	    </li>
	
	    <li>
		<a href="chapter12.html"> Dog House And White
		House: Determining lujvo Place Structures</a>
	    </li>
	
	    <li>
		<a href="chapter13.html"> Oooh! Arrgh! Ugh!
		Yecch! Attitudinal and Emotional Indicators</a>
	    </li>
	
	    <li>
		<a href="chapter14.html"> If Wishes Were Horses:
		The Lojban Connective System</a>
	    </li>
	
	    <li>
		<a href="chapter15.html"> "No" Problems: On
		Lojban Negation</a>
	    </li>
	
	    <li>
		<a href="chapter16.html"> "Who Did You Pass On
		The Road? Nobody": Lojban And Logic</a>
	    </li>
	
	    <li>
		<a href="chapter17.html"> As Easy As A-B-C? The
		Lojban Letteral System And Its Uses</a>
	    </li>
	
	    <li>
		<a href="chapter18.html"> lojbau mekso:
		Mathematical Expressions in Lojban</a>
	    </li>
	
	    <li>
		<a href="chapter19.html"> Putting It All
		Together: Notes on the Structure of Lojban Texts</a>
	    </li>
	
	    <li>
		<a href="chapter20.html"> A Catalogue of
		selma'o</a>
	    </li>
	
	    <li>
		<a href="chapter21.html">Formal Grammars</a>
	    </li>
	</ol>

    </div>

    <div class="main" >
    <div class="centered">
      <img src="chapter4.gif" alt="[Cartoon]"
      width="405" height="405" />

      <h2>Chapter 4<br />
      The Shape Of Words To Come: Lojban Morphology</h2>

      <!--
      <h6>$Revision: 4.1 $<br />
      mkhtml: 1.1</h6>
      -->
    </div>

    <h3><a id="s1" name="s1">1. Introductory</a></h3>

    <p>Morphology is the part of grammar that deals with the form
    of words. Lojban's morphology is fairly simple compared to that
    of many languages, because Lojban words don't change form
    depending on how they are used. English has only a small number
    of such changes compared to languages like Russian, but we do
    have changes like ``boys'' as the plural of ``boy'', or
    ``walked'' as the past-tense form of ``walk''. To make plurals
    or past tenses in Lojban, you add separate words to the
    sentence that express the number of boys, or the time when the
    walking was going on.</p>

    <p>However, Lojban does have what is called ``derivational
    morphology'': the capability of building new words from old
    words. In addition, the form of words tells us something about
    their grammatical uses, and sometimes about the means by which
    they entered the language. Lojban has very orderly rules for
    the formation of words of various types, both the words that
    already exist and new words yet to be created by speakers and
    writers.</p>

    <p>A stream of Lojban sounds can be uniquely broken up into its
    component words according to specific rules. These so-called
    ``morphology rules'' are summarized in this chapter. (However,
    a detailed algorithm for breaking sounds into words has not yet
    been fully debugged, and so is not presented in this book.)
    First, here are some conventions used to talk about groups of
    Lojban letters, including vowels and consonants.</p>


    <dl>
      <dt>1)</dt>

      <dd>V represents any single Lojban vowel except ``y''; that
      is, it represents ``a'', ``e'', ``i'', ``o'', or ``u''.</dd>
    </dl>

    <dl>
      <dt>2)</dt>

      <dd>VV represents either a diphthong, one of the
      following:</dd>

      <dt></dt>

      <dd>ai ei oi au</dd>

      <dt></dt>

      <dd>or a two-syllable vowel pair with an apostrophe
      separating the vowels, one of the following:</dd>

      <dt></dt>

      <dd>a'a a'e a'i a'o a'u e'a e'e e'i e'o e'u i'a i'e i'i i'o
      i'u o'a o'e o'i o'o o'u u'a u'e u'i u'o u'u</dd>
    </dl>

    <dl>
      <dt>3)</dt>

      <dd>C represents a single Lojban consonant, not including the
      apostrophe, one of ``b'', ``c'', ``d'', ``f'', ``g'', ``j'',
      ``k'', ``l'', ``m'', ``n'', ``p'', ``r'', ``s'', ``t'',
      ``v'', ``x'', or ``z''. Syllabic ``l'', ``m'', ``n'', and
      ``r'' always count as consonants for the purposes of this
      chapter.</dd>
    </dl>

    <dl>
      <dt>4)</dt>

      <dd>CC represents two adjacent consonants of type C which
      constitute one of the 48 permissible initial consonant
      pairs:</dd>

      <dt></dt>

      <dd>bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd
      jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts
      vl vr xl xr zb zd zg zm zv</dd>
    </dl>

    <dl>
      <dt>5)</dt>

      <dd>C/C represents two adjacent consonants which constitute
      one of the permissible consonant pairs (not necessarily a
      permissible initial consonant pair). The permissible
      consonant pairs are explained in <a href="chapter3.html">Chapter
      3</a>. In brief, any consonant pair is permissible unless it
      contains two identical letters, contains both a voiced
      consonant (excluding ``r'', ``l'', ``m'', ``n'') and an
      unvoiced consonant, or is one of certain specified forbidden
      pairs.</dd>
    </dl>

    <dl>
      <dt>6)</dt>

      <dd>C/CC represents a consonant triple. The first two
      consonants must constitute a permissible consonant pair; the
      last two consonants must constitute a permissible initial
      consonant pair.</dd>
    </dl>
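    <p>The six conventions above can be sketched as a few lookup
    tables and predicates. This is only an illustrative sketch: the
    function names are hypothetical, and the voicing classes and the
    list of forbidden pairs are an assumption summarized from
    Chapter 3 rather than something stated in this section.</p>
<pre>
```python
# Letter classes from the conventions above.
VOWELS = set("aeiou")                  # V: "y" is deliberately excluded
CONSONANTS = set("bcdfgjklmnprstvxz")  # C: the apostrophe is not included
DIPHTHONGS = {"ai", "ei", "oi", "au"}
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd "
    "jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts "
    "vl vr xl xr zb zd zg zm zv".split()
)
# Assumption: voicing classes and forbidden pairs as summarized from Chapter 3.
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")
SIBILANTS = set("cjsz")
FORBIDDEN = {"cx", "kx", "xc", "xk", "mz"}


def is_vv(s):
    """VV: a listed diphthong, or two vowels separated by an apostrophe."""
    if s in DIPHTHONGS:
        return True
    return len(s) == 3 and s[0] in VOWELS and s[1] == "'" and s[2] in VOWELS


def is_initial_pair(s):
    """CC: one of the 48 permissible initial consonant pairs."""
    return s in INITIAL_PAIRS


def is_permissible_pair(s):
    """C/C: a permissible (not necessarily initial) consonant pair."""
    if len(s) != 2 or s[0] not in CONSONANTS or s[1] not in CONSONANTS:
        return False
    a, b = s
    if a == b:
        return False
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False
    if a in SIBILANTS and b in SIBILANTS:
        return False
    return s not in FORBIDDEN


def is_triple(s):
    """C/CC: first two form a permissible pair, last two an initial pair."""
    return len(s) == 3 and is_permissible_pair(s[:2]) and is_initial_pair(s[1:])
```
</pre>

    <p>Under these assumptions, <tt>is_permissible_pair("nr")</tt>
    holds while <tt>is_permissible_pair("sc")</tt> does not, which
    is why a ``y'' separates those letters in a word such as
    ``bisycla''.</p>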
    <p>Lojban has three basic word classes --- parts of speech --- in
    contrast to the eight that are traditional in English. These
    three classes are called cmavo, brivla, and cmene. Each of
    these classes has uniquely identifying properties --- an
    arrangement of letters that allows the word to be uniquely and
    unambiguously recognized as a separate word in a string of
    Lojban, upon either reading or hearing, and as belonging to a
    specific word-class.</p>

    <p>They are also functionally different: cmavo are the
    structure words, corresponding to English words like ``and'',
    ``if'', ``the'' and ``to''; brivla are the content words,
    corresponding to English words like ``come'', ``red'',
    ``doctor'', and ``freely''; cmene are proper names,
    corresponding to English ``James'', ``Afghanistan'', and ``Pope
    John Paul II''.</p>

    <h3><a id="s2" name="s2">2. cmavo</a></h3>

    <p>The first group of Lojban words discussed in this chapter
    are the cmavo. They are the structure words that hold the
    Lojban language together. They often have no semantic meaning
    in themselves, though they may affect the semantics of brivla
    to which they are attached. The cmavo include the equivalent of
    English articles, conjunctions, prepositions, numbers, and
    punctuation marks. There are over a hundred subcategories of
    cmavo, known as ``selma'o'', each having a specifically defined
    grammatical usage. The various selma'o are discussed throughout
    <a href="chapter5.html">Chapters 5</a> to <a
    href="chapter19.html">19</a> and summarized in <a
    href="chapter20.html">Chapter 20</a>.</p>

    <p>Standard cmavo occur in four forms defined by their word
    structure. Here are some examples of the various forms:</p>
<pre>
    V-form      .a    .e    .i    .o    .u
    CV-form     ba    ce    di    fo    gu
    VV-form     .au   .ei   .ia   .o'u  .u'e
    CVV-form    ki'a  pei   mi'o  coi   cu'u
</pre>

    <p>In addition, there is the cmavo ``.y.'' (remember that ``y''
    is not a V), which must have pauses before and after it.</p>

    <p>A simple cmavo thus has the property of having only one or
    two vowels, or of having a single consonant followed by one or
    two vowels. Words consisting of three or more vowels in a row,
    or a single consonant followed by three or more vowels, are
    also of cmavo form, but are reserved for experimental use: a
    few examples are ``ku'a'e'', ``sau'e'', and ``bai'ai''. All CVV
    cmavo beginning with the letter ``x'' are also reserved for
    experimental use. In general, though, the form of a cmavo tells
    you little or nothing about its grammatical use.</p>

    <p>``Experimental use'' means that the language designers will
    not assign any standard meaning or usage to these words, and
    words and usages coined by Lojban speakers will not appear in
    official dictionaries for the indefinite future.
    Experimental-use words provide an escape hatch for adding
    grammatical mechanisms (as opposed to semantic concepts) the
    need for which was not foreseen.</p>

    <p>The cmavo of VV-form include not only the diphthongs and
    vowel pairs listed in <a href="#s1">Section 1</a>, but also the
    following ten additional diphthongs:</p>
<pre>
    .ia .ie .ii .io .iu
    .ua .ue .ui .uo .uu
</pre>

    <p>In addition, cmavo can have the form ``Cy'', a consonant
    followed by the letter ``y''. These cmavo represent letters of
    the Lojban alphabet, and are discussed in detail in <a
    href="chapter17.html">Chapter 17</a>.</p>
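    <p>The forms of simple cmavo described in this section can be
    sketched as a small classifier. The function name
    <tt>cmavo_form</tt> and its return labels are hypothetical. The
    sketch assumes that the ten additional diphthongs are accepted
    only in VV-form words, and it does not validate the internal
    structure of words with three or more vowels.</p>
<pre>
```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")
DIPHTHONGS = {"ai", "ei", "oi", "au"}
# The ten additional diphthongs; assumed here to occur only in VV-form cmavo.
EXTRA_DIPHTHONGS = {"ia", "ie", "ii", "io", "iu", "ua", "ue", "ui", "uo", "uu"}


def cmavo_form(word):
    """Return a label for the cmavo form of `word`, or None if it has none."""
    w = word.strip(".")              # pauses are written as "."
    if w == "y":
        return "y"                   # the hesitation word ".y."
    first = w[:1]
    lead = "C" if first in CONSONANTS else ""
    body = w[1:] if lead else w
    if lead and body == "y":
        return "Cy"                  # letter names such as "cy"
    if not body or any(ch not in VOWELS and ch != "'" for ch in body):
        return None
    count = sum(ch in VOWELS for ch in body)
    if count == 0:
        return None
    if count == 1:
        return lead + "V" if len(body) == 1 else None
    if count == 2:
        ok = body in DIPHTHONGS or (len(body) == 3 and body[1] == "'")
        if not lead and body in EXTRA_DIPHTHONGS:
            ok = True
        if not ok:
            return None
        if lead and first == "x":
            return "experimental"    # CVV cmavo beginning with "x"
        return lead + "VV"
    # Three or more vowels: reserved for experimental use (not validated further).
    return "experimental"
```
</pre>

    <p>As the text notes, the label says little about grammatical
    use; it only identifies the shape of the word.</p>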

    <p>Compound cmavo are sequences of cmavo attached together to
    form a single written word. A compound cmavo is always
    identical in meaning and in grammatical use to the separated
    sequence of simple cmavo from which it is composed. These words
    are written in compound form merely to save visual space, and
    to ease the reader's burden in identifying when the component
    cmavo are acting together.</p>

    <p>Compound cmavo, while not visually short like their
    components, can be readily identified by two
    characteristics:</p>

    <dl>
      <dt>1)</dt>

      <dd>They have no consonant pairs or clusters, and</dd>

      <dt>2)</dt>

      <dd>They end in a vowel.</dd>
    </dl>

    <p>For example:</p>
<pre>
<a id="e2d1" name="e2d1">2.1)</a>  .iseci'i
    .i se ci'i

<a id="e2d2" name="e2d2">2.2)</a>   punaijecanai
    pu nai je ca nai

<a id="e2d3" name="e2d3">2.3)</a>   ki'e.u'e
    ki'e .u'e
</pre>
    The cmavo ``.u'e'' begins with a vowel, and like all words
    beginning with a vowel, requires a pause (represented by ``.'')
    before it. This pause cannot be omitted simply because the
    cmavo is incorporated into a compound cmavo. On the other hand,
    
<pre>
<a id="e2d4" name="e2d4">2.4)</a>  ki'e'u'e
</pre>
    is a single cmavo reserved for experimental purposes: it has
    four vowels. 
<pre>
<a id="e2d5" name="e2d5">2.5)</a>  cy.ibu.abu
    cy. .ibu .abu
</pre>

    <p>Again the pauses are required (see <a href="#s9">Section
    9</a>); the pause after ``cy.'' merges with the pause before
    ``.ibu''.</p>
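    <p>The two identifying characteristics of compound cmavo, and the
    way a compound falls apart into its components, can be sketched
    as follows. Both function names are hypothetical, and treating a
    final ``y'' as a vowel for this check is an assumption. The
    splitter is deliberately naive: it cuts before every consonant or
    pause, so it reproduces Examples 2.1 through 2.3 but not the
    grouping shown for the letter words of Example 2.5.</p>
<pre>
```python
import re

CONSONANTS = set("bcdfgjklmnprstvxz")
FINAL_VOWELS = set("aeiouy")  # assumption: a final "y" (as in "cy") also counts


def looks_like_compound_cmavo(word):
    """Check the two characteristics: no adjacent consonants, vowel at the end."""
    w = word.strip(".")
    if not w or w[-1] not in FINAL_VOWELS:
        return False
    return not any(a in CONSONANTS and b in CONSONANTS for a, b in zip(w, w[1:]))


def split_compound(word):
    """Naively cut before each consonant or pause; one piece per simple cmavo."""
    return re.findall(r"\.?[bcdfgjklmnprstvxz]?[aeiouy']+", word)


print(split_compound("punaijecanai"))
```
</pre>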

    <p>There is no particular stress required in cmavo or their
    compounds. Some conventions do exist that are not mandatory.
    For two-syllable cmavo, for example, stress is typically placed
    on the first vowel; an example is</p>
<pre>
<a id="e2d6" name="e2d6">2.6)</a>  .e'o ko ko kurji
    .E'o ko ko KURji
</pre>

    <p>This convention results in a consistent rhythm to the
    language, since brivla are required to have penultimate stress;
    some find this esthetically pleasing.</p>

    <p>If the final syllable of one word is stressed, and the first
    syllable of the next word is stressed, you must insert a pause
    or glottal stop between the two stressed syllables. Thus</p>
<pre>
<a id="e2d7" name="e2d7">2.7)</a>  le re nanmu
</pre>
    can be optionally pronounced 
<pre>
<a id="e2d8" name="e2d8">2.8)</a>  le RE. NANmu
</pre>
    since there are no rules forcing stress on either of the first
    two words; the stress on ``re'', though, demands that a pause
    separate ``re'' from the following syllable ``nan'' to ensure
    that the stress on ``nan'' is properly heard as a stressed
    syllable. The alternative pronunciation 
<pre>
<a id="e2d9" name="e2d9">2.9)</a>  LE re NANmu
</pre>
    is also valid; this would apply secondary stress (used for
    purposes of emphasis, contrast or sentence rhythm) to ``le'',
    comparable in rhythmical effect to the English phrase ``THE two
    men''. In <a href="#e2d8">Example 2.8</a>, the secondary stress
    on ``re'' would be similar to that in the English phrase ``the
    TWO men''. 

    <p>Both cmavo may also be left unstressed, thus:</p>
<pre>
<a id="e2d10" name="e2d10">2.10)</a>    le re NANmu
</pre>

    <p>This would probably be the most common usage.</p>

    <h3><a id="s3" name="s3">3. brivla</a></h3>

    <p>Predicate words, called ``brivla'', are at the core of
    Lojban. They carry most of the semantic information in the
    language. They serve as the equivalent of English nouns, verbs,
    adjectives, and adverbs, all in a single part of speech.</p>

    <p>Every brivla belongs to one of three major subtypes. These
    subtypes are defined by the form, or morphology, of the word
    --- all words of a particular structure can be assigned by
    sight or sound to a particular type (cmavo, brivla, or cmene)
    and subtype. Knowing the type and subtype then gives you, the
    reader or listener, significant clues to the meaning and the
    origin of the word, even if you have never heard the word
    before.</p>

    <p>The same principle allows you, when speaking or writing, to
    invent new brivla for new concepts ``on the fly''; yet it
    offers people that you are trying to communicate with a good
    chance to figure out your meaning. In this way, Lojban has a
    flexible vocabulary which can be expanded indefinitely.</p>

    <p>All brivla have the following properties:</p>

    <dl>
      <dt>1)</dt>

      <dd>always end in a vowel;</dd>

      <dt>2)</dt>

      <dd>always contain a consonant pair in the first five
      letters, where ``y'' and apostrophe are not counted as
      letters for this purpose. (See <a href="#s6">Section
      6</a>.)</dd>

      <dt>3)</dt>

      <dd>always are stressed on the next-to-the-last (penultimate)
      syllable; this implies that they have two or more
      syllables.</dd>
    </dl>
    <p>The presence of a consonant pair distinguishes brivla from
    cmavo and their compounds. The final vowel distinguishes brivla
    from cmene, which always end in a consonant. Thus ``da'amei''
    must be a compound cmavo because it lacks a consonant pair;
    ``lojban.'' must be a name because it lacks a final vowel.</p>

    <p>Thus, ``bisycla'' has the consonant pair ``sc'' in the first
    five non-``y'' letters even though the ``sc'' actually appears
    in the form of ``syc''. Similarly, the word ``ro'inre'o''
    contains ``nr'' in the first five letters because the
    apostrophes are not counted for this purpose.</p>
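    <p>The shape tests just described can be sketched in a few lines.
    The function names are hypothetical, and the sketch is rough: it
    looks only at the final letter and at the first five counted
    letters, and it ignores stress and every other rule of the
    morphology.</p>
<pre>
```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def has_early_consonant_pair(word):
    """Look for two adjacent consonants among the first five counted letters.

    The letter "y", the apostrophe, and the pause mark "." are not counted.
    """
    letters = [ch for ch in word if ch not in "y'."]
    head = letters[:5]
    return any(a in CONSONANTS and b in CONSONANTS for a, b in zip(head, head[1:]))


def classify_shape(word):
    """Guess the word class from the shape of the word alone."""
    w = word.strip(".")
    if not w:
        return None
    if w[-1] in CONSONANTS:
        return "cmene"                      # names end in a consonant
    if has_early_consonant_pair(w):
        return "brivla"                     # vowel-final with an early pair
    return "cmavo or compound cmavo"        # vowel-final without a pair


for sample in ["bisycla", "ro'inre'o", "da'amei", "lojban."]:
    print(sample, classify_shape(sample))
```
</pre>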

    <p>The three subtypes of brivla are:</p>


    <dl>
      <dt>1)</dt>

      <dd>gismu, the Lojban primitive roots from which all other
      brivla are built;</dd>
    </dl>

    <dl>
      <dt>2)</dt>

      <dd>lujvo, the compounds of two or more gismu; and</dd>
    </dl>
    <dl>
      <dt>3)</dt>

      <dd>fu'ivla (literally ``copy-word''), the specialized words
      that are not Lojban primitives or natural compounds, and are
      therefore borrowed from other languages.</dd>
    </dl>

    <h3><a id="s4" name="s4">4. gismu</a></h3>

    <p>The gismu, or Lojban root words, are those brivla
    representing concepts most basic to the language. The gismu
    were chosen for various reasons: some represent concepts that
    are very familiar and basic; some represent concepts that are
    frequently used in other languages; some were added because
    they would be helpful in constructing more complex words; some
    because they represent fundamental Lojban concepts (like
    ``cmavo'' and ``gismu'' themselves).</p>

    <p>The gismu do not represent any sort of systematic
    partitioning of semantic space. Some gismu may be superfluous,
    or appear for historical reasons: the gismu list was being
    collected for almost 35 years and was only weeded out once.
    Instead, the intention is that the gismu blanket semantic
    space: they make it possible to talk about the entire range of
    human concerns.</p>

    <p>There are about 1350 gismu. In learning Lojban, you need
    only to learn most of these gismu and their combining forms
    (known as ``rafsi'') as well as perhaps 200 major cmavo, and
    you will be able to communicate effectively in the language.
    This may sound like a lot, but it is a small number compared to
    the vocabulary needed for similar communications in other
    languages.</p>

    <p>All gismu have very strong form restrictions. Using the
    conventions defined in <a href="#s1">Section 1</a>, all gismu
    are of the forms CVC/CV or CCVCV. They must meet the rules for
    all brivla given in <a href="#s3">Section 3</a>; furthermore,
    they:</p>

    <dl>
      <dt>1)</dt>

      <dd>always have five letters;</dd>

      <dt>2)</dt>

      <dd>always start with a consonant and end with a single
      vowel;</dd>

      <dt>3)</dt>

      <dd>always contain exactly one consonant pair, which is a
      permissible initial pair (CC) if it's at the beginning of the
      gismu, but otherwise only has to be a permissible pair
      (C/C);</dd>

      <dt>4)</dt>

      <dd>are always stressed on the first syllable (since that is
      penultimate).</dd>
    </dl>
    <p>The five-letter length distinguishes gismu from lujvo and
    fu'ivla. (It is possible to have fu'ivla like ``spa'i'' that
    are five letters long, but they must have ``'''; no gismu
    contains ``'''.)</p>
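    <p>The form restrictions on gismu can be sketched as a check on
    the letter pattern. The function names are hypothetical, and the
    rules for permissible consonant pairs are an assumption
    summarized from Chapter 3. The check covers form only: a word
    that passes is gismu-shaped, not necessarily a member of the
    actual gismu list.</p>
<pre>
```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd "
    "jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts "
    "vl vr xl xr zb zd zg zm zv".split()
)
# Assumption: pair restrictions as summarized from Chapter 3.
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")
SIBILANTS = set("cjsz")
FORBIDDEN = {"cx", "kx", "xc", "xk", "mz"}


def is_permissible_pair(pair):
    """C/C: a permissible (not necessarily initial) consonant pair."""
    a, b = pair
    if a == b:
        return False
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False
    if a in SIBILANTS and b in SIBILANTS:
        return False
    return pair not in FORBIDDEN


def is_gismu_form(word):
    """True if `word` has the shape CVC/CV or CCVCV."""
    if len(word) != 5:
        return False
    shape = "".join(
        "C" if ch in CONSONANTS else "V" if ch in VOWELS else "?" for ch in word
    )
    if shape == "CVCCV":
        return is_permissible_pair(word[2:4])   # medial pair: C/C suffices
    if shape == "CCVCV":
        return word[:2] in INITIAL_PAIRS        # initial pair: must be CC
    return False


for sample in ["creka", "lijda", "spa'i", "lbanu"]:
    print(sample, is_gismu_form(sample))
```
</pre>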

    <p>With the exception of five special brivla variables,
    ``broda'', ``brode'', ``brodi'', ``brodo'', and ``brodu'', no
    two gismu differ only in the final vowel. Furthermore, the set
    of gismu was specifically designed to reduce the likelihood
    that two similar sounding gismu could be confused. For example,
    because ``gismu'' is in the set of gismu, ``kismu'', ``xismu'',
    ``gicmu'', ``gizmu'', and ``gisnu'' cannot be.</p>

    <p>Almost all Lojban gismu are constructed from pieces of words
    drawn from other languages, specifically Chinese, English,
    Hindi, Spanish, Russian, and Arabic, the six most widely spoken
    natural languages. For a given concept, words in the six
    languages that represent that concept were written in Lojban
    phonetics. Then a gismu was selected to maximize the
    recognizability of the Lojban word for speakers of the six
    languages by weighting the inclusion of the sounds drawn from
    each language by the number of speakers of that language. See
    <a href="#s14">Section 14</a> for a full explanation of the
    algorithm.</p>

    <p>Here are a few examples of gismu, with rough English
    equivalents (not definitions):</p>
<pre>
<a id="e3d1" name="e3d1">4.1)</a>  creka
    shirt

<a id="e3d2" name="e3d2">4.2)</a>   lijda
    religion

<a id="e3d3" name="e3d3">4.3)</a>   blanu
    blue

<a id="e3d4" name="e3d4">4.4)</a>   mamta
    mother

<a id="e3d5" name="e3d5">4.5)</a>   cukta
    book

<a id="e3d6" name="e3d6">4.6)</a>   patfu
    father

<a id="e3d7" name="e3d7">4.7)</a>   nanmu
    man

<a id="e3d8" name="e3d8">4.8)</a>   ninmu
    woman
</pre>

    <p>A small number of gismu were formed differently; see <a
    href="#s15">Section 15</a> for a list.</p>

    <h3><a id="s5" name="s5">5. lujvo</a></h3>

    <p>When specifying a concept that is not found among the gismu
    (or, more specifically, when the relevant gismu seems too
    general in meaning), a Lojbanist generally attempts to express
    the concept as a tanru. Lojban tanru are an elaboration of the
    concept of ``metaphor'' used in English. In Lojban, any brivla
    can be used to modify another brivla. The first of the pair
    modifies the second. This modification is usually restrictive
    --- the modifying brivla reduces the broader sense of the
    modified brivla to form a more narrow, concrete, or specific
    concept. Modifying brivla may thus be seen as acting like
    English adverbs or adjectives. For example,</p>
<pre>
<a id="e5d1" name="e5d1">5.1)</a>  skami pilno
</pre>
    is the tanru which expresses the concept of ``computer user''. 

    <p>The simplest Lojban tanru are pairings of two concepts or
    ideas. Such tanru take two simpler ideas that can be
    represented by gismu and combine them into a single more
    complex idea. Two-part tanru may then be recombined in pairs
    with other tanru, or with individual gismu, to form more
    complex or more specific ideas, and so on.</p>

    <p>The meaning of a tanru is usually at least partly ambiguous:
    ``skami pilno'' could refer to a computer that is a user, or to
    a user of computers. There are a variety of ways that the
    modifier component can be related to the modified component. It
    is also possible to use cmavo within tanru to provide
    variations (or to prevent ambiguities) of meaning.</p>

    <p>Making tanru is essentially a poetic or creative act, not a
    science. While the syntax expressing the grouping relationships
    within tanru is unambiguous, tanru are still semantically
    ambiguous, since the rules defining the relationships between
    the gismu are flexible. The process of devising a new tanru is
    dealt with in detail in <a href="chapter5.html">Chapter 5</a>.</p>

    <p>To express a simple tanru, simply say the component gismu
    together. Thus the binary metaphor ``big boat'' becomes the
    tanru</p>
<pre>
<a id="e5d2" name="e5d2">5.2)</a>  barda bloti
</pre>
    representing roughly the same concept as the English word
    ``ship''. 

    <p>The binary metaphor ``father mother'' can refer to a
    paternal grandmother (``a father-ly type of mother''), while
    ``mother father'' can refer to a maternal grandfather (``a
    mother-ly type of father''). In Lojban, these become the
    tanru</p>
<pre>
<a id="e5d3" name="e5d3">5.3)</a>  patfu mamta
</pre>
    and 
<pre>
<a id="e5d4" name="e5d4">5.4)</a>  mamta patfu
</pre>
    respectively. 

    <p>The possibility of semantic ambiguity can easily be seen in
    the last case. To interpret <a href="#e5d4">Example 5.4</a>,
    the listener must determine what type of motherliness pertains
    to the father being referred to. In an appropriate context,
    ``mamta patfu'' could mean not ``grandfather'' but simply
    ``father with some motherly attributes'', depending on the
    culture. If absolute clarity is required, there are ways to
    expand upon and explain the exact interrelationship between the
    components; but such detail is usually not needed.</p>

    <p>When a concept expressed in a tanru proves useful, or is
    frequently expressed, it is desirable to choose one of the
    possible meanings of the tanru and assign it to a new brivla.
    For <a href="#e5d1">Example 5.1</a>, we would probably choose
    ``user of computers'', and form the new word</p>
<pre>
<a id="e5d5" name="e5d5">5.5)</a>  sampli
</pre>

    <p>Such a brivla, built from the rafsi which represent its
    component words, is called a ``lujvo''. Another example,
    corresponding to the tanru of <a href="#e5d2">Example 5.2</a>,
    would be:</p>
<pre>
<a id="e5d6" name="e5d6">5.6)</a>  bralo'i
    big-boat
    ship
</pre>
    <p>The lujvo representing a given tanru is built from units
    representing the component gismu. These units are called
    ``rafsi'' in Lojban. Each rafsi represents only one gismu. The
    rafsi are attached together in the order of the words in the
    tanru, occasionally inserting so-called ``hyphen'' letters to
    ensure that the pieces stick together as a single word and
    cannot accidentally be broken apart into cmavo, gismu, or other
    word forms. As a result, each lujvo can be readily and
    accurately recognized, allowing a listener to pick out the word
    from a string of spoken Lojban, and if necessary, unambiguously
    decompose the word to a unique source tanru, thus providing a
    strong clue to its meaning.</p>

    <p>The lujvo that can be built from the tanru ``mamta patfu''
    in <a href="#e5d4">Example 5.4</a> is</p>
<pre>
<a id="e5d7" name="e5d7">5.7)</a>  mampa'u
</pre>
    which refers specifically to the concept ``maternal
    grandfather''. The two gismu that constitute the tanru are
    represented in ``mampa'u'' by the rafsi ``mam-'' and ``-pa'u'',
    respectively; these two rafsi are then concatenated together to
    form ``mampa'u''. 

    <p>Like gismu, lujvo have only one meaning. When a lujvo is
    formally entered into a dictionary of the language, a specific
    definition will be assigned based on one particular
    interrelationship between the terms. (See <a
    href="chapter12.html">Chapter 12</a> for how this has been done.)
    Unlike gismu, lujvo may have more than one form. This is
    because there is no difference in meaning between the various
    rafsi for a gismu when they are used to build a lujvo. A long
    rafsi may be used, especially in noisy environments, in place
    of a short rafsi; the result is considered the same lujvo, even
    though the word is spelled and pronounced differently. Thus the
    word ``brivla'', built from the tanru ``bridi valsi'', is the
    same lujvo as ``brivalsi'', ``bridyvla'', and ``bridyvalsi'',
    each of which uses a different combination of rafsi.</p>

    <p>When assembling rafsi together into lujvo, the rules for
    valid brivla must be followed: a consonant cluster must occur
    in the first five letters (excluding ``y'' and ``'''), and the
    lujvo must end in a vowel.</p>

    <p>A ``y'' (which is ignored in determining stress or consonant
    clusters) is inserted in the middle of the consonant cluster to
    glue the word together when either the resulting cluster is not
    permissible or the word is likely to break up. There are
    specific rules describing these conditions, detailed in <a
    href="#s6">Section 6</a>.</p>

    <p>An ``r'' (in some cases, an ``n'') is inserted when a
    CVV-form rafsi attaches to the beginning of a lujvo in such a
    way that there is no consonant cluster. For example, in the
    lujvo</p>

<pre>
<a id="e5d8" name="e5d8">5.8)</a>  soirsai
    sonci sanmi
    soldier meal
    field rations
</pre>
    the rafsi ``soi-'' and ``-sai'' are joined, with the additional
    ``r'' making up the ``rs'' consonant pair needed to make the
    word a brivla. Without the ``r'', the word would break up into
    ``soi sai'', two cmavo. The pair of cmavo have no relation to
    their rafsi lookalikes; they will either be ungrammatical (as
    in this case), or will express a different meaning from what
    was intended. 
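    <p>The role of the ``r'' hyphen can be sketched as follows. This
    is a simplified illustration, not the full lujvo-making
    procedure: the function name is hypothetical, the test is merely
    whether the first five counted letters contain a consonant pair,
    and the choice of ``n'' when the next rafsi begins with ``r'' is
    an assumption about the cases the text alludes to.</p>
<pre>
```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def join_with_hyphen(first, rest):
    """Join a CVV-form first rafsi to the remainder, adding r or n if needed."""
    word = first + rest
    letters = [ch for ch in word if ch not in "y'"]   # "y" and "'" are not counted
    head = letters[:5]
    has_pair = any(
        a in CONSONANTS and b in CONSONANTS for a, b in zip(head, head[1:])
    )
    if has_pair:
        return word                  # already recognizable as a brivla
    # Assumption: "n" replaces "r" when the next letter is itself "r".
    hyphen = "n" if rest.startswith("r") else "r"
    return first + hyphen + rest


print(join_with_hyphen("soi", "sai"))   # the rafsi of Example 5.8
```
</pre>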

    <p>Learning the rafsi and the rules for assembling them into
    lujvo is therefore necessary to make full use of the potential
    Lojban vocabulary.</p>

    <p>Most important, it is possible to invent new lujvo while you
    speak or write in order to represent a new or unfamiliar
    concept, one for which you do not know any existing Lojban
    word. As long as you follow the rules for building these
    compounds, there is a good chance that you will be understood
    without explanation.</p>

    <h3><a id="s6" name="s6">6. rafsi</a></h3>

    <p>Every gismu has from two to five rafsi, each of a different
    form, but each such rafsi represents only one gismu. It is
    valid to use any of the rafsi forms in building lujvo ---
    whichever the reader or listener will most easily understand,
    or whichever is most pleasing --- subject to the rules of lujvo
    making. There is a scoring algorithm which is intended to
    determine which of the possible and legal lujvo forms will be
    the standard dictionary form (see <a href="#s12">Section
    12</a>).</p>

    <p>Each gismu always has at least two rafsi forms; one is the
    gismu itself (used only at the end of a lujvo), and one is the
    gismu without its final vowel (used only at the beginning or
    middle of a lujvo). These forms are represented as -CVC/CV or
    -CCVCV (called ``the 5-letter rafsi''), and -CVC/C- or -CCVC-
    (called ``the 4-letter rafsi'') respectively. The dashes in
    these rafsi form representations show where other rafsi may be
    attached to form a valid lujvo. When lujvo are formed only from
    4-letter and 5-letter rafsi, known collectively as ``long
    rafsi'', they are called ``unreduced lujvo''.</p>

    <p>Some examples of unreduced lujvo forms are:</p>
<pre>
<a id="e6d1" name="e6d1">6.1)</a>  mamtypatfu
    from ``mamta patfu''
    ``mother father'' or ``maternal grandfather''

<a id="e6d2" name="e6d2">6.2)</a>   lerfyliste
    from ``lerfu liste''
    ``letter list'' or a ``list of letters''
    (letters of the alphabet)

<a id="e6d3" name="e6d3">6.3)</a>   nancyprali
    from ``nanca prali''
    ``year profit'' or ``annual profit''

<a id="e6d4" name="e6d4">6.4)</a>   prunyplipe
    from ``pruni plipe''
    ``elastic (springy) leap'' or ``spring'' (the verb)
</pre>
<pre>
<a id="e6d5" name="e6d5">6.5)</a>  vancysanmi
    from ``vanci sanmi''
    ``evening meal'' or ``supper''
</pre>
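    <p>The unreduced forms above follow a regular pattern that can be
    sketched in code: every gismu but the last contributes its
    4-letter rafsi, and the last contributes its 5-letter rafsi. The
    function name is hypothetical, and the sketch assumes that a
    ``y'' follows every 4-letter rafsi, as it does in each of the
    examples shown.</p>
<pre>
```python
def unreduced_lujvo(gismu_list):
    """Build an unreduced lujvo from the gismu of a tanru, in order."""
    *modifiers, last = gismu_list
    # 4-letter rafsi: the gismu without its final vowel, followed by "y".
    parts = [g[:-1] + "y" for g in modifiers]
    # 5-letter rafsi: the final gismu itself.
    return "".join(parts) + last


print(unreduced_lujvo(["mamta", "patfu"]))
print(unreduced_lujvo(["vanci", "sanmi"]))
```
</pre>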
    In addition to these two forms, each gismu may have up to three
    additional short rafsi, three letters long. All short rafsi
    have one of the forms -CVC-, -CCV-, or -CVV-. The total number
    of rafsi forms that are assigned to a gismu depends on how
    useful the gismu is, or is presumed to be, in making lujvo,
    when compared to other gismu that could be assigned the rafsi. 

    <p>For example, ``zmadu'' (``more than'') has the two short
    rafsi ``-zma-'' and ``-mau-'' (in addition to its unreduced
    rafsi ``-zmad-'' and ``-zmadu''), because a vast number of
    lujvo have been created based on ``zmadu'', corresponding in
    general to English comparative adjectives ending in ``-er''
    such as ``whiter'' (Lojban ``labmau''). On the other hand,
    ``bakri'' (``chalk'') has no short rafsi and few lujvo.</p>

    <p>There is at most one CVC-form, one CCV-form, and one
    CVV-form rafsi per gismu. In fact, only a tiny handful of gismu
    have both a CCV-form and a CVV-form rafsi assigned, and still
    fewer have all three forms of short rafsi. However, gismu with
    both a CVC-form and another short rafsi are fairly common,
    partly because more possible CVC-form rafsi exist. CVC-form
    rafsi, though fairly easy to remember, cannot be used at the
    end of a lujvo (because lujvo must end in vowels), which
    justifies assigning an additional short rafsi to many
    gismu.</p>

    <p>The intention was to use the available ``rafsi space'' ---
    the set of all possible short rafsi forms --- in the most
    efficient way possible; the goal is to make the most-used lujvo
    as short as possible (thus maximizing the use of short rafsi),
    while keeping the rafsi very recognizable to anyone who knows
    the source gismu. For this reason, the letters in a rafsi have
    always been chosen from among the five letters of the
    corresponding gismu. As a result, there is a limited set of
    short rafsi available for assignment to each gismu. At most
    seven possible short rafsi are available for consideration (of
    which at most three can be used, as explained above).</p>

    <p>Here are the only short rafsi forms that can possibly exist
    for gismu of the form CVC/CV, like ``sakli''. The digits in the
    second column represent the gismu letters used to form the
    rafsi.</p>
<pre>
    CVC     123     -sak-
    CVC     124     -sal-
    CVV     12'5    -sa'i-
    CVV     125     -sai-
    CCV     345     -kli-
    CCV     132     -ska-
</pre>
    (The only actual short rafsi for ``sakli'' is ``-sal-''.) 

    <p>For gismu of the form CCVCV, like ``blaci'', the only short
    rafsi forms that can exist are:</p>
<pre>
    CVC     134     -bac-
    CVC     234     -lac-
    CVV     13'5    -ba'i-
    CVV     135     -bai-
    CVV     23'5    -la'i-
    CVV     235     -lai-
    CCV     123     -bla-
</pre>
    (In fact, ``blaci'' has none of these short rafsi; they are all
    assigned to other gismu. Lojban speakers are not free to
    reassign any of the rafsi; the tables shown here are to help
    understand how the rafsi were chosen in the first place.) 

    <p>There are a few restrictions: a CVV-form rafsi without an
    apostrophe cannot exist unless the vowels make up one of the
    four diphthongs ``ai'', ``ei'', ``oi'', or ``au''; and a
    CCV-form rafsi is possible only if the two consonants form a
    permissible initial consonant pair (see <a href="#s1">Section
    1</a>). Thus ``mamta'', which has the same form as ``salci'',
    can only have ``mam'', ``mat'', and ``ma'a'' as possible rafsi:
    in fact, only ``mam'' is assigned to it.</p>
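    <p>The two tables and the restrictions above can be sketched in
    a few lines of Python. This is only an illustration: the names
    ``candidate_short_rafsi'', ``DIPHTHONGS'', and ``INITIAL_PAIRS''
    are invented here, and the list of permissible initial
    consonant pairs is assumed from the standard list rather than
    quoted from this section. The function lists the forms a gismu
    could have, not the rafsi actually assigned to it.</p>

```python
# Illustrative sketch; all names here are assumptions, not part of the grammar.
VOWELS = set("aeiou")
DIPHTHONGS = {"ai", "ei", "oi", "au"}  # the four diphthongs named in the text
# Permissible initial consonant pairs (assumed standard list of 48 pairs).
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv".split()
)


def candidate_short_rafsi(gismu):
    """Return every short rafsi form the gismu could in principle have."""
    if len(gismu) != 5:
        raise ValueError("a gismu has exactly five letters")
    shape = "".join("V" if ch in VOWELS else "C" for ch in gismu)
    g = " " + gismu  # pad so that g[1]..g[5] match the digits in the tables
    if shape == "CVCCV":
        cvc = [g[1] + g[2] + g[3], g[1] + g[2] + g[4]]
        cvv_heads = [g[1] + g[2]]
        ccv = [g[3] + g[4] + g[5], g[1] + g[3] + g[2]]
    elif shape == "CCVCV":
        cvc = [g[1] + g[3] + g[4], g[2] + g[3] + g[4]]
        cvv_heads = [g[1] + g[3], g[2] + g[3]]
        ccv = [g[1] + g[2] + g[3]]
    else:
        raise ValueError("not a CVC/CV or CCVCV gismu: " + gismu)

    result = list(cvc)
    for head in cvv_heads:
        # The apostrophe form carries no restriction in the text.
        result.append(head + "'" + g[5])
        # The plain CVV form needs one of the four diphthongs.
        if head[1] + g[5] in DIPHTHONGS:
            result.append(head + g[5])
    # A CCV form needs a permissible initial consonant pair.
    result.extend(r for r in ccv if r[:2] in INITIAL_PAIRS)
    return result


print(candidate_short_rafsi("mamta"))
```

    <p>For ``mamta'' the sketch keeps only ``mam'', ``mat'', and
    ``ma'a'', in line with the paragraph above; for ``sakli'' and
    ``blaci'' it reproduces the rows of the two tables.</p>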

    <p>Some cmavo also have associated rafsi, usually CVC-form. For
    example, the ten common numerical digits, which are all CV form
    cmavo, each have a CVC-form rafsi formed by adding a consonant
    to the cmavo. Most cmavo that have rafsi are ones used in
    composing tanru (for a complete list, see <a
    href="chapter12.html">Chapter 12</a>).</p>

    <p>The term for a lujvo made up solely of short rafsi is
    ``fully reduced lujvo''. Here are some examples of fully
    reduced lujvo:</p>
<pre>
<a id="e6d6" name="e6d6">6.6)</a>  cumfri
    from ``cumki lifri''
    ``possible experience''

<a id="e6d7" name="e6d7">6.7)</a>   klezba
    from ``klesi zbasu''
    ``category make''

<a id="e6d8" name="e6d8">6.8)</a>   kixta'a
    from ``krixa tavla''
    ``cry-out talk''

<a id="e6d9" name="e6d9">6.9)</a>   sniju'o
    from ``sinxa djuno''
    ``sign know''
</pre>

    <p>In addition, some of the unreduced forms in the previous
    example may be fully reduced to:</p>
<pre>
<a id="e6d10" name="e6d10">6.10)</a>    mampa'u
    from ``mamta patfu''
    ``mother father'' or ``maternal grandfather''

<a id="e6d11" name="e6d11">6.11)</a> lerste
    from ``lerfu liste''
    ``letter list'' or a ``list of letters''
</pre>
    As noted above, CVC-form rafsi cannot appear as the final rafsi
    in a lujvo, because all lujvo must end with one or two vowels.
    As a brivla, a lujvo must also contain a consonant cluster
    within the first five letters; this ensures that it cannot be
    mistaken for a compound cmavo. Of course, all lujvo have at
    least six letters since they have two or more rafsi, each at
    least three letters long; hence they cannot be confused with
    gismu. 

    <p>When attaching two rafsi together, it may be necessary to
    insert a hyphen letter. In Lojban, the term ``hyphen'' always
    refers to a letter, either the vowel ``y'' or one of the
    consonants ``r'' and ``n''. (The letter ``l'' can also be a
    hyphen, but is not used as one in lujvo.)</p>

    <p>The ``y''-hyphen is used after a CVC-form rafsi when joining
    it with the following rafsi could result in an impermissible
    consonant pair, or when the resulting lujvo could fall apart
    into two or more words (either cmavo or gismu).</p>

    <p>Thus, the tanru ``pante tavla'' (``protest talk'') cannot
    produce the lujvo ``patta'a'', because ``tt'' is not a
    permissible consonant pair; the lujvo must be ``patyta'a''.
    Similarly, the tanru ``mudri siclu'' (``wooden whistle'')
    cannot form the lujvo ``mudsiclu''; instead, ``mudysiclu'' must
    be used. (Remember that ``y'' is not counted in determining
    whether the first five letters of a brivla contain a consonant
    cluster: this is why.)</p>
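    <p>The consonant-cluster requirement can be checked
    mechanically. The following sketch is an illustration only: the
    name ``has_early_cluster'' is invented here, and it assumes
    that the apostrophe, like ``y'', is not counted as a letter for
    this purpose. It tests only for the presence of two adjacent
    consonants, not for whether the pair is permissible.</p>

```python
# Illustrative sketch; the function name is an assumption.
CONSONANTS = set("bcdfgjklmnprstvxz")


def has_early_cluster(word):
    """True if two adjacent consonants occur within the first five counted letters."""
    # Assumption: both "y" and the apostrophe are skipped when counting letters.
    counted = [ch for ch in word if ch not in "y'"][:5]
    return any(a in CONSONANTS and b in CONSONANTS
               for a, b in zip(counted, counted[1:]))


for candidate in ["patyta'a", "mudysiclu", "ro'ire'o", "ro'inre'o"]:
    print(candidate, has_early_cluster(candidate))
```

    <p>Because ``y'' is skipped, ``patyta'a'' and ``mudysiclu''
    still count as containing a cluster, while a form such as
    ``ro'ire'o'' has none and so cannot be a brivla.</p>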

    <p>The ``y''-hyphen is also used to attach a 4-letter rafsi,
    formed by dropping the final vowel of a gismu, to the following
    rafsi. (This procedure was shown, but not explained, in <a
    href="#e6d1">Examples 6.1</a> to <a href="#e6d5">6.5</a>.)
    The lujvo forms ``zunlyjamfu'', ``zunlyjma'', ``zuljamfu'', and
    ``zuljma'' are all legitimate and equivalent forms made from
    the tanru ``zunle jamfu'' (``left foot''). Of these, ``zuljma''
    is the preferred one since it is the shortest; it thus is
    likely to be the form listed in a Lojban dictionary.</p>

    <p>The ``r''-hyphen and its close relative, the ``n''-hyphen,
    are used in lujvo only after CVV-form rafsi. A hyphen is always
    required in a two-part lujvo of the form CVV-CVV, since
    otherwise there would be no consonant cluster.</p>

    <p>An ``r''-hyphen or ``n''-hyphen is also required after the
    CVV-form rafsi of any lujvo of the form CVV-CVC/CV or CVV-CCVCV
    since it would otherwise fall apart into a CVV-form cmavo and a
    gismu. In any lujvo with more than two parts, a CVV-form rafsi
    in the initial position must always be followed by a hyphen. If
    the hyphen were omitted, the supposed lujvo could be broken
    into smaller words: the CVV-form rafsi would be interpreted as
    a cmavo, and the remainder of the word as a valid lujvo that is
    one rafsi shorter.</p>

    <p>An ``n''-hyphen is only used in place of an ``r''-hyphen
    when the following rafsi begins with ``r''. For example, the
    tanru ``rokci renro'' (``rock throw'') cannot be expressed as
    ``ro'ire'o'' (which breaks up into two cmavo), nor can it be
    ``ro'irre'o'' (which has an impermissible double consonant);
    the ``n''-hyphen is required, and the correct form of the
    hyphenated lujvo is ``ro'inre'o''. The same lujvo could also be
    expressed without hyphenation as ``rokre'o''.</p>
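    <p>The hyphenation rules of this section can be sketched as a
    small joining routine. This is a simplified illustration, not
    the full lujvo-making algorithm: the names ``join_rafsi'',
    ``shape'', and ``permissible_pair'' are invented here, the
    consonant-pair test is an assumption based on the general
    phonology rules (no doubled consonant, no voiced/unvoiced mix
    other than with ``l'', ``m'', ``n'', ``r'', no two of ``c'',
    ``j'', ``s'', ``z'', and none of ``cx'', ``kx'', ``xc'', ``xk'',
    ``mz''), and the case where a lujvo could fall apart into
    separate words after a CVC-form rafsi is not handled.</p>

```python
# Simplified sketch; names and the pair test are assumptions (see lead-in).
VOWELS = set("aeiou")
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")
SIBILANTS = set("cjsz")
FORBIDDEN = {"cx", "kx", "xc", "xk", "mz"}


def shape(rafsi):
    """Consonant/vowel pattern of a rafsi, ignoring apostrophes."""
    return "".join("V" if ch in VOWELS else "C" for ch in rafsi if ch != "'")


def permissible_pair(a, b):
    """Assumed test for a permissible consonant pair."""
    if a == b:
        return False
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False
    if a in SIBILANTS and b in SIBILANTS:
        return False
    return a + b not in FORBIDDEN


def join_rafsi(rafsi):
    """Join a list of rafsi, inserting y-, r-, or n-hyphens where needed."""
    if len(rafsi) in (0, 1):
        raise ValueError("a lujvo needs two or more rafsi")
    parts = []
    for i, r in enumerate(rafsi[:-1]):
        nxt = rafsi[i + 1]
        form = shape(r)
        if form in ("CVCC", "CCVC"):
            r += "y"  # a 4-letter rafsi always takes a y-hyphen
        elif form == "CVC" and not permissible_pair(r[-1], nxt[0]):
            r += "y"  # avoid an impermissible consonant pair
        elif form == "CVV" and i == 0:
            # Only a two-part CVV-CCV lujvo already has its consonant cluster.
            two_part_with_ccv = len(rafsi) == 2 and shape(nxt) == "CCV"
            if not two_part_with_ccv:
                r += "n" if nxt[0] == "r" else "r"
        parts.append(r)
    parts.append(rafsi[-1])
    return "".join(parts)


print(join_rafsi(["ro'i", "re'o"]))
print(join_rafsi(["zunl", "jamfu"]))
```

    <p>Given the rafsi used in this section, the sketch reproduces
    forms such as ``ro'inre'o'', ``rokre'o'', ``patyta'a'',
    ``zunlyjamfu'', and ``zuljma''.</p>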

    <p>There is also a different way of building lujvo, or rather
    phrases which are grammatically and semantically equivalent to
    lujvo. You can make a phrase containing any desired words,
    joining each pair of them with the special cmavo ``zei''.
    Thus,</p>
<pre>
<a id="e6d12" name="e6d12">6.12)</a>    bridi zei valsi
</pre>
    is the exact equivalent of ``brivla'' (but not necessarily the
    same as the underlying tanru ``bridi valsi'', which could have
    other meanings.) Using ``zei'' is the only way to get a cmavo
    lacking a rafsi, a cmene, or a fu'ivla into a lujvo: 

    <p></p>
<pre>
<a id="e6d13" name="e6d13">6.13)</a>    xy. zei kantu
    X ray
</pre>
<pre>
<a id="e6d14" name="e6d14">6.14)</a>    kulnr,farsi zei lolgai
    Farsi floor-cover
    Persian rug
</pre>
<pre>
<a id="e6d15" name="e6d15">6.15)</a>    na'e zei .a zei na'e zei by. livgyterbilma
    non-A, non-B liver-disease
    non-A, non-B hepatitis
</pre>
<pre>
<a id="e6d16" name="e6d16">6.16)</a>    .cerman. zei xarnykarce
    Sherman war-car
    Sherman tank
</pre>
    <a href="#e6d15">Example 6.15</a> is particularly noteworthy
    because the phrase that would be produced by removing the
    ``zei''s from it doesn't end with a brivla, and in fact is not
    even grammatical. As written, the example is a tanru with two
    components, but by adding a ``zei'' between ``by.'' and
    ``livgyterbilma'' to produce 
<pre>
<a id="e6d17" name="e6d17">6.17)</a>    na'e zei .a zei na'e zei by. zei livgyterbilma
    non-A-non-B-hepatitis
</pre>
    the whole phrase would become a single lujvo. The longer lujvo
    of <a href="#e6d17">Example 6.17</a> may be preferable, because
    its place structure can be built from that of ``bilma'',
    whereas the place structure of a lujvo without a brivla must be
    constructed ad hoc. 

    <p>Note that rafsi may not be used in ``zei'' phrases, because
    they are not words. CVV rafsi look like words (specifically
    cmavo) but there can be no confusion between the two uses of
    the same letters, because cmavo appear only as separate words
    or in compound cmavo (which are really just a notation for
    writing separate but closely related words as if they were
    one); rafsi appear only as parts of lujvo.</p>

    <h3><a id="s7" name="s7">7. fu'ivla</a></h3>

    <p>The use of tanru or lujvo is not always appropriate for very
    concrete or specific terms (e.g. ``brie'' or ``cobra''), or for
    jargon words specialized to a narrow field (e.g. ``quark'',
    ``integral'', or ``iambic pentameter''). These words are in
    effect names for concepts, and the names were invented by
    speakers of another language. The vast majority of words
    referring to plants, animals, foods, and scientific terminology
    cannot be easily expressed as tanru. They thus must be borrowed
    (actually ``copied'') into Lojban from the original
    language.</p>

    <p>There are four stages of borrowing in Lojban, as words
    become more and more modified (but shorter and easier to use).
    Stage 1 is the use of a foreign name quoted with the cmavo
    ``la'o'' (explained in full in <a href="chapter19.html">Chapter
    19</a>):</p>

    <p></p>
<pre>
<a id="e7d1" name="e7d1">7.1)</a>  me la'o ly. spaghetti .ly.
</pre>
    is a predicate with the place structure ``x1 is a quantity of
    spaghetti''. 

    <p>Stage 2 involves changing the foreign name to a Lojbanized
    name, as explained in <a href="#s8">Section 8</a>:</p>
<pre>
<a id="e7d2" name="e7d2">7.2)</a>  me la spagetis.
</pre>

    <p>One of these expedients is often quite sufficient when you
    need a word quickly in conversation. (This can make it easier
    to get by when you do not yet have full command of the Lojban
    vocabulary, provided you are talking to someone who will
    recognize the borrowing.)</p>

    <p>Where a little more universality is desired, the word to be
    borrowed must be Lojbanized into one of several permitted
    forms. A rafsi is then usually attached to the beginning of the
    Lojbanized form, using a hyphen to ensure that the resulting
    word doesn't fall apart.</p>

    <p>The rafsi categorizes or limits the meaning of the fu'ivla;
    otherwise a word having several different jargon meanings in
    other languages would require the word-inventor to choose which
    meaning should be assigned to the fu'ivla, since fu'ivla (like
    other brivla) are not permitted to have more than one
    definition. Such a Stage 3 borrowing is the most common kind of
    fu'ivla.</p>

    <p>Finally, Stage 4 fu'ivla do not have any rafsi classifier,
    and are used where a fu'ivla has become so common or so
    important that it must be made as short as possible. (See <a
    href="#s16">Section 16</a> for a proposal concerning Stage 4
    fu'ivla.)</p>

    <p>The form of a fu'ivla reliably distinguishes it from both
    the gismu and the cmavo. Like cultural gismu, fu'ivla are
    generally based on a word from a single non-Lojban language.
    The word is ``borrowed'' (actually ``copied'', hence the Lojban
    tanru ``fukpi valsi'') from the other language and Lojbanized
    --- the phonemes are converted to their closest Lojban
    equivalent and modifications are made as necessary to make the
    word a legitimate Lojban fu'ivla-form word. All fu'ivla:</p>

    <p></p>

    <dl>
      <dt>1)</dt>

      <dd>must contain a consonant cluster in the first five
      letters of the word; if this consonant cluster is at the
      beginning, it must either be a permissible initial consonant
      pair, or a longer cluster such that each pair of adjacent
      consonants in the cluster is a permissible initial consonant
      pair: ``spraile'' is acceptable, but not ``ktraile'' or
      ``trkaile'';</dd>

      <dt>2)</dt>

      <dd>must end in one or more vowels;</dd>
    </dl>

    <dl>
      <dt>3)</dt>

      <dd>must not be gismu or lujvo, or any combination of cmavo,
      gismu, and lujvo; furthermore, a fu'ivla with a CV cmavo
      joined to the front of it must not have the form of a lujvo
      (the so-called ``slinku'i test'');</dd>
    </dl>

    <dl>
      <dt>4)</dt>

      <dd>cannot contain ``y'', although they may contain syllabic
      pronunciations of Lojban consonants;</dd>
    </dl>

    <dl>
      <dt>5)</dt>

      <dd>like other brivla, are stressed on the penultimate
      syllable.</dd>
    </dl>
    Note that consonant triples or larger clusters that are not at
    the beginning of a fu'ivla can be quite flexible, as long as
    all consonant pairs are permissible. There is no need to
    restrict fu'ivla clusters to permissible initial pairs except
    at the beginning. 

    <p>This is a fairly liberal definition and allows quite a lot
    of possibilities within ``fu'ivla space''. Stage 3 fu'ivla can
    be made easily on the fly, as lujvo can, because the procedure
    for forming them always guarantees a word that cannot violate
    any of the rules. Stage 4 fu'ivla require running tests that
    are not simple to characterize or perform, and should be made
    only after deliberation and by someone knowledgeable about all
    the considerations that apply.</p>

    <p>Here is a simple and reliable procedure for making a
    non-Lojban word into a valid Stage 3 fu'ivla:</p>

    <dl>
      <dt>1)</dt>

      <dd>Eliminate all double consonants and silent letters.</dd>

      <dt>2)</dt>

      <dd>Convert all sounds to their closest Lojban equivalents.
      Lojban ``y'', however, may not be used in any fu'ivla.</dd>

      <dt>3)</dt>

      <dd>If the last letter is not a vowel, modify the ending so
      that the word ends in a vowel, either by removing a final
      consonant or by adding a suggestively chosen final
      vowel.</dd>

      <dt>4)</dt>

      <dd>If the first letter is not a consonant, modify the
      beginning so that the word begins with a consonant, either by
      removing an initial vowel or adding a suggestively chosen
      initial consonant.</dd>
    </dl>

    <dl>
      <dt>5)</dt>

      <dd>Prefix the result of steps 1-4 with a 4-letter rafsi that
      categorizes the fu'ivla into a ``topic area''. It is only
      safe to use a 4-letter rafsi; short rafsi sometimes produce
      invalid fu'ivla. Hyphenate the rafsi to the rest of the
      fu'ivla with an ``r''-hyphen; if that would produce a double
      ``r'', use an ``n''-hyphen instead; if the rafsi ends in
      ``r'' and the rest of the fu'ivla begins with ``n'' (or vice
      versa) use an ``l''-hyphen. (This is the only use of
      ``l''-hyphen in Lojban.)</dd>

      <dt></dt>

      <dd>Alternatively, if a CVC-form short rafsi is available it
      can be used instead of the long rafsi.</dd>

      <dt>6)</dt>

      <dd>Remember that the stress necessarily appears on the
      penultimate (next-to-the-last) syllable.</dd>
    </dl>
    In this section, the hyphen is set off with commas in the
    examples, but these commas are not required in writing, and the
    hyphen need not be pronounced as a separate syllable. 
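    <p>The prefixing step of this procedure, including the choice
    among the ``r''-, ``n''-, and ``l''-hyphens, can be sketched as
    follows. The names ``choose_hyphen'' and ``stage3_fuhivla'' are
    invented for illustration. The sketch assumes the borrowed word
    has already been Lojbanized by hand (sound conversion cannot be
    automated this simply), and it uses only the 4-letter rafsi,
    obtained by dropping the final vowel of the gismu.</p>

```python
# Illustrative sketch; function names are assumptions, not part of the grammar.
VOWELS = set("aeiou")


def choose_hyphen(before, after):
    """Pick the hyphen letter given the letters on either side of it."""
    neighbours = (before, after)
    if "r" not in neighbours:
        return "r"  # the default hyphen
    if "n" not in neighbours:
        return "n"  # an r-hyphen would produce a double r
    return "l"      # r on one side and n on the other


def stage3_fuhivla(gismu, borrowed, sep=""):
    """Prefix the 4-letter rafsi of a gismu to an already Lojbanized word."""
    if "y" in borrowed:
        raise ValueError("fu'ivla may not contain y")
    if borrowed[0] in VOWELS or borrowed[-1] not in VOWELS:
        raise ValueError("word must begin with a consonant and end in a vowel")
    rafsi = gismu[:-1]  # 4-letter rafsi: the gismu without its final vowel
    hyphen = choose_hyphen(rafsi[-1], borrowed[0])
    return sep.join([rafsi, hyphen, borrowed])


print(stage3_fuhivla("cidja", "spageti"))
print(stage3_fuhivla("cidja", "spageti", sep=","))
```

    <p>Passing ``sep=","'' sets the hyphen off with commas, as is
    done in the examples of this section.</p>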

    <p>Here are a few examples:</p>

    <p></p>
<pre>
<a id="e7d3" name="e7d3">7.3)</a>  spaghetti (from English or Italian)
    spageti (Lojbanize)
    cidj,r,spageti (prefix long rafsi)
    dja,r,spageti (prefix short rafsi)
</pre>
    where ``cidj-'' is the 4-letter rafsi for ``cidja'', the Lojban
    gismu for ``food'', thus categorizing ``cidjrspageti'' as a
    kind of food. The form with the short rafsi happens to work,
    but such good fortune cannot be relied on: in any event, it
    means the same thing. 

    <p></p>
<pre>
<a id="e7d4" name="e7d4">7.4)</a>  Acer (the scientific name of maple trees)
    acer (Lojbanize)
    xaceru (add initial consonant and final vowel)
    tric,r,xaceru (prefix rafsi)
    ric,r,xaceru (prefix short rafsi)
</pre>
    where ``tric-'' and ``ric-'' are rafsi for ``tricu'', the gismu
    for ``tree''. Note that by the same principles, ``maple sugar''
    could get the fu'ivla ``saktrxaceru'', or could be represented
    by the tanru ``tricrxaceru sakta''. Technically, ``ricrxaceru''
    and ``tricrxaceru'' are distinct fu'ivla, but they would surely
    be given the same meanings if both happened to be in use. 

    <p></p>
<pre>
<a id="e7d5" name="e7d5">7.5)</a>  brie (from French)
    bri (Lojbanize)
    cirl,r,bri (prefix rafsi)
</pre>
    where ``cirl-'' represents ``cirla'' (``cheese''). 

    <p></p>
<pre>
<a id="e7d6" name="e7d6">7.6)</a>  cobra
    kobra (Lojbanize)
    sinc,r,kobra (prefix rafsi)
</pre>
    where ``sinc-'' represents ``since'' (``snake''). 

    <p></p>
<pre>
<a id="e7d7" name="e7d7">7.7)</a>  quark
    kuark (Lojbanize)
    kuarka (add final vowel)
    sask,r,kuarka (prefix rafsi)
</pre>
    where ``sask-'' represents ``saske'' (``science''). Note the
    extra vowel ``a'' added to the end of the word, and the
    diphthong ``ua'', which never appears in gismu or lujvo, but
    may appear in fu'ivla. 

    <p>The use of the prefix helps distinguish among the many
    possible meanings of the borrowed word, depending on the field.
    As it happens, ``spageti'' and ``kuarka'' are valid Stage 4
    fu'ivla, but ``xaceru'' looks like a compound cmavo, and
    ``kobra'' like a gismu.</p>

    <p>For another example, ``integral'' has a specific meaning to
    a mathematician. But the Lojban fu'ivla ``integrale'', which is
    a valid Stage 4 fu'ivla, does not convey that mathematical
    sense to a non-mathematical listener, even one with an
    English-speaking background; its source --- the English word
    ``integral'' --- has various other specialized meanings in
    other fields.</p>

    <p>Left uncontrolled, ``integrale'' almost certainly would
    eventually come to mean the same collection of loosely related
    concepts that English associates with ``integral'', with only
    the context to indicate (possibly) that the mathematical term
    is meant.</p>

    <p>The prefix method would render the mathematical concept as
    ``cmacrntegrale'', if the ``i'' of ``integrale'' is removed, or
    something like ``cmacrnintegrale'', if a new consonant is added
    to the beginning; ``cmac-'' is the rafsi for ``cmaci''
    (``mathematics''). The architectural sense of ``integral''
    might be conveyed with ``dinjrnintegrale'' or
    ``tarmrnintegrale'', where ``dinju'' and ``tarmi'' mean
    ``building'' and ``form'' respectively.</p>

    <p>Here are some fu'ivla representing cultures and related
    things, shown with more than one rafsi prefix:</p>

    <p></p>
<pre>
<a id="e7d8" name="e7d8">7.8)</a>  bang,r,blgaria
    Bulgarian (in language)

<a id="e7d9" name="e7d9">7.9)</a>   kuln,r,blgaria
    Bulgarian (in culture)

<a id="e7d10" name="e7d10">7.10)</a> gugd,r,blgaria
    Bulgaria (the country)
</pre>
<pre>
<a id="e7d11" name="e7d11">7.11)</a>    bang,r,kore,a
    Korean (the language)

<a id="e7d12" name="e7d12">7.12)</a> kuln,r,kore,a
    Korean (the culture)
</pre>
    Note the commas in <a href="#e7d11">Examples 7.11</a> and <a
    href="#e7d12">7.12</a>, used because ``ea'' is not a valid
    diphthong in Lojban. Arguably, some form of the native name
    ``Chosen'' should have been used instead of the internationally
    known ``Korea''; this is a recurring problem in all borrowings.
    In general, it is better to use the native name unless using it
    will severely impede understanding: ``Navajo'' is far more
    widely known than ``Dine'e''. 

    <h3><a id="s8" name="s8">8. cmene</a></h3>

    <p>Lojbanized names, called ``cmene'', are very much like their
    counterparts in other languages. They are labels applied to
    things (or people) to stand for them in descriptions or in
    direct address. They may convey meaning in themselves, but do
    not necessarily do so.</p>

    <p>Because names are often highly personal and individual,
    Lojban attempts to allow native language names to be used with
    a minimum of modification. The requirement that the Lojban
    speech stream be unambiguously analyzable, however, means that
    most names must be modified somewhat when they are Lojbanized.
    Here are a few examples of English names and possible Lojban
    equivalents:</p>

    <p></p>
<pre>
<a id="e8d1" name="e8d1">8.1)</a>  djim.
    Jim
</pre>
<pre>
<a id="e8d2" name="e8d2">8.2)</a>  djein.
    Jane
</pre>
<pre>
<a id="e8d3" name="e8d3">8.3)</a>  .arnold.
    Arnold
</pre>
<pre>
<a id="e8d4" name="e8d4">8.4)</a>  pit.
    Pete
</pre>
<pre>
<a id="e8d5" name="e8d5">8.5)</a>  katrinas.
    Katrina
</pre>
<pre>
<a id="e8d6" name="e8d6">8.6)</a>  kat,r,in.
    Catherine
</pre>
    (Note that syllabic ``r'' is skipped in determining the
    stressed syllable, so <a href="#e8d6">Example 8.6</a> is
    stressed on the ``ka''.) 

    <p></p>
<pre>
<a id="e8d7" name="e8d7">8.7)</a>  katis.
    Cathy
</pre>
<pre>
<a id="e8d8" name="e8d8">8.8)</a>  keit.
    Kate
</pre>
    Names may have almost any form, but always end in a consonant,
    and are followed by a pause. They are penultimately stressed,
    unless unusual stress is marked with capitalization. A name may
    have multiple parts, each ending with a consonant and pause, or
    the parts may be combined into a single word with no pause. For
    example, 

    <p></p>
<pre>
<a id="e8d9" name="e8d9">8.9)</a>  djan.  djonz.
</pre>
    and 
<pre>
<a id="e8d10" name="e8d10">8.10)</a>    djandjonz.
</pre>
    are both valid Lojbanizations of ``John Jones''. 

    <p>The final arbiter of the correct form of a name is the
    person doing the naming, although most cultures grant people
    the right to determine how they want their own name to be
    spelled and pronounced. The English name ``Mary'' can thus be
    Lojbanized as ``meris.'', ``maris.'', ``meiris.'', ``merix.'',
    or even ``marys.''. The last alternative is not pronounced much
    like its English equivalent, but may be desirable to someone
    who values spelling over pronunciation. The final consonant
    need not be an ``s''; there must, however, be some Lojban
    consonant at the end.</p>

    <p>Names are not permitted to have the sequences ``la'',
    ``lai'', or ``doi'' embedded in them, unless the sequence is
    immediately preceded by a consonant. These minor restrictions
    are due to the fact that all Lojban cmene embedded in a speech
    stream will be preceded by one of these words or by a pause.
    With one of these words embedded, the cmene might break up into
    valid Lojban words followed by a shorter cmene. However,
    break-up cannot happen after a consonant, because that would
    imply that the word before the ``la'', or whatever, ended in a
    consonant without pause, which is impossible.</p>

    <p>For example, the invalid name ``laplas.'' would look like
    the Lojban words ``la plas.'', and ``ilanas.'' would be
    misunderstood as ``.i la nas.''. However, ``nederlants.''
    cannot be misheard as ``neder lants.'', because ``neder'' with
    no following pause is not a possible Lojban word.</p>

    <p>There are close alternatives to these forbidden sequences
    that can be used in Lojbanizing names, such as ``ly'', ``lei'',
    and ``dai'' or ``do'i'', that do not cause these problems.</p>
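    <p>The restriction on embedded ``la'', ``lai'', and ``doi'' can
    be checked with a short scan. The sketch below is an
    illustration only; the name ``forbidden_sequences'' is invented
    here, and the function assumes that periods and commas are mere
    punctuation that can be stripped before checking. It reports
    each forbidden sequence that occurs at the start of the name or
    after anything other than a consonant.</p>

```python
# Illustrative sketch; the function name is an assumption.
CONSONANTS = set("bcdfgjklmnprstvxz")
FORBIDDEN = ("la", "lai", "doi")


def forbidden_sequences(name):
    """Return (sequence, position) pairs that make a cmene invalid."""
    # Assumption: periods and commas are stripped; capitals only mark stress.
    text = name.lower().replace(".", "").replace(",", "")
    found = []
    for seq in FORBIDDEN:
        start = text.find(seq)
        while start != -1:
            # Harmless only when immediately preceded by a consonant.
            if start == 0 or text[start - 1] not in CONSONANTS:
                found.append((seq, start))
            start = text.find(seq, start + 1)
    return found


for name in ["laplas.", "ilanas.", "nederlants."]:
    print(name, forbidden_sequences(name))
```

    <p>An empty list means that no forbidden sequence was found, as
    is the case for ``nederlants.''.</p>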

    <p>Lojban cmene are identifiable as word forms by the following
    characteristics:</p>

    <p></p>

    <dl>
      <dt>1)</dt>

      <dd>They must end in one or more consonants. There are no
      rules about how many consonants may appear in a cluster in
      cmene, provided that each consonant pair (whether standing by
      itself, or as part of a larger cluster) is a permissible
      pair.</dd>
    </dl>

    <dl>
      <dt>2)</dt>

      <dd>They may contain the letter ``y'' as a normal,
      non-hyphenating vowel. They are the only kind of Lojban word
      that may contain the two diphthongs ``iy'' and ``uy''.</dd>
    </dl>

    <dl>
      <dt>3)</dt>

      <dd>They are always followed in speech by a pause after the
      final consonant, written as ``.''.</dd>
    </dl>

    <dl>
      <dt>4)</dt>

      <dd>They may be stressed on any syllable; if this syllable is
      not the penultimate one, it must be capitalized when writing.
      Neither names nor words that begin sentences are capitalized
      in Lojban, so this is the only use of capital letters.</dd>
    </dl>
    Names meeting these criteria may be invented, Lojbanized from
    names in other languages, or formed by appending a consonant
    onto a cmavo, a gismu, a fu'ivla or a lujvo. Some cmene built
    from Lojban words are: 

    <p></p>
<pre>
<a id="e8d11" name="e8d11">8.11)</a>    pav.
    the One
    from the cmavo ``pa'', with rafsi ``pav'', meaning ``one''
</pre>
<pre>
<a id="e8d12" name="e8d12">8.12)</a>    sol.
    the Sun
    from the gismu ``solri'', meaning ``solar'', or actually
        ``pertaining to the Sun''
</pre>
<pre>
<a id="e8d13" name="e8d13">8.13)</a>    ralj.
    Chief (as a title)
    from the gismu ``ralju'', meaning ``principal''.
</pre>
<pre>
<a id="e8d14" name="e8d14">8.14)</a>    nol.
    Lord/Lady
    from the gismu ``nobli'', with rafsi ``nol'',
        meaning ``noble''.
</pre>
    To Lojbanize a name from the various natural languages, apply
    the following rules: 

    <dl>
      <dt>1)</dt>

      <dd>Eliminate double consonants and silent letters.</dd>

      <dt>2)</dt>

      <dd>Add a final ``s'' or ``n'' (or some other consonant that
      sounds good) if the name ends in a vowel.</dd>

      <dt>3)</dt>

      <dd>Convert all sounds to their closest Lojban
      equivalents.</dd>

      <dt>4)</dt>

      <dd>If possible and acceptable, shift the stress to the
      penultimate (next-to-the-last) syllable. Use commas and
      capitalization in written Lojban when it is necessary to
      preserve non-standard syllabication or stress. Do not
      capitalize names otherwise.</dd>
    </dl>

    <dl>
      <dt>5)</dt>

      <dd>If the name contains an impermissible consonant pair,
      insert a vowel between the consonants: ``y'' is
      recommended.</dd>
    </dl>

    <dl>
      <dt>6)</dt>

      <dd>No cmene may have the syllables ``la'', ``lai'', or
      ``doi'' in them, unless immediately preceded by a consonant.
      If these combinations are present, they must be converted to
      something else. Possible substitutions include ``ly'',
      ``ly'i'', and ``dai'' or ``do'i'', respectively.</dd>
    </dl>
    There are some additional rules for Lojbanizing the scientific
    names (technically known as ``Linnaean binomials'' after their
    inventor) which are internationally applied to each species of
    animal or plant. Where precision is essential, these names need
    not be Lojbanized, but can be directly inserted into Lojban
    text using the cmavo ``la'o'', explained in <a
    href="chapter19.html">Chapter 19</a>. Using this cmavo makes the
    already lengthy Latinized names at least four syllables longer,
    however, and leaves the pronunciation in doubt. The following
    suggestions, though incomplete, will assist in converting
    Linnaean binomials to valid Lojban names. They can also help to
    create fu'ivla based on Linnaean binomials or other words of
    the international scientific vocabulary. The term ``back
    vowel'' in the following list refers to any of the letters
    ``a'', ``o'', or ``u''; the term ``front vowel''
    correspondingly refers to any of the letters ``e'', ``i'', or
    ``y''. 

    <dl>
      <dt>1)</dt>

      <dd>Change double consonants other than ``cc'' to single
      consonants.</dd>

      <dt>2)</dt>

      <dd>Change ``cc'' before a front vowel to ``kc'', but
      otherwise to ``k''.</dd>

      <dt>3)</dt>

      <dd>Change ``c'' before a back vowel and final ``c'' to
      ``k''.</dd>

      <dt>4)</dt>

      <dd>Change ``ng'' before a consonant (other than ``h'') and
      final ``ng'' to ``n''.</dd>

      <dt>5)</dt>

      <dd>Change ``x'' to ``z'' initially, but otherwise to
      ``ks''.</dd>

      <dt>6)</dt>

      <dd>Change ``pn'' to ``n'' initially.</dd>

      <dt>7)</dt>

      <dd>Change final ``ie'' and ``ii'' to ``i''.</dd>

      <dt>8)</dt>

      <dd>Make the following idiosyncratic substitutions:</dd>

      <dt></dt>

      <dd>
<pre>
    aa    a        ew    u        ph    f
    ae    e        igh   ai       q     k
    ch    k        oo    u        sc    sk
    ee    i        ou    u        w     u
    eigh  ei       ow    au       y     i
</pre>
      </dd>

      <dt></dt>

      <dd>However, the diphthong substitutions should not be done
      if the two vowels are in two different syllables.</dd>

      <dt>9)</dt>

      <dd>Change ``h'' between two vowels to ``''', but otherwise
      remove it completely. If preservation of the ``h'' seems
      essential, change it to ``x'' instead.</dd>

      <dt>10)</dt>

      <dd>Place ``''' between any remaining vowel pairs that do not
      form Lojban diphthongs.</dd>

      <dt></dt>

      <dd>Some further examples of Lojbanized names are:</dd>
    </dl>
<pre>
    English ``Mary''            meris.
                                or meiris.
    English ``Smith''           smit.
    English ``Jones''           djonz.
    English ``John''            djan. or jan. (American)
                                or djon. or jon. (British)
    English ``Alice''           .alis.
    English ``Elise''           .eLIS.
    English ``Johnson''         djansn.
    English ``William''         .uiliam.
                                or .uil,iam.
    English ``Brown''           braun.
    English ``Charles''         tcarlz.
    French ``Charles''          carl.
    French ``De Gaulle''        dyGOL.
    German ``Heinrich''         xainrix.
    Spanish ``Joaquin''         xuaKIN.
    Russian ``Svetlana''        sfietlanys.
    Russian ``Khrushchev''      xrucTCOF.
    Hindi ``Krishna''           kricnas.
    Polish ``Lech Walesa''      lex. va,uensas.
    Spanish ``Don Quixote''     don. kicotes.
        or modern Spanish:      don. kixotes.
        or Mexican dialect:     don. ki'otes.
    Chinese ``Mao Zedong''      maudzydyn.
    Japanese ``Fujiko''         fudjikos.
                                or fujikos.
</pre>

    <h3><a id="s9" name="s9">9. Rules for inserting pauses</a></h3>

    <p>Summarized in one place, here are the rules for inserting
    pauses between Lojban words:</p>


    <dl>
      <dt>1)</dt>

      <dd>Any two words may have a pause between them; it is always
      illegal to pause in the middle of a word, because that breaks
      up the word into two words.</dd>
    </dl>

    <dl>
      <dt>2)</dt>

      <dd>Every word ending in a consonant must be followed by a
      pause. Necessarily, all such words are cmene.</dd>
    </dl>

    <dl>
      <dt>3)</dt>

      <dd>Every word beginning with a vowel must be preceded by a
      pause. Such words are either cmavo, fu'ivla, or cmene; all
      gismu and lujvo begin with consonants.</dd>
    </dl>

    <dl>
      <dt>4)</dt>

      <dd>Every cmene must be preceded by a pause, unless the
      immediately preceding word is one of the cmavo ``la'',
      ``lai'', ``la'i'', or ``doi'' (which is why those strings are
      forbidden in cmene). However, the situation triggering this
      rule rarely occurs.</dd>
    </dl>

    <dl>
      <dt>5)</dt>

      <dd>If the last syllable of a word bears the stress, and a
      brivla follows, the two must be separated by a pause, to
      prevent confusion with the primary stress of the brivla. In
      this case, the first word must be either a cmavo or a cmene
      with unusual stress (which already ends with a pause, of
      course).</dd>
    </dl>

    <dl>
      <dt>6)</dt>

      <dd>A cmavo of the form ``Cy'' must be followed by a pause
      unless another ``Cy''-form cmavo follows.</dd>
    </dl>

    <dl>
      <dt>7)</dt>

      <dd>When non-Lojban text is embedded in Lojban, it must be
      preceded and followed by pauses. (How to embed non-Lojban
      text is explained in <a href="chapter19.html">Chapter
      19</a>.)</dd>
    </dl>
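
    <p>As an illustration only, rules 2, 3, 4, and 6 above can be
    sketched as a small check on two adjacent words. The function
    and constant names are hypothetical, the words are assumed to
    be written without pause dots, and any word ending in a
    consonant is taken to be a cmene (as rule 2 implies). Rule 5
    needs stress information and rule 7 needs to know which text is
    non-Lojban, so neither is modelled here.</p>

```python
VOWELS = set("aeiouy")
CONSONANTS = set("bcdfgjklmnprstvxz")
# The cmavo that may directly precede a cmene without a pause (rule 4).
NAME_MARKERS = {"la", "lai", "la'i", "doi"}


def is_cmene(word):
    """Rule 2 implies that only cmene end in a consonant."""
    return word[-1] in CONSONANTS


def is_cy_cmavo(word):
    """True for two-letter cmavo of the form Cy, such as 'by'."""
    return len(word) == 2 and word[0] in CONSONANTS and word[1] == "y"


def pause_required(prev, nxt):
    """Return True if rule 2, 3, 4, or 6 forces a pause between prev and nxt."""
    if prev[-1] in CONSONANTS:  # rule 2: word ending in a consonant
        return True
    if nxt[0] in VOWELS:  # rule 3: word beginning with a vowel
        return True
    if is_cmene(nxt) and prev not in NAME_MARKERS:  # rule 4
        return True
    if is_cy_cmavo(prev) and not is_cy_cmavo(nxt):  # rule 6
        return True
    return False


print(pause_required("la", "djan"))   # no pause needed after "la"
print(pause_required("mi", "djan"))   # a cmene after an ordinary cmavo
```

    <p>A return value of False means only that these four rules do
    not demand a pause; by rule 1, a pause between the two words is
    still permitted.</p>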

    <h3><a id="s10" name="s10">10. Considerations for making lujvo</a></h3>

    <p>Given a tanru which expresses an idea to be used frequently,
    it can be turned into a lujvo by following the lujvo-making
    algorithm which is given in <a href="#s11">Section 11</a>.</p>

    <p>In building a lujvo, the first step is to replace each gismu
    with a rafsi that uniquely represents that gismu. These rafsi
    are then attached together by fixed rules that allow the
    resulting compound to be recognized as a single word and to be
    analyzed in only one way.</p>

    <p>There are three other complications; only one is
    serious.</p>

    <p>The first is that there is usually more than one rafsi that
    can be used for each gismu. The one to be used is simply
    whichever one sounds or looks best to the speaker or writer.
    There are usually many valid combinations of possible rafsi.
    They all are equally valid, and all of them mean exactly the
    same thing. (The scoring algorithm given in <a
    href="#s12">Section 12</a> is used to choose the standard form
    of the lujvo --- the version which would be entered into a
    dictionary.)</p>

    <p>The second complication is the serious one. Remember that a
    tanru is ambiguous --- it has several possible meanings. A
    lujvo, or at least one that would be put into the dictionary,
    has just a single meaning. Like a gismu, a lujvo is a predicate
    which encompasses one area of the semantic universe, with one
    set of places. Hopefully the meaning chosen is the most useful
    of the possible semantic spaces. A possible source of
    linguistic drift in Lojban is that as Lojbanic society evolves,
    the concept that seems the most useful one may change.</p>

    <p>You must also be aware of the possibility of some prior
    meaning of a new lujvo, especially if you are writing for
    posterity. If a lujvo is invented which involves the same tanru
    as one that is in the dictionary, and is assigned a different
    meaning (or even just a different place structure), linguistic
    drift results. This isn't necessarily bad. Every natural
    language does it. But in communication, when you use a meaning
    different from the dictionary definition, someone else may use
    the dictionary and therefore misunderstand you. You can use the
    cmavo ``za'e'' (explained in <a href="chapter19.html">Chapter
    19</a>) before a newly coined lujvo to indicate that it may
    have a non-dictionary meaning.</p>

    <p>The essential nature of human communication is that if the
    listener understands, then all is well. Let this be the
    ultimate guideline for choosing meanings and place structures
    for invented lujvo.</p>

    <p>The third complication is also simple, but tends to scare
    new Lojbanists with its implications. It is based on Zipf's
    Law, which says that the length of words is inversely
    proportional to their usage. The shortest words are those which
    are used most; the longest ones are used least. Conversely,
    commonly used concepts will tend to be abbreviated. In
    English, we have abbreviations, acronyms, and jargon, all of
    which represent complex ideas that are used often by small
    groups of people, who shorten them to convey more
    information more rapidly.</p>

    <p>Therefore, given a complicated tanru with grouping markers,
    abstraction markers, and other cmavo in it to make it
    syntactically unambiguous, the psychological basis of Zipf's
    Law may compel the lujvo-maker to drop some of the cmavo to
    make a shorter (technically incorrect) tanru, and then use that
    tanru to make the lujvo.</p>

    <p>This doesn't lead to ambiguity, as it might seem to. A given
    lujvo still has exactly one meaning and place structure. It is
    just that more than one tanru is competing for the same lujvo.
    But more than one meaning for the tanru was already competing
    for the ``right'' to define the meaning of the lujvo. Someone
    has to use judgment in deciding which one meaning is to be
    chosen over the others.</p>

    <p>If the lujvo made by a shorter form of tanru is in use, or
    is likely to be useful for another meaning, the decider then
    retains one or more of the cmavo, preferably ones that set this
    meaning apart from the shorter form meaning that is used or
    anticipated. As a rule, therefore, the shorter lujvo will be
    used for a more general concept, possibly even instead of a
    more frequent word. If both words are needed, the simpler one
    should be shorter. It is easier to add a cmavo to clarify the
    meaning of the more complex term than it is to find a good
    alternate tanru for the simpler term.</p>

    <p>And of course, we have to consider the listener. On hearing
    an unknown word, the listener will decompose it and get a tanru
    that makes no sense or the wrong sense for the context. If the
    listener realizes that the grouping operators may have been
    dropped out, he or she may try alternate groupings, or try
    inserting an abstraction operator if that seems plausible. (The
    grouping of tanru is explained in <a href="chapter5.html">Chapter
    5</a>; abstraction is explained in <a
    href="chapter11.html">Chapter 11</a>.) Plausibility is the key to
    learning new ideas and to evaluating unfamiliar lujvo.</p>

    <h3><a id="s11" name="s11">11. The lujvo-making algorithm</a></h3>

    <p>The following is the current algorithm for generating Lojban
    lujvo given a known tanru and a complete list of gismu and
    their assigned rafsi. The algorithm was designed by Bob
    LeChevalier and Dr. James Cooke Brown for computer program
    implementation. It was modified in 1989 with the assistance of
    Nora LeChevalier, who detected a flaw in the original
    ``tosmabru test''.</p>

    <p>Given a tanru that is to be made into a lujvo:</p>

    <dl>
      <dt>1)</dt>

      <dd>Choose a 3-letter or 4-letter rafsi for each of the gismu
      and cmavo in the tanru except the last.</dd>

      <dt>2)</dt>

      <dd>Choose a 3-letter (CVV-form or CCV-form) or 5-letter
      rafsi for the final gismu in the tanru.</dd>

      <dt>3)</dt>

      <dd>Join the resulting string of rafsi, initially without
      hyphens.</dd>
    </dl>

    <dl>
      <dt>4)</dt>

      <dd>Add hyphen letters where necessary. It is illegal to add
      a hyphen at a place that is not required by this algorithm.
      Right-to-left tests are recommended, for reasons discussed
      below.</dd>

      <dt>4a)</dt>

      <dd>If there are more than two words in the tanru, put an
      ``r''-hyphen (or an ``n''-hyphen) after the first rafsi if it
      is CVV-form. If there are exactly two words, then put an
      ``r''-hyphen (or an ``n''-hyphen) between the two rafsi if
      the first rafsi is CVV-form, unless the second rafsi is
      CCV-form (for example, ``saicli'' requires no hyphen). Use an
      ``r''-hyphen unless the letter after the hyphen is ``r'', in
      which case use an ``n''-hyphen. Never use an ``n''-hyphen
      unless it is required.</dd>

      <dt>4b)</dt>

      <dd>Put a ``y''-hyphen between the consonants of any
      impermissible consonant pair. This will always appear between
      rafsi.</dd>

      <dt>4c)</dt>

      <dd>Put a ``y''-hyphen after any 4-letter rafsi form.</dd>
    </dl>

    <dl>
      <dt>5)</dt>

      <dd>Test all forms with one or more initial CVC-form rafsi
      --- with the pattern ``CVC ... CVC + X'' --- for ``tosmabru
      failure''. X must either be a CVCCV long rafsi that happens
      to have a permissible initial pair as the consonant cluster,
      or be something which has caused a ``y''-hyphen to be
      installed between the previous CVC and itself by one of the
      above rules. The test is as follows:</dd>

      <dt>5a)</dt>

      <dd>Examine all the C/C consonant pairs that join the CVC
      rafsi, and also the pair between the last CVC and the X
      portion, ignoring any ``y''-hyphen before the X. These
      consonant pairs are called ``joints''.</dd>

      <dt>5b)</dt>

      <dd>If all of those joints are permissible initials, then the
      trial word will break up into a cmavo and a shorter brivla.
      If not, the word will not break up, and no further hyphens
      are needed.</dd>

      <dt>5c)</dt>

      <dd>Install a ``y''-hyphen at the first such joint.</dd>
    </dl>

    <p>Note that the ``tosmabru test'' implies that the algorithm
    will be more efficient if rafsi junctures are tested for
    required hyphens from right to left, instead of from left to
    right; when the test is required, it cannot be completed until
    hyphenation to the right has been determined.</p>
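
    <p>The hyphenation part of the algorithm (steps 3 and 4a to
    4c) can be sketched as follows. This is a partial sketch, not a
    full implementation: the ``tosmabru test'' of step 5 is
    deliberately left out, and the function names are hypothetical.
    The test for an impermissible consonant pair restates the rules
    given earlier in this chapter as we understand them (doubled
    consonants, mixed voicing other than with ``l'', ``m'', ``n'',
    ``r'', two letters from ``c j s z'', and the pairs ``cx'',
    ``kx'', ``xc'', ``xk'', ``mz''); that restatement is an
    assumption of the sketch.</p>

```python
VOWELS = "aeiou"
VOICED = set("bdgvjz")
UNVOICED = set("ptkfcsx")
SIBILANTS = set("cjsz")
FORBIDDEN_PAIRS = {"cx", "kx", "xc", "xk", "mz"}


def shape(rafsi):
    """C/V skeleton of a rafsi, ignoring apostrophes (so ge'u is CVV)."""
    return "".join("V" if ch in VOWELS else "C" for ch in rafsi if ch != "'")


def impermissible(a, b):
    """Assumed restatement of the chapter's impermissible-pair rules."""
    if a == b:
        return True
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return True
    if a in SIBILANTS and b in SIBILANTS:
        return True
    return a + b in FORBIDDEN_PAIRS


def join_rafsi(rafsi):
    """Join rafsi with the hyphens of steps 4a-4c (no tosmabru test)."""
    out = []
    for i, current in enumerate(rafsi):
        out.append(current)
        if i == len(rafsi) - 1:
            break
        following = rafsi[i + 1]
        if len(shape(current)) == 4:
            out.append("y")  # step 4c: after any 4-letter rafsi
        elif (i == 0 and shape(current) == "CVV"
              and (len(rafsi) > 2 or shape(following) != "CCV")):
            # step 4a: r-hyphen, or n-hyphen when "r" follows
            out.append("n" if following[0] == "r" else "r")
        elif shape(current)[-1] == "C" and impermissible(current[-1], following[0]):
            out.append("y")  # step 4b: split an impermissible pair
    return "".join(out)


print(join_rafsi(["ge'u", "zdani"]))
print(join_rafsi(["nak", "kem", "cin", "ctu"]))
```

    <p>Because step 5 is omitted, a result from this sketch should
    still be checked for ``tosmabru failure'' before it is treated
    as a valid lujvo.</p>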

    <h3><a id="s12" name="s12">12. The lujvo scoring algorithm</a></h3>

    <p>This algorithm was devised by Bob and Nora LeChevalier in
    1989. It is not the only possible algorithm, but it usually
    gives a choice that people find preferable. The algorithm may
    be changed in the future. The lowest-scoring variant will
    usually be the dictionary form of the lujvo. (In previous
    versions, it was the highest-scoring variant.)</p>

    <dl>
      <dt>1)</dt>

      <dd>Count the total number of letters, including hyphens and
      apostrophes; call it ``L''.</dd>

      <dt>2)</dt>

      <dd>Count the number of apostrophes; call it ``A''.</dd>

      <dt>3)</dt>

      <dd>Count the number of ``y''-, ``r''-, and ``n''-hyphens;
      call it ``H''.</dd>

      <dt>4)</dt>

      <dd>For each rafsi, find the value in the following table.
      Sum this value over all rafsi; call it ``R'':</dd>

      <dt></dt>

      <dd>
<pre>
    CVC/CV (final)            (-sarji)    1
    CVC/C                     (-sarj-)    2
    CCVCV (final)             (-zbasu)    3
    CCVC                      (-zbas-)    4
    CVC                       (-nun-)     5
    CVV with an apostrophe    (-ta'u-)    6
    CCV                       (-zba-)     7
    CVV with no apostrophe    (-sai-)     8
</pre>
      </dd>

      <dt>5)</dt>

      <dd>Count the number of vowels, not including ``y''; call it
      ``V''.</dd>
    </dl>

    <p>The score is then:</p>

    <dl>
      <dt></dt>

      <dd>(1000 * L) - (500 * A) + (100 * H) - (10 * R) - V</dd>
    </dl>
    <p>In case of ties, there is no preference. This should be rare.
    Note that the algorithm essentially encodes a hierarchy of
    priorities: short words are preferred (counting apostrophes as
    half a letter), then words with fewer hyphens, words with more
    pleasing rafsi (this judgment is subjective), and finally words
    with more vowels are chosen. Each decision principle is applied
    in turn if the ones before it have failed to choose; it is
    possible that a lower-ranked principle might dominate a
    higher-ranked one if it is ten times better than the
    alternative.</p>

    <p>Here are some lujvo with their scores (not necessarily the
    lowest scoring forms for these lujvo, nor even necessarily
    sensible lujvo):</p>
<pre>
<a id="e12d1" name="e12d1">12.1)</a>    zbasai
    zba + sai
    (1000 * 6) - (500 * 0)
        + (100 * 0) - (10 * 15) - 3 = 5847

<a id="e12d2" name="e12d2">12.2)</a>    nunynau
    nun + y + nau
    (1000 * 7) - (500 * 0)
        + (100 * 1) - (10 * 13) - 3 = 6967

<a id="e12d3" name="e12d3">12.3)</a>    sairzbata'u
    sai + r + zba + ta'u
    (1000 * 11) - (500 * 1)
        + (100 * 1) - (10 * 21) - 5 = 10385

<a id="e12d4" name="e12d4">12.4)</a>    zbazbasysarji
    zba + zbas + y + sarji
    (1000 * 13) - (500 * 0)
        + (100 * 1) - (10 * 12) - 4 = 12976
</pre>
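
    <p>The scoring steps can be expressed compactly in code. The
    following is a minimal sketch, assuming the caller supplies both
    the finished lujvo and the list of rafsi it was built from; the
    names <code>lujvo_score</code>, <code>shape</code>, and
    <code>R_VALUES</code> are our own, not part of any standard
    tool. The number of hyphens is taken to be the number of letters
    in the lujvo that do not belong to any rafsi.</p>

```python
# R values from the table in step 4, keyed by rafsi shape.
R_VALUES = {
    "CVCCV": 1, "CVCC": 2, "CCVCV": 3, "CCVC": 4,
    "CVC": 5, "CV'V": 6, "CCV": 7, "CVV": 8,
}


def shape(rafsi):
    """C/V skeleton of a rafsi, keeping any apostrophe in place."""
    return "".join(
        ch if ch == "'" else ("V" if ch in "aeiou" else "C") for ch in rafsi
    )


def lujvo_score(lujvo, rafsi):
    """Score a lujvo; the lowest-scoring variant is preferred."""
    L = len(lujvo)                         # letters, hyphens, apostrophes
    A = lujvo.count("'")                   # apostrophes
    H = L - sum(len(r) for r in rafsi)     # y-, r-, and n-hyphens
    R = sum(R_VALUES[shape(r)] for r in rafsi)
    V = sum(lujvo.count(v) for v in "aeiou")  # vowels, not counting y
    return 1000 * L - 500 * A + 100 * H - 10 * R - V


print(lujvo_score("zbasai", ["zba", "sai"]))
print(lujvo_score("sairzbata'u", ["sai", "zba", "ta'u"]))
```

    <p>Applied to Examples 12.1 through 12.4, this sketch yields the
    scores shown above.</p>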

    <h3><a id="s13" name="s13">13. lujvo-making examples</a></h3>

    <p>This section contains examples of making and scoring lujvo.
    First, we will start with the tanru ``gerku zdani'' (``dog
    house'') and construct a lujvo meaning ``doghouse'', that is, a
    house where a dog lives. We will use a brute-force application
    of the algorithm in <a href="#s11">Section 11</a>, using every
    possible rafsi.</p>

    <p>The rafsi for ``gerku'' are:</p>

    <dl>
      <dt></dt>

      <dd>-ger-, -ge'u-, -gerk-, -gerku</dd>
    </dl>

    <p>The rafsi for ``zdani'' are:</p>

    <dl>
      <dt></dt>

      <dd>-zda-, -zdan-, -zdani.</dd>
    </dl>

    <p>Step 1 of the algorithm directs us to use ``-ger-'',
    ``-ge'u-'' and ``-gerk-'' as possible rafsi for ``gerku''; Step
    2 directs us to use ``-zda-'' and ``-zdani'' as possible rafsi
    for ``zdani''. The six possible forms of the lujvo are
    then:</p>

    <dl>
      <dt></dt>

      <dd>ger-zda ger-zdani ge'u-zda ge'u-zdani gerk-zda
      gerk-zdani</dd>
    </dl>

    <p>We must then insert appropriate hyphens in each case. The
    first two forms need no hyphenation: ``ge'' cannot fall off the
    front, because the following word would begin with ``rz'',
    which is not a permissible initial consonant pair. So the lujvo
    forms are ``gerzda'' and ``gerzdani''.</p>

    <p>The third form, ``ge'u-zda'', needs no hyphen, because even
    though the first rafsi is CVV, the second one is CCV, so there
    is a consonant cluster in the first five letters. So
    ``ge'uzda'' is this form of the lujvo.</p>

    <p>The fourth form, ``ge'u-zdani'', however, requires an
    ``r''-hyphen; otherwise, the ``ge'u-'' part would fall off as a
    cmavo. So this form of the lujvo is ``ge'urzdani''.</p>

    <p>The last two forms require ``y''-hyphens, as all 4-letter
    rafsi do, and so are ``gerkyzda'' and ``gerkyzdani''
    respectively.</p>

    <p>The scoring algorithm is heavily weighted in favor of short
    lujvo, so we might expect that ``gerzda'' would win. Its L
    score is 6, its A score is 0, its H score is 0, its R score is
    12, and its V score is 2, for a final score of 5878. The other
    forms have scores of 7917, 6367, 9506, 8008, and 10047
    respectively. Consequently, this lujvo would probably appear in
    the dictionary in the form ``gerzda''.</p>

    <p>For the next example, we will use the tanru ``bloti klesi''
    (``boat class'') presumably referring to the category (rowboat,
    motorboat, cruise liner) into which a boat falls. We will omit
    the long rafsi from the process, since lujvo containing long
    rafsi are almost never preferred by the scoring algorithm when
    there are short rafsi available.</p>

    <p>The rafsi for ``bloti'' are ``-lot-'', ``-blo-'', and
    ``-lo'i-''; for ``klesi'' they are ``-kle-'' and ``-lei-''.
    Both these gismu are among the handful which have both CVV-form
    and CCV-form rafsi, so there is an unusual number of
    possibilities available for a two-part tanru:</p>
<pre>
    lotkle      blokle      lo'ikle
    lotlei      blolei      lo'irlei
</pre>

    <p>Only ``lo'irlei'' requires hyphenation (to avoid confusion
    with the cmavo sequence ``lo'i lei''). All six forms are valid
    versions of the lujvo, as are the six further forms using long
    rafsi; however, the scoring algorithm produces the following
    results:</p>
<pre>
    lotkle  5878    blokle  5858    lo'ikle   6367
    lotlei  5867    blolei  5847    lo'irlei  7456
</pre>

    <p>So the form ``blolei'' is preferred, but only by a tiny
    margin over ``blokle''; the next two forms are only slightly
    worse; ``lo'ikle'' suffers because of its apostrophe, and
    ``lo'irlei'' because of having both apostrophe and hyphen.</p>
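
    <p>The brute-force procedure for a two-part tanru such as
    ``bloti klesi'' can be sketched as below. The sketch is
    self-contained and all of its names are hypothetical. It handles
    only the ``r''-hyphen (or ``n''-hyphen) of step 4a, which is all
    that these short rafsi require; it is not a general
    lujvo-maker.</p>

```python
from itertools import product

R_VALUES = {
    "CVCCV": 1, "CVCC": 2, "CCVCV": 3, "CCVC": 4,
    "CVC": 5, "CV'V": 6, "CCV": 7, "CVV": 8,
}


def shape(rafsi):
    """C/V skeleton of a rafsi, keeping any apostrophe in place."""
    return "".join(
        ch if ch == "'" else ("V" if ch in "aeiou" else "C") for ch in rafsi
    )


def join_two(first, second):
    """Step 4a for a two-part lujvo; assumes no y-hyphen is needed."""
    if shape(first) in ("CVV", "CV'V") and shape(second) != "CCV":
        return first + ("n" if second[0] == "r" else "r") + second
    return first + second


def score(lujvo, rafsi):
    """The scoring formula of Section 12."""
    L = len(lujvo)
    A = lujvo.count("'")
    H = L - sum(len(r) for r in rafsi)
    R = sum(R_VALUES[shape(r)] for r in rafsi)
    V = sum(lujvo.count(v) for v in "aeiou")
    return 1000 * L - 500 * A + 100 * H - 10 * R - V


def rank(first_choices, second_choices):
    """Return (score, lujvo) pairs for every combination, best first."""
    forms = []
    for pair in product(first_choices, second_choices):
        lujvo = join_two(*pair)
        forms.append((score(lujvo, pair), lujvo))
    return sorted(forms)


for points, lujvo in rank(["lot", "blo", "lo'i"], ["kle", "lei"]):
    print(lujvo, points)
```

    <p>Sorting the (score, lujvo) pairs puts the preferred form
    first, so the first line printed is the candidate for the
    dictionary form.</p>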

    <p>Our third example will result in forming both a lujvo and a
    name from the tanru ``logji bangu girzu'', or
    ``logical-language group'' in English. (``The Logical Language
    Group'' is the name of the publisher of this book and the
    organization for the promotion of Lojban.) The available rafsi
    are ``-loj-'' and ``-logj-''; ``-ban-'', ``-bau-'', and
    ``-bang-''; and ``-gri-'' and ``-girzu'', and (for name
    purposes only) ``-gir-'' and ``-girz-''. The resulting 12 lujvo
    possibilities are:</p>
<pre>
    loj-ban-gri     loj-bau-gri     loj-bang-gri
    logj-ban-gri    logj-bau-gri    logj-bang-gri
    loj-ban-girzu   loj-bau-girzu   loj-bang-girzu
    logj-ban-girzu  logj-bau-girzu  logj-bang-girzu
</pre>
    <p>and the 12 name possibilities are:</p>
<pre>
    loj-ban-gir.    loj-bau-gir.    loj-bang-gir.
    logj-ban-gir.   logj-bau-gir.   logj-bang-gir.
    loj-ban-girz.   loj-bau-girz.   loj-bang-girz.
    logj-ban-girz.  logj-bau-girz.  logj-bang-girz.
</pre>

    <p>After hyphenation, we have:</p>
<pre>
    lojbangri       lojbaugri       lojbangygri
    logjybangri     logjybaugri     logjybangygri
    lojbangirzu     lojbaugirzu     lojbangygirzu
    logjybangirzu   logjybaugirzu   logjybangygirzu

    lojbangir.      lojbaugir.      lojbangygir.
    logjybangir.    logjybaugir.    logjybangygir.
    lojbangirz.     lojbaugirz.     lojbangygirz.
    logjybangirz.   logjybaugirz.   logjybangygirz.
</pre>

    <p>The only fully reduced lujvo forms are ``lojbangri'' and
    ``lojbaugri'', of which the latter has a slightly lower score:
    8827 versus 8796, respectively. However, for the name of the
    organization, we chose to make sure the name of the language
    was embedded in it, and to use the clearer long-form rafsi for
    ``girzu'', producing ``lojbangirz.''</p>

    <p>Finally, here is a four-part lujvo with a cmavo in it, based
    on the tanru ``nakni ke cinse ctuca'' or ``male (sexual
    teacher)''. The ``ke'' cmavo ensures the interpretation
    ``teacher of sexuality who is male'', rather than ``teacher of
    male sexuality''. Here are the possible forms of the lujvo,
    both before and after hyphenation:</p>
<pre>
    nak-kem-cin-ctu         nakykemcinctu
    nak-kem-cin-ctuca       nakykemcinctuca
    nak-kem-cins-ctu        nakykemcinsyctu
    nak-kem-cins-ctuca      nakykemcinsyctuca
    nakn-kem-cin-ctu        naknykemcinctu
    nakn-kem-cin-ctuca      naknykemcinctuca
    nakn-kem-cins-ctu       naknykemcinsyctu
    nakn-kem-cins-ctuca     naknykemcinsyctuca
</pre>

    <p>Of these forms, ``nakykemcinctu'' is the shortest and is
    preferred by the scoring algorithm. On the whole, however, it
    might be better to just make a lujvo for ``cinse ctuca'' (which
    would be ``cinctu'') since the sex of the teacher is rarely
    important. If there was a reason to specify ``male'', then the
    simpler tanru ``nakni cinctu'' (``male sexual-teacher'') would
    be appropriate. This tanru is actually shorter than the
    four-part lujvo, since the ``ke'' required for grouping need
    not be expressed.</p>

    <h3><a id="s14" name="s14">14. The gismu creation algorithm</a></h3>

    <p>The gismu were created through the following process:</p>


    <dl>
      <dt>1)</dt>

      <dd>At least one word was found in each of the six source
      languages (Chinese, English, Hindi, Spanish, Russian, Arabic)
      corresponding to the proposed gismu. This word was rendered
      into Lojban phonetics rather liberally: consonant clusters
      consisting of a stop and the corresponding fricative were
      simplified to just the fricative (``tc'' became ``c'', ``dj''
      became ``j'') and non-Lojban vowels were mapped onto Lojban
      ones. Furthermore, morphological endings were dropped. The
      same mapping rules were applied to all six languages for the
      sake of consistency.</dd>

      <dt>2)</dt>

      <dd>All possible gismu forms were matched against the six
      source-language forms. The matches were scored as
      follows:</dd>
    </dl>

    <dl>
      <dt>2a)</dt>

      <dd>If three or more letters were the same in the proposed
      gismu and the source-language word, and appeared in the same
      order, the score was equal to the number of letters that were
      the same. Intervening letters, if any, did not matter.</dd>

      <dt>2b)</dt>

      <dd>If exactly two letters were the same in the proposed
      gismu and the source-language word, and either the two
      letters were consecutive in both words, or were separated by
      a single letter in both words, the score was 2. Letters in
      reversed order got no score.</dd>

      <dt>2c)</dt>

      <dd>Otherwise, the score was 0.</dd>

      <dt>3)</dt>

      <dd>The scores were divided by the length of the
      source-language word in its Lojbanized form, and then
      multiplied by a weighting value specific to each language,
      reflecting the proportional number of first-language and
      second-language speakers of the language. (Second-language
      speakers were reckoned at half their actual numbers.) The
      weights were chosen to sum to 1.00. The sum of the weighted
      scores was the total score for the proposed gismu form.</dd>
    </dl>

    <dl>
      <dt>4)</dt>

      <dd>Any gismu forms that conflicted with existing gismu were
      removed. Obviously, being identical with an existing gismu
      constitutes a conflict. In addition, a proposed gismu that
      was identical to an existing gismu except for the final vowel
      was considered a conflict, since two such gismu would have
      identical 4-letter rafsi.</dd>
    </dl>

    <dl>
      <dt></dt>

      <dd>More subtly: If the proposed gismu was identical to an
      existing gismu except for a single consonant, and the
      consonant was ``too similar'' based on the following table,
      then the proposed gismu was rejected.</dd>

      <dt></dt>

      <dd>
<pre>
    proposed gismu    existing gismu
    b                 p, v
    c                 j, s
    d                 t
    f                 p, v
    g                 k, x
    j                 c, z
    k                 g, x
    l                 r
    m                 n
    n                 m
    p                 b, f
    r                 l
    s                 c, z
    t                 d
    v                 b, f
    x                 g, k
    z                 j, s
</pre>
      </dd>
    </dl>

    <p>See <a href="#s4">Section 4</a> for an example.</p>


    <dl>
      <dt>5)</dt>

      <dd>The gismu form with the highest score usually became the
      actual gismu. Sometimes a lower-scoring form was used to
      provide a better rafsi. A few gismu were changed in error as
      a result of transcription blunders (for example, the gismu
      ``gismu'' should have been ``gicmu'', but it's too late to
      fix it now).</dd>
    </dl>
    <p>The language weights used to make most of the gismu,
    reflecting 1985 number-of-speakers data, were as follows:</p>
<pre>
    Chinese     0.36
    English     0.21
    Hindi       0.16
    Spanish     0.11
    Russian     0.09
    Arabic      0.07
</pre>

    <p>A few gismu were made much later using updated weights:</p>
<pre>
    Chinese     0.347
    Hindi       0.196
    English     0.160
    Spanish     0.123
    Russian     0.089
    Arabic      0.085
</pre>

    <p>(English and Hindi switched places due to demographic
    changes.)</p>
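
    <p>Steps 2 and 3 of the process can be sketched in code. This is
    one possible reading, not the original program: we interpret
    ``letters that were the same and appeared in the same order'' as
    the longest common subsequence of the two words, which is an
    assumption, and every function name is hypothetical. The source
    strings in the usage lines are toy inputs chosen for
    illustration, not actual gismu-making data.</p>

```python
WEIGHTS_1985 = {
    "Chinese": 0.36, "English": 0.21, "Hindi": 0.16,
    "Spanish": 0.11, "Russian": 0.09, "Arabic": 0.07,
}


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b):
            cur.append(prev[j] + 1 if ca == cb else max(prev[j + 1], cur[j]))
        prev = cur
    return prev[-1]


def pair_match(gismu, source):
    """Step 2b: a shared letter pair, adjacent in both words or
    separated by exactly one letter in both words."""
    for gap in (1, 2):
        g_pairs = {(gismu[i], gismu[i + gap]) for i in range(len(gismu) - gap)}
        s_pairs = {(source[i], source[i + gap]) for i in range(len(source) - gap)}
        if g_pairs.intersection(s_pairs):
            return True
    return False


def match_score(gismu, source):
    """Steps 2a-2c: raw score of one gismu form against one source word."""
    common = lcs_length(gismu, source)
    if common >= 3:
        return common          # step 2a
    if pair_match(gismu, source):
        return 2               # step 2b
    return 0                   # step 2c


def total_score(gismu, sources, weights=WEIGHTS_1985):
    """Step 3: divide by source-word length, weight, and sum."""
    return sum(
        weights[lang] * match_score(gismu, word) / len(word)
        for lang, word in sources.items()
    )


# Toy inputs for illustration only.
print(match_score("klama", "kam"))
print(total_score("klama", {"Chinese": "lai", "English": "kam"}))
```

    <p>Passing a different dictionary as <code>weights</code> would
    apply the updated weights instead of the 1985 ones.</p>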
    

    <p>Note that the stressed vowel of the gismu was considered
    sufficiently distinctive that two or more gismu may differ only
    in this vowel; as an extreme example, ``bradi'', ``bredi'',
    ``bridi'', and ``brodi'' (but fortunately not ``brudi'') are
    all existing gismu.</p>
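
    <p>The conflict test of step 4 can likewise be sketched. The
    table below is the ``too similar'' table given above; the
    function name <code>conflicts</code> is our own, and the sketch
    assumes that both arguments are five-letter gismu forms.</p>

```python
# "Too similar" consonants: proposed letter -> letters it conflicts with.
SIMILAR = {
    "b": "pv", "c": "js", "d": "t", "f": "pv", "g": "kx", "j": "cz",
    "k": "gx", "l": "r", "m": "n", "n": "m", "p": "bf", "r": "l",
    "s": "cz", "t": "d", "v": "bf", "x": "gk", "z": "js",
}


def conflicts(proposed, existing):
    """Return True if a proposed gismu form conflicts with an existing gismu."""
    # Identical, or identical except for the final vowel
    # (the two would share a 4-letter rafsi).
    if proposed[:4] == existing[:4]:
        return True
    differing = [i for i in range(5) if proposed[i] != existing[i]]
    if len(differing) != 1:
        return False
    i = differing[0]
    # A single differing letter conflicts only if it is a consonant
    # listed as too similar; a differing non-final vowel is allowed.
    return existing[i] in SIMILAR.get(proposed[i], "")


print(conflicts("bradi", "bredi"))   # differ only in the stressed vowel
print(conflicts("pridi", "bridi"))   # p and b are too similar
```

    <p>The first call illustrates the point made above: forms that
    differ only in the stressed vowel are not treated as
    conflicting.</p>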

    <h3><a id="s15" name="s15">15. Cultural and other non-algorithmic gismu</a></h3>

    <p>The following gismu were not made by the gismu creation
    algorithm. They are, in effect, coined words similar to
    fu'ivla. They are exceptions to the otherwise mandatory gismu
    creation algorithm where there was sufficient justification for
    such exceptions. Except for the small metric prefixes and the
    assignable predicates beginning with ``brod-'', they all end in
    the letter ``o'', which is otherwise a rare letter in Lojban
    gismu.</p>

    <p>The following gismu represent concepts that are sufficiently
    unique to Lojban that they were either coined from combining
    forms of other gismu, or else made up out of whole cloth. These
    gismu are thus conceptually similar to lujvo even though they
    are only five letters long; however, unlike lujvo, they have
    rafsi assigned to them for use in building more complex lujvo.
    Assigning gismu to these concepts helps to keep the resulting
    lujvo reasonably short.</p>
<pre>
    broda       1st assignable predicate
    brode       2nd assignable predicate
    brodi       3rd assignable predicate
    brodo       4th assignable predicate
    brodu       5th assignable predicate
    cmavo       structure word (from ``cmalu valsi'')
    lojbo       Lojbanic (from ``logji bangu'')
    lujvo       compound word (from ``pluja valsi'')
    mekso       Mathematical EXpression
</pre>

    <p>It is important to understand that even though ``cmavo'',
    ``lojbo'', and ``lujvo'' were made up from parts of other
    gismu, they are now full-fledged gismu used in exactly the same
    way as all other gismu, both in grammar and in word
    formation.</p>

    <p>The following three groups of gismu represent concepts drawn
    from the international language of science and mathematics.
    They are used for concepts that are represented in most
    languages by a root which is recognized internationally.</p>

    <p>Small metric prefixes (less than 1):</p>
<pre>
    decti       .1/deci
    centi       .01/centi   
    milti       .001/milli
    mikri       1E-6/micro
    nanvi       1E-9/nano
    picti       1E-12/pico
    femti       1E-15/femto 
    xatsi       1E-18/atto
    zepti       1E-21/zepto
    gocti       1E-24/yocto
</pre>

    <p>Large metric prefixes (greater than 1):</p>
<pre>
    dekto       10/deka
    xecto       100/hecto
    kilto       1000/kilo   
    megdo       1E6/mega    
    gigdo       1E9/giga
    terto       1E12/tera
    petso       1E15/peta
    xexso       1E18/exa
    zetro       1E21/zetta
    gotro       1E24/yotta
</pre>

    <p>Other scientific or mathematical terms:</p>
<pre>
    delno       candela
    kelvo       kelvin
    molro       mole
    radno       radian
    sinso       sine
    stero       steradian
    tanjo       tangent
    xampo       ampere
</pre>

    <p>The gismu ``sinso'' and ``tanjo'' were only made
    non-algorithmically because they were identical (having been
    borrowed from a common source) in all the dictionaries that had
    translations. The other terms in this group are units in the
    international metric system; some metric units, however, were
    made by the ordinary process (usually because they are
    different in Chinese).</p>

    <p>Finally, there are the cultural gismu, which are also
    borrowed, but by modifying a word from one particular language,
    instead of using the multi-lingual gismu creation algorithm.
    Cultural gismu are used for words that have local importance to
    a particular culture; other cultures or languages may have no
    word for the concept at all, or may borrow the word from its
    home culture, just as Lojban does. In such a case, the gismu
    algorithm, which uses weighted averages, doesn't accurately
    represent the frequency of usage of the individual concept.
    Cultural gismu are not even required to be based on the six
    major languages.</p>

    <p>The six Lojban source languages:</p>
<pre>
    jungo       Chinese (from ``Zhong<sup>1</sup> guo<sup>2</sup>'')
    glico       English
    xindo       Hindi
    spano       Spanish
    rusko       Russian
    xrabo       Arabic
</pre>

    <p>Seven other widely spoken languages that were on the list of
    candidates for gismu-making, but weren't used:</p>
<pre>
    bengo       Bengali
    porto       Portuguese
    baxso       Bahasa Melayu/Bahasa Indonesia
    ponjo       Japanese (from ``Nippon'')
    dotco       German (from ``Deutsch'')
    fraso       French (from ``Fran&ccedil;ais'')
    xurdo       Urdu
</pre>
    <p>(Urdu and Hindi began as the same language with different
    writing systems, but have now become somewhat different
    principally in borrowed vocabulary. Urdu-speakers were counted
    along with Hindi-speakers when weights were assigned for
    gismu-making purposes.)</p>

    <p>Countries with a large number of speakers of any of the
    above languages (where the meaning of ``large'' is dependent on
    the specific language):</p>

    <dl>
      <dt>English:</dt>

      <dd>
<pre>
    merko       American
    brito       British
    skoto       Scottish
    sralo       Australian
    kadno       Canadian
</pre>
      </dd>

      <dt>Spanish:</dt>

      <dd>
<pre>
    gento       Argentinian
    mexno       Mexican
</pre>
      </dd>

      <dt>Russian:</dt>

      <dd>
<pre>
    softo       Soviet/USSR
    vukro       Ukrainian
</pre>
      </dd>

      <dt>Arabic:</dt>

      <dd>
<pre>
    filso       Palestinian
    jerxo       Algerian
    jordo       Jordanian
    libjo       Libyan
    lubno       Lebanese
    misro       Egyptian (from ``Mizraim'')
    morko       Moroccan
    rakso       Iraqi
    sadjo       Saudi
    sirxo       Syrian
</pre>
      </dd>

      <dt>Bahasa Melayu/Bahasa Indonesia:</dt>

      <dd>
<pre>
    bindo       Indonesian
    meljo       Malaysian
</pre>
      </dd>

      <dt>Portuguese:</dt>

      <dd>
<pre>
    brazo       Brazilian
</pre>
      </dd>

      <dt>Urdu:</dt>

      <dd>
<pre>
    kisto       Pakistani
</pre>
      </dd>
    </dl>
    <p>The continents (and oceanic regions) of the Earth:</p>
<pre>
    bemro       North American (from ``berti merko'')
    dzipo       Antarctican (from ``cadzu cipni'')
    ketco       South American (from ``Quechua'')
    friko       African
    polno       Polynesian/Oceanic
    ropno       European
    xazdo       Asiatic
</pre>
    <p>A few smaller but historically important cultures:</p>
<pre>
    latmo       Latin/Roman
    srito       Sanskrit
    xebro       Hebrew/Israeli
    xelso       Greek (from ``Hellas'')
</pre>
    <p>Major world religions:</p>
<pre>
    budjo       Buddhist
    dadjo       Taoist
    muslo       Islamic/Moslem
    xriso       Christian
</pre>

    <p>A few terms that cover multiple groups of the above:</p>
<pre>
    jegvo       Jehovist (Judeo-Christian-Moslem)
    semto       Semitic
    slovo       Slavic
    xispo       Hispanic (New World Spanish)
</pre>

    <h3><a id="s16" name="s16">16. rafsi fu'ivla: a proposal</a></h3>

    <p>The list of cultures represented by gismu, given in <a
    href="#s15">Section 15</a>, is unavoidably controversial. Much
    time has been spent debating whether this or that culture
    ``deserves a gismu'' or ``must languish in fu'ivla space''. To
    help defuse this argument, a last-minute proposal was made when
    this book was already substantially complete. I have added it
    here with experimental status: it is not yet a standard part of
    Lojban, since not all of its implications have been tested in
    open debate, and it affects a part of the language
    (lujvo-making) that has long been stable but is known to be
    fragile in the face of small changes. (Many attempts were made to add general
    mechanisms for making lujvo that contained fu'ivla, but all
    failed on obvious or obscure counterexamples; finally the
    general ``zei'' mechanism was devised instead.)</p>

    <p>The first part of the proposal is uncontroversial and
    involves no change to the language mechanisms. All valid Stage 4
    fu'ivla of the form CCVVCV would be reserved for cultural
    brivla analogous to those described in <a href="#s15">Section
    15</a>. For example,</p>
<pre>
<a id="e16d1" name="e16d1">16.1)</a>    tci'ile
    Chilean
</pre>
    is of the appropriate form, and passes all tests required of a
    Stage 4 fu'ivla. No two fu'ivla of this form would be allowed
    to coexist if they differed only in the final vowel; this rule
    was applied to gismu, but does not apply to other fu'ivla or to
    lujvo. 
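
    <p>The shape requirement and the final-vowel rule above can be
    sketched in Python. This is only an illustration under stated
    assumptions: the function names are invented here, the check
    covers the letter shape alone (with the VV part optionally split
    by an apostrophe, as in ``tci'ile''), and it does not perform
    the full set of tests required of a Stage 4 fu'ivla. The string
    ``tci'ila'' is a made-up form used only to show a conflict.</p>

```python
import re

VOWELS = "aeiou"
CONSONANTS = "bcdfgjklmnprstvxz"

# Letter shape only: CC, then VV (optionally split by an apostrophe), then CV.
# Assumption: this does NOT run the other tests a valid fu'ivla must pass.
SHAPE = re.compile(
    "^[{c}]{{2}}[{v}]'?[{v}][{c}][{v}]$".format(c=CONSONANTS, v=VOWELS)
)


def has_ccvvcv_shape(word):
    """Return True if word has the CCVVCV letter shape (hypothetical helper)."""
    return SHAPE.match(word) is not None


def differs_only_in_final_vowel(a, b):
    """Return True if a and b are distinct but identical up to the final vowel."""
    return (
        a != b
        and a[:-1] == b[:-1]
        and a[-1] in VOWELS
        and b[-1] in VOWELS
    )


print(has_ccvvcv_shape("tci'ile"))
print(has_ccvvcv_shape("canre"))
print(differs_only_in_final_vowel("tci'ile", "tci'ila"))
```

    <p>Under the proposal, a candidate word would be rejected if
    the second check succeeded against any cultural fu'ivla already
    reserved.</p>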

    <p>The second, and fully experimental, part of the proposal is
    to allow rafsi to be formed from these cultural fu'ivla by
    removing the final vowel and treating the result as a 4-letter
    rafsi (although it would contain five letters, not four). These
    rafsi could then be used on a par with all other rafsi in
    forming lujvo. The tanru</p>

<pre>
<a id="e16d2" name="e16d2">16.2)</a>    tci'ile ke canre tutra
    Chilean type-of (sand territory)
    Chilean desert
</pre>
    could be represented by the lujvo 
<pre>
<a id="e16d3" name="e16d3">16.3)</a>    tci'ilykemcantutra
</pre>
    which is an illegal word in standard Lojban, but a valid lujvo
    under this proposal. There would be no short rafsi or 5-letter
    rafsi assigned to any fu'ivla, so no fu'ivla could appear as
    the last element of a lujvo. 
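
    <p>The derivation of Example 16.3 can be traced with a short
    Python sketch. It is not a general lujvo-making algorithm: the
    function names are invented for illustration, the pieces
    ``kem'', ``can'' and ``tutra'' are simply read off Example 16.3,
    and the only hyphenation handled is the ``y'' that follows the
    proposed rafsi, as in that example. Any hyphens needed between
    the remaining pieces are not considered.</p>

```python
def proposed_rafsi(fuhivla):
    """Drop the final vowel of a cultural fu'ivla (hypothetical helper)."""
    return fuhivla[:-1]


def join_example(head_fuhivla, tail_rafsi):
    """Join the proposed rafsi, a 'y' hyphen, and the remaining pieces.

    Assumption: the remaining pieces are already valid rafsi that need
    no further hyphens between them, as in Example 16.3.
    """
    return proposed_rafsi(head_fuhivla) + "y" + "".join(tail_rafsi)


print(proposed_rafsi("tci'ile"))
print(join_example("tci'ile", ["kem", "can", "tutra"]))
```

    <p>The second line printed matches the lujvo of Example 16.3.
    Because only this one form of rafsi is derived, a fu'ivla can
    never be passed as the final piece.</p>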

    <p>The cultural fu'ivla introduced under this proposal are
    called ``rafsi fu'ivla'', since they are distinguished from
    other Stage 4 fu'ivla by the property of having rafsi. If this
    proposal is workable and introduces no problems into Lojban
    morphology, it might become standard for all Stage 4 fu'ivla,
    including those made for plants, animals, foodstuffs, and other
    things.</p>

    <hr />
    
    <p>Last modified: Mon Jun 27 23:11:01 PDT 2005</p>


</body>
</html>
