Grammar

Lojban's grammar is defined by a set of rules that have been tested to be unambiguous using computers. Grammatical unambiguity means that in a grammatical expression, each word has exactly one grammatical interpretation, and that within the expression the words relate grammatically to each other in exactly one way. (By comparison, in the English Time flies like an arrow, each of the first three words has at least two grammatical meanings, and each possible combination results in a different grammatical structure for the sentence.)

The machine grammar is the set of computer-tested rules that describes, and is the standard for, 'correct Lojban'. If a Lojban speaker follows those rules exactly, the expression will be grammatically unambiguous. If the rules are not followed, ambiguity may exist. Ambiguity does not make communication impossible, of course. Every speaker on Earth speaks an ambiguous language. But Lojbanists strive for accuracy in Lojban grammatical usage, and thereby for grammatically unambiguous communication. (Semantic ambiguity, as we have seen, is another matter.)

It is important to note that new Lojbanists will not be able to speak 'perfectly' when first learning Lojban. In fact, you may never speak perfectly in 'natural' Lojban conversation, even though you achieve fluency in the language. No English speaker always speaks textbook English in natural conversation; Lojban speakers will also make grammatical errors when talking quickly. Lojbanists will, however, be able to speak or write unambiguously if they are careful, which is difficult if not impossible with a natural language.

In Lojban grammar rules, words are assembled into short phrases representing a possible piece of a Lojban expression. These phrases are then assembled into longer phrases, and so forth, until all possible pieces have been incorporated through rules that describe all possible expressions in the language. Lojban's rules include grammar for 'incomplete' sentences, for multiple sentences flowing together in a narrative, for quotation, and for mathematical expressions.

The grammar is very simple, but infinitely powerful; often, a more complex phrase can be placed inside a simple structure, which in turn can be used in another instance of the complex phrase structure.

cmavo

The machine grammar includes rules which describe how each word is interpreted. A classification scheme categorizes each word based on what rules it is used in and how it interacts with other words in the grammar. All cmene are treated identically by the grammar, as are all brivla. The classification divides the cmavo of Lojban into about a hundred of these categories of grammar units, called selma'o. Whereas the three word types, namely cmene, brivla, and cmavo, are generally considered to correspond to the 'parts of speech' of English, these hundred-odd selma'o correspond to the more subtle variations in English grammar, such as the different kinds of pronouns, or the different ways of expressing the past tense of a verb. In this sense, English has hundreds of 'parts of speech'.

Lojban selma'o are named after one word within the category, often the one most frequently used. CU, KOhA, PU, and UI are examples of selma'o.

Note: The selma'o names are capitalized in English discussion of Lojban. The apostrophe is converted to h in such usage; this is for compatibility with computer grammar parsers.
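The naming convention in this note is mechanical enough to sketch in code. The following Python fragment is only an illustration; the function name selmaho_name is hypothetical and does not belong to any Lojban parser.

```python
def selmaho_name(cmavo: str) -> str:
    """Return the conventional label for a selma'o, given its representative cmavo.

    Letters are capitalized and each apostrophe becomes a lowercase h,
    following the convention described in the note above.
    """
    return cmavo.upper().replace("'", "h")


for word in ["cu", "ko'a", "pu", "ui"]:
    print(word, "->", selmaho_name(word))  # e.g. ko'a -> KOhA
```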

bridi

The bridi is the basic building block of a Lojban sentence. bridi are not words, but concepts. A bridi expresses a relationship between several 'arguments', called sumti. Those with a background in algebra may recognize the word 'argument' in connection with 'functions', and a bridi can be considered a logical 'function' (called a predicate) with several 'arguments'. A brivla (bridi valsi = bridi-word) is a single word which expresses the relationship of a bridi.

The definition of a brivla includes a specific set of 'places' for sumti to be inserted, expressed in a certain order (called a place structure) to allow a speaker to clearly indicate which place is which. By convention, we number these places as: x1, x2, x3, x4, x5, etc., numbering from the left. Other letters may be used when referring to two or more place structures together.

The unique definition of a brivla is thus an enumeration of the component places in order, joined with a description of the relationship between them. The definition of the gismu klama 'come, go' can be expressed compactly as:

x1 comes/goes to x2 from x3 via x4 using x5

or in full detail:

  • x1 describes a party that acts with result of being in motion;

  • x2 describes a destination where x1 is located after the action;

  • x3 describes an origin where x1 is located before the action;

  • x4 describes a route, or points along a route travelled by x1 between x2 and x3;

  • x5 describes the means of transport by which the result is obtained.

Note: The difference between the English verbs come and go depends on the relationship between x2 (the destination), x3 (the origin), and the speaker. The position of the speaker is not part of the Lojban meaning.

When actually using a brivla within a bridi, it is possible to fill the places (five in the case of klama) with five specific sumti. Consider the following example:

Example 1.

le prenu cu klama le zdani le briju le zarci le karce
The person comes/goes to the house from the office via the market using the car.

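The definition of the brivla used above, klama, shows this relationship. There are five places labelled x1 through x5. The brivla itself describes how the five places are related, but does not include values for those places. In this example, those places are filled in with five specific sumti values:

  • x1 contains le prenu (the person)

  • x2 contains le zdani (the house)

  • x3 contains le briju (the office)

  • x4 contains le zarci (the market)

  • x5 contains le karce (the car)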

The brivla and its associated sumti, used in a sentence, have become a bridi. For logicians, the comparable English concept is called a predication. In each bridi, a brivla or tanru specifies the relationship between the sumti. Such a specification of the relationship, without the sumti expressed, is called a selbri (predicate in English). Whether or not any sumti are attached, a selbri is found within every bridi.

We express a bridi relationship in Lojban by filling in the sumti places, so that the position of the sumti in the place structure is clear, and by expressing the selbri that ties the sumti together.

It is not necessary to fill in all of the sumti to make the sentence meaningful. In English we can say I go, without saying where we are going. To say mi klama ("I go...") specifies only one sumti; the other four are left unspecified.

In Lojban, we know those four places exist; they are part of the definition of klama. In English, there is no implication that anything is missing, and the sentence I go is considered complete. As a bridi, mi klama is inherently an incomplete sentence. The omission of defined places in a bridi is called ellipsis; corresponding ellipsis in the natural languages is a major source of semantic ambiguity. Most Lojban expressions involve some amount of ellipsis. The listener, however, knowing that the omissions have occurred, has a means of asking directly about any specific one of them (or all of them), and resolving the ambiguity. So this kind of semantic ambiguity is not eliminated in Lojban, but it is made more recognizable and more amenable to resolution.
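One way to picture ellipsis is as a fixed list of slots, some of which are left empty. The Python sketch below is an informal model, not a Lojban parser: the names KLAMA_PLACES, fill_places, and elided are hypothetical, and the English place labels are shorthand for the definition of klama given earlier.

```python
from typing import Dict, List, Optional

# Informal labels for the five places of klama, in definition order.
KLAMA_PLACES = ["goer", "destination", "origin", "route", "means"]


def fill_places(place_names: List[str], sumti: List[str]) -> Dict[str, Optional[str]]:
    """Pair each defined place (x1, x2, ...) with a sumti, or None if elided."""
    if len(sumti) > len(place_names):
        raise ValueError("more sumti than defined places")
    padded = list(sumti) + [None] * (len(place_names) - len(sumti))
    return {f"x{i}": value for i, value in enumerate(padded, start=1)}


def elided(bridi: Dict[str, Optional[str]]) -> List[str]:
    """List the places that exist in the definition but were left unfilled."""
    return [place for place, value in bridi.items() if value is None]


full = fill_places(KLAMA_PLACES, ["le prenu", "le zdani", "le briju", "le zarci", "le karce"])
short = fill_places(KLAMA_PLACES, ["mi"])

print(elided(full))   # nothing is left open in example (1)
print(elided(short))  # the four places a listener could ask about in mi klama
```

The point of the model is that the empty slots are still present in the structure, so a listener knows exactly which omissions can be queried.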

It is permissible to use a selbri alone, with no sumti filled in, as a very elliptical sentence called an observative. The sentence fagri is very similar to the English exclamation Fire!, but without the emotional content: it merely states that "something is a fire using some fuel", without explicitly specifying the identity of either.

bridi within bridi

You may have noticed that in example (1), each of the sumti filling the five places of klama contains a brivla. Each of these brivla is a selbri as well; i.e. it implies a relationship between certain (usually unspecified) sumti places. A selbri may be labelled with le (among other things) and placed in a sumti. When le is used, the concept which the speaker has in mind for the x1 place of the selbri within the sumti is understood to fill that sumti place. For example, the sumti place for le prenu is filled with what the speaker has in mind as being the x1 place of prenu. Since prenu has the place structure "x1 is a person", le prenu thus corresponds to 'the person'.

In example (1), there are no places specified for any of the selbri embedded in the sumti; they are all elliptically omitted, except for the x1 place, which describes the sumti itself. Here is a more complex example:

Example 2.

mi sutra klama le blanu zdani be la djan. le briju
I quickly come to the blue house of John from the office.

More completely, this translates as:

I quickly (at doing something) come to the blue house of John from the office (of someone, at some location), via some route, using some means of travel.

In this example, one of the nested sumti selbri has had its places specified, while two places of klama have been elliptically omitted:

  • x1 of sutra klama contains mi (I)

  • x2 of sutra klama contains le blanu zdani be la djan. (the blue-house of the one named John)

    • x1 of blanu zdani contains the value which fills x2 of sutra klama; the thing which is a blue house

    • x2 of blanu zdani contains la djan. (the one named John)

  • x3 of sutra klama contains le briju (the office of someone, at some location)

  • The sumti for x4 and x5 of sutra klama are elliptically omitted.

Two of the places of the selbri in x3, briju, have also been elliptically omitted, and this is expressed in the more exact translation of the example.

Note that in the two tanru in example (2), sutra klama and blanu zdani, each of the four brivla may be a self-contained selbri unit as well, having its own sumti attached to it (using the cmavo be). The place structure of the final component of a tanru (klama and zdani, respectively) becomes the place structure of the tanru as a whole, and hence the place structure of the higher level bridi structure. (The place structure of klama thus becomes the place structure of the sentence, while the place structure of zdani becomes the place structure of the x2 sumti.)
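The rule that a tanru takes the place structure of its final component can be sketched as a simple lookup. In the Python fragment below, the dictionary PLACE_STRUCTURES and the function tanru_place_structure are hypothetical names, and the English glosses are informal paraphrases assumed for illustration rather than official definitions.

```python
from typing import Dict, List

# Informal glosses, assumed for illustration only.
PLACE_STRUCTURES: Dict[str, List[str]] = {
    "klama": ["x1 comes/goes", "to x2", "from x3", "via x4", "using x5"],
    "sutra": ["x1 is quick", "at doing x2"],
    "zdani": ["x1 is a house", "of x2"],
    "blanu": ["x1 is blue"],
}


def tanru_place_structure(components: List[str]) -> List[str]:
    """Return the place structure of a tanru: that of its final component."""
    if not components:
        raise ValueError("a tanru needs at least one component")
    return PLACE_STRUCTURES[components[-1]]


print(tanru_place_structure(["sutra", "klama"]))  # same places as klama
print(tanru_place_structure(["blanu", "zdani"]))  # same places as zdani
```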

Place structures

A brivla must have a single defined place structure, describing the specific sumti places to be related. If this were not so, example (1) might be interpreted arbitrarily; for example, as "The person is the means, the office the route, the market is the time of day, the house is the cause, by which someone elliptically unspecified comes to somewhere (also elliptically unspecified)." Not only is this nonsense, but it is confusing nonsense. With fixed place structures, a Lojbanist will interpret example (1) correctly. A Lojbanist can also, incidentally, express the nonsense just quoted. It will still be nonsense, but it won't be the syntax that confuses the listener; each place will be clearly labelled, and the nonsense can be discussed until resolved.

Thus, for a given brivla, or indeed for any selbri, we have a specific place structure defined as part of the meaning. Complex selbri, described below, simply have more elaborate place structures determined by simple rules from their components.

The place structure of a bridi is defined with ordered (and implicitly numbered) places. The sumti are typically expressed in this order. When one is skipped, or the sumti are presented in a non-standard order, there are various cmavo to indicate which sumti is which.

Lojban bridi are most often given in a sentence as the value of the 1st (x1) sumti place, followed by the selbri, followed by the rest of the sumti values in order. This resembles the English Subject-Verb-Object (SVO) sentence form. It is shown schematically as:

[sumti]x1 [selbri] [sumti]x2 [sumti]x3 ... [sumti]xn

or abbreviated as:

x1 selbri x2 x3 x4 x5

This is the order used for the bridi sentences in examples (1) and (2). However, it is equally correct and straightforward to place the selbri at the end of the bridi:

x1 x2 x3 x4 x5 selbri
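Both orders lead to the same assignment of sumti to places, which a small sketch can make concrete. The Python function below is a simplified model with a hypothetical name, assign_places; it assumes at least one sumti precedes the selbri and ignores the cmavo that skip or reorder places.

```python
from typing import Dict, List


def assign_places(before: List[str], after: List[str]) -> Dict[str, str]:
    """Number the sumti in order: those before the selbri, then those after it.

    Simplifying assumption: at least one sumti comes before the selbri, and
    no cmavo are used to skip or reorder places.
    """
    if not before:
        raise ValueError("this sketch does not model a selbri with no preceding sumti")
    ordered = list(before) + list(after)
    return {f"x{i}": sumti for i, sumti in enumerate(ordered, start=1)}


# x1 selbri x2 x3 x4 x5
selbri_second = assign_places(["le prenu"], ["le zdani", "le briju", "le zarci", "le karce"])

# x1 x2 x3 x4 x5 selbri
selbri_last = assign_places(["le prenu", "le zdani", "le briju", "le zarci", "le karce"], [])

print(selbri_second == selbri_last)  # the two orders describe the same bridi
```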

There are a variety of cmavo operators which modify these orders, or which modify one or more pieces of the bridi. These can make things quite complicated, yet simple rules allow the listener to take the complications apart, piece by piece, to get the complete and unique structure of the bridi. We cannot describe all of these rules here, but a couple of key ones are given.

Of these cmavo, cu is placed between a selbri and its preceding sumti in a sentence-bridi. cu cannot be used if there are no sumti before the selbri; otherwise it is always permitted, though not always required. Example (1) shows a cu that is required; example (2) omits an optional cu. Skill in Lojban includes knowing when cu is required; when it is not required but useful; and when it is permitted, but a distraction.

What happens when the place structure of a given bridi does not exactly match the meaning that the speaker is trying to convey? Lojban provides a way to adapt a place structure by adding places to the basic structure. The phrases that do so look exactly like sumti, except that they have a cmavo marker on the front (called a modal tag, or sumti tcita) which indicates how the added place relates to the others. The resulting phrase resembles an English prepositional phrase or adverbial phrase, both of which modify a simple English sentence in the same way. Thus I can say:

Example 3.

ca le cabdei mi cusku bau la lojban.

  • ca le cabdei = an added sumti; modal operator ca indicates that the added place specifies 'at the time of...', or 'during...'; thus 'during the nowday', or 'today';

  • x1 = mi (I)

  • selbri = cusku (x1 expresses x2 to x3 in form/media x4)

  • x2, x3, and x4 are elliptically omitted;

  • bau la lojban. = an added sumti; modal operator bau indicates that the added place specifies 'in language...'; thus 'in language which is called Lojban'.

The sentence thus roughly translates as "Today, I express [it] in Lojban."

Among additional bridi places that can be specified are comparison, causality, location, time, the identity of the observer, and the conditions under which the bridi is true. In Lojban, semantic components that can apply to any bridi, but are not always needed for communication (for instance, location and time), are left optional.

selbri

As described above, the simplest form of selbri is a brivla. The place structure of the brivla is used as the place structure of the bridi. Various modifications can be made to the brivla and its place structure using cmavo. These include ways to treat a single selbri as a state, an event, an activity, a property, an amount, etc. For example, jetnu, a selbri expressing that x1 is true, becomes the basis for ka jetnu, a selbri expressing the property of truth.

Place structures of a selbri can undergo 'conversion', which is simply a reordering of the sumti places. Since the listener's attention is usually focussed on the first and/or the last sumti expressed in the bridi, this has a significant effect in relative emphasis, somewhat like the 'passive voice' of English (e.g. The man was bitten by the dog. vs. The dog bit the man.)
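Conversion as a reordering of places can be sketched in a few lines. The Python function below is a hypothetical illustration: it models a conversion as exchanging x1 with one other place, which is what the cmavo se does with x1 and x2. The place labels for klama are informal shorthand.

```python
from typing import List


def convert(places: List[str], n: int) -> List[str]:
    """Return a new place list with x1 and xn exchanged (n counts from 1)."""
    if not 2 <= n <= len(places):
        raise ValueError("n must name a place other than x1")
    swapped = list(places)
    swapped[0], swapped[n - 1] = swapped[n - 1], swapped[0]
    return swapped


klama = ["goer", "destination", "origin", "route", "means"]
se_klama = convert(klama, 2)

print(se_klama)  # the destination now occupies x1, the goer x2
print(klama)     # the original place structure is unchanged
```

Applying the same exchange twice restores the original order, so a conversion loses no information; it only changes which place is expressed first.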

As shown in example (2) above, tanru can also be selbri. These tanru can be composed of simple brivla, brivla modified by the techniques referred to above, or simpler tanru. tanru themselves can also be modified by the above techniques.

All of the possible modifications to selbri are optional semantic components, including tense. (Time and location, and combinations of the two, can be incorporated as tenses in the selbri.) With tense unspecified, examples (1) and (2) might be intended as past, present, or future tense; the context determines how the sentence should be interpreted.

sumti

sumti can be compared to the 'subject' and 'object' of English grammar; the value of the first (x1) sumti place resembles the English 'subject'; the other sumti are like direct or indirect 'objects'.

But as the discussion of bridi above will have indicated, this is only an analogy. sumti are not inherently singular or plural: number is one of those semantic components mentioned above that is not always relevant to communication, so number is optional in Lojban. Thus, example (2) could have been translated as We quickly go/come/went/came (etc.) to the blue houses of those called John. If this is plausible given the context, but is not the meaning intended, the speaker must add some of the optional semantic information like tense and number, to ensure that the listener can understand the intended meaning. There are several ways to specify number when this is important to the speaker; the numerically unambiguous equivalent of the English plural people would be: le su'ore prenu ('the at-least-two persons').

There is a large variety of constructs usable as sumti, beyond what we have already seen. Only the most important will be mentioned here. These include:

pro-sumti

cmavo which serve as short representations for longer sumti expressions (e.g. ko'a 'He/She/It1', ti 'this'); imperatives are also marked with a pro-sumti (ko 'You!');

anaphora/cataphora

back references and forward references to other sentences and their components (e.g. ri 'the last complete sumti mentioned', di'u 'the preceding utterance');

quotations

grammatical Lojban text, or text in other languages, suitably marked to separate the quote from the rest of the bridi (e.g. zo djan 'The word John', lu mi klama li'u 'The Lojban text mi klama', zoi by. I go .by. 'The non-Lojban text I go');

indirect reference

reference to something by using its label; among other things, this allows one to talk about another sentence ("That isn't true"), or the state referred to by a sentence ("That didn't happen"), unambiguously in all cases (e.g. la'e di'u na fasnu = "The referent of the last sentence does not occur", or "That didn't happen");

named references

reference to something named by using the name (e.g. la djan 'John', lai ford. '[the mass of things called] Ford');

descriptions

reference to something by describing it (e.g. le prenu 'the person', le pu crino 'the thing that was green in the past', le nu klama 'the event of going').

Pro-sumti, anaphora/cataphora, and indirect references are all equivalent to various uses of pronouns of English, and we won't be going into any further detail here. Quotations and named references are straightforward, and quite similar to their English counterparts. Lojban, however, allows a distinction between Lojban and foreign quotation, and between grammatical and ungrammatical Lojban quotation.

Descriptions appear similar to an English noun phrase (le prenu = 'The person'). For most purposes, this analogy holds. The components of a description are a 'descriptor' or gadri, and a selbri. As we've seen, by default such a sumti refers to what would be put into the x1 place (the 'subject') of its selbri. Thus le klama is 'the go-er (to some place from some place via some place, using some means of travel)', and le blanu is 'the blue thing'. With conversion, as described above, a speaker can access other places in the bridi structure as the new 'subject' or x1 place: le se klama is "the place gone to (by someone from some place via some place, using some means of travel)". Descriptions are not limited to selbri without attached sumti; as in example (2), they can include bridi with places filled in.

Abstract bridi such as events and properties can also be turned into sumti. These are among the more common descriptions, and a common source of error among new Lojbanists. If le klama is 'the go-er/come-r (to some place from some place via some place, using some means of travel)', le nu klama is the 'event of (someone) going/coming (to some place from some place via some place, using some means of travel)'. The abstraction treats the bridi as a whole rather than isolating the x1 place.

Descriptions can also incorporate sentences based on abstracts; this is needed to elaborate sumti like le nu klama. For example, le nu mi klama ti is 'the event that: I come here (from some place via some place, using some means of travel)', or simply 'my coming here'.

In addition to number, Lojban allows for mass concepts to be treated as a unit. This is equivalent to English mass concepts as used in sentences like Water is wet, and People are funny. Mass description also allows a speaker to distinguish, in sentences like Two men carried the log across the field, whether they did it together, or whether they did it separately (as in "One carried it across, and the other carried it back.")

Sets can be described in sumti, as well as logically and non-logically connected lists of sumti. Thus, Lojban provides for: "Choose the coffee, the tea, or the milk", or "Choose exactly one from the set of {coffee, tea, milk}". Note that English connectives are not truly logical. The latter is the common interpretation of "Coffee, tea, or milk?" and is relatively unambiguous. The former, if translated literally into Lojban, would be a different statement, because of the ambiguous meaning of English or.
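The difference between the two readings shows up in a truth table. The Python sketch below contrasts a logical inclusive "or" with the "exactly one" reading over three choices; the function names are hypothetical, and treating the first reading as inclusive is an assumption made for illustration.

```python
from itertools import product


def inclusive_or(*choices: bool) -> bool:
    """True when at least one option is chosen."""
    return any(choices)


def exactly_one(*choices: bool) -> bool:
    """True when one and only one option is chosen."""
    return sum(bool(c) for c in choices) == 1


# Each row is (coffee, tea, milk).
differ = [
    combo
    for combo in product([False, True], repeat=3)
    if inclusive_or(*combo) != exactly_one(*combo)
]

for combo in differ:
    print(combo)  # the cases where two or more drinks are chosen
```

The two readings agree when nothing or exactly one drink is chosen and disagree in every other case, which is why they must be expressed differently in a language that separates them.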

Finally, sumti can be qualified using time, location, modal operators, or various other means of identification. Incidental notes can be thrown in, and pro-sumti can have values assigned to them. Lojban also has constructions that are similar to the English possessive.

Free Modifiers

Free modifiers are grammatical constructs that can be inserted in a bridi, without changing the meaning, or the truth value, of the bridi. Free modifiers include the following types of structures:

parentheses

Parenthetical notes, which can be of any length, as long as they are grammatical.

vocatives

These are used for direct address; they include several expressions used for 'protocol', allowing for smooth, organized communications in disruptive environments (e.g. ta'a 'excuse me', be'e 'are you listening?'), as well as some expressions that are associated with courtesy in most languages.

discursives

These are comments made at a metalinguistic level about the sentence, and about its relationship to other sentences. In English, certain adverbs and conjunctions serve this function (e.g. however, but, in other words).

discursive bridi

These are halfway between discursives and parentheses, and allow the speaker to make metalinguistic statements about a sentence without modifying that sentence. Thus, the discursive bridi equivalent of This sentence is false does not result in a paradox, since it would be expressed as a discursive bridi inside of another sentence, the one actually being described.

attitudinals

These are expressions of emotion and attitude about the sentence, being expressed discursively. They are similar to the English exclamations like Oh! and Ahhhh!, but they include a much broader range of possibilities, covering a range comparable to that expressed by English intonation; they can also serve as indicators of intensity. Also included in this category are indicators of the relationship between the speaker and the expression (evidentials). Found in Native American languages among others, these allow the hearer to judge how seriously to take an assertion, by making explicit the basis for the speaker making the assertion: that the speaker directly observed what is being reported, heard about it from another, deduced it, etc.

Questions

The manner of asking questions in Lojban is quite different from English. In Lojban, most questions are asked by placing a question word in place of the value to be filled in by the person answering. The question word mo can be used in the grammatical place of any bridi, including those within sumti. It asks for a bridi (usually a selbri) to be supplied which correctly fills in the space. It is thus similar to English what? This booklet is titled la lojban. mo, meaning 'The thing called Lojban is what?', or, of course, 'What is Lojban?' The question word ma is used in place of a sumti in the same manner. Thus a listener can ask for ellipsis to be filled in, or can pose new questions that are similar to the classic English questions (who?, when?, where?, how?, and why?).

Yes/no questions can either be asked as a question of emotional attitude — such as belief, certitude, supposition, decision, approval, or intention — or as a question of truth and falsity. In the first case, the answer is an attitudinal. In the second case, the answer is an assertion or denial of the bridi being queried. Lojban also provides question words that can request a value filling many other grammatical functions.

Tenses

The tense system of Lojban expresses not only the time at which something happens, but also the place. It can express very complex combinations of both temporal and spatial distances and directions (the time directions being 'past' and 'future', of course), interval sizes and ranges, and parts of events such as 'beginning', 'middle', and 'end'. Fortunately, this entire system is optional: it is perfectly correct to express bridi with no specific tense at all, in which case the place and time is up to the listener to figure out.

Some examples of tenses in use are:

Logic and Lojban

Lojban supports all of the standard truth-functions of predicate logic. These can be used to connect any of several different levels of construct: sumti, bridi, selbri, sentences, etc.; the methods used indicate unambiguously what is being joined. As an example of English ambiguity in the scope of logical connectives, the incomplete sentence I went to the window and ... can be completed in a variety of different ways (e.g. ...closed it, ...the door, ...Mary went to the desk); in these, the and is joining a variety of different constructs. You must hear and analyze the whole sentence to interpret the and, and you still may not be certain of having a correct understanding. Lojban would make clear the structures being joined from the outset.

Another way Lojban supports logical connectives is by distinguishing them from non-logical connectives. The latter include:

Mathematics

Lojban has incorporated a detailed grammar for mathematical expressions. This grammar parallels the predicate grammar of the non-mathematical language. Numbers may be clearly expressed, including exponential and scientific notation. Digits are provided for decimal and hexadecimal arithmetic, and letters may be used for additional digits if desired. There is a distinction made between mathematical operations and mathematical relations. The set of operations is not limited to 'standard arithmetic', so no fixed hierarchy of operator precedence can be assumed. Operations therefore take a left-grouping precedence, which can be overridden with parentheses or with optional precedence labels that override this grouping when the expression is evaluated.
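Left-grouping evaluation is easy to show with ordinary arithmetic symbols. The Python sketch below is not a parser for Lojban mathematical expressions; the function eval_left_grouped and the flat token list are hypothetical, and parentheses and precedence labels are left out.

```python
import operator
from typing import List, Union

Token = Union[int, str]

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_left_grouped(tokens: List[Token]) -> int:
    """Evaluate 'a op b op c ...' strictly left to right, with no precedence."""
    if len(tokens) % 2 == 0:
        raise ValueError("expected operand (operator operand)*")
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        result = OPS[op](result, operand)
    return result


print(eval_left_grouped([1, "+", 2, "*", 3]))  # 9, i.e. (1 + 2) * 3
print(1 + 2 * 3)                               # 7 under conventional precedence
```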

Included in Lojban are means to express non-mathematical concepts and quantities as numbers, and mathematical relationships as ordinary bridi. In Lojban, it is easy to talk about a 'brace' of oxen or a 'herd' of cattle, as well as to discuss the "5 fingers of your hand", or "∫ (−2x³ + x² − 3x + 5) dx evaluated over the interval of −5 to +5 bottles of water".

Note: In case you're curious: li ri'o ni'u re pi'i xy. bi'e te'a ci su'i xy. bi'e te'a re vu'u ci bi'e pi'i xy. su'i muboi ge'a xy.boi ge'a mo'e vei ni'u mu bi'o ma'u mu ve'o djacu botpi
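For the equally curious, the number of bottles can be worked out exactly. The Python sketch below reads the integrand as −2x³ + x² − 3x + 5, which is what the Lojban rendering spells out term by term (te'a ci is 'to the power 3', te'a re is 'to the power 2'). The names COEFFS and antiderivative are hypothetical.

```python
from fractions import Fraction

# Integrand -2x^3 + x^2 - 3x + 5, stored as {power: coefficient}.
COEFFS = {3: -2, 2: 1, 1: -3, 0: 5}


def antiderivative(x: int) -> Fraction:
    """Evaluate the antiderivative term by term: c*x^k becomes c/(k+1) * x^(k+1)."""
    point = Fraction(x)
    return sum(Fraction(c, k + 1) * point ** (k + 1) for k, c in COEFFS.items())


bottles = antiderivative(5) - antiderivative(-5)
print(bottles)         # 400/3
print(float(bottles))  # roughly 133.3 bottles of water
```

The odd-powered terms cancel over the symmetric interval, so only the x² term and the constant contribute to the total.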

selsku

The set of possible Lojbanic expressions is called selsku. Lojban has a grammar for multiple sentences tied together as narrative text, or as a conversation; the unambiguous Lojban grammar supports an indefinite string of Lojban paragraphs of arbitrary length. Using the rules of this grammar, multiple speakers can use, define, and redefine pro-sumti. Paragraphs, chapters, and even books can be separately distinguished: each can be numbered or titled distinctly. One can express logical and non-logical connectives over multi-sentence scope. (This is the essence of a set of instructions — a sequence of closely-related sentences.) Complex sets of suppositions can be expressed, as well as long chains of reasoning based on logical deduction. In short, the possibilities of Lojban grammatical expression are endless.