The following explains a set of proposed grammar changes to the baseline. Only descriptions of the changes are given here; send a message to cowan@snark.thyrsus.com if you want a diff listing showing the actual changes to the YACC grammar.

All of these changes are extremely minor. They were found in the effort to develop the EBNF description and then to produce a working Lojban parser. This process picked up a few errors and inconsistencies that were missed by less systematic analysis. The last few changes reflect discussions of semantics that picked up "holes in the system". Nearly all of these changes are additions to the language that do not take away from the set of utterances that are legal. A couple will change the grouping of legal utterances in complex situations.

The set of changes constitutes a baseline change proposal, which will probably be decided at or before LogFest '91. The changes are presumed to be non-controversial, and John is building his parser on the assumption that the changes will be approved.

Each change is explained in a three-part format: CURRENT LANGUAGE; PROPOSED CHANGE; RATIONALE.

Executive Summary:
1 JOIK connection between operands
2 Multiple EK_KEs between operands
3 Reorder BIhI GAhO GAhO to GAhO BIhI GAhO
4 Remove GAhOs in parentheses
5 NA SE without NAI in afterthought connectives
6 Negation/conversion of BIhI
7 KI by itself and after BAI
8 *ANNULLED*
9 GIhEK_KE priority change
10 *ANNULLED*
11 Attach freemods to tense_modal, not PU_mod
12 Allow ZI PU and VI FAhA
13 Change utterance ordinals to free modifiers
14 Allow only one NAhE before tense
15 *ANNULLED*
16 *ANNULLED*
17 Allow forethought JOIKs
18 Allow BU to suffix any word to produce a BY
19 Remove MEX relations
20 Allow stand-alone ZAhO in tenses
21 *ANNULLED*
22 Make CU optional rather than elidable
23 Change purpose of FAhO; make it a universal end-of-input
24 *ANNULLED*
25 Make PEhO (Polish notation flag) optional
26 Allow directional and non-directional tense intervals
27 Allow full MEX in subscripts and quantifier selbri
28 Add flag for modal conversions

Change 1:
CURRENT LANGUAGE: Currently, logical connection of operands in the MEX grammar is allowed using EKs. However, JOIKs are not usable in MEX.
PROPOSED CHANGE: Allow JOIKs as well as EKs on the same grammatical level.
RATIONALE:
1) Operands are the formal analogues of sumti, and this change makes operand connection formally identical to sumti connection, so that it can be learned by analogy without a special exception.
2) Ranges ("from 3 to 10") can be easily expressed using lexemes BIhI and GAhO, which are part of the JOIK system. Currently, these can only be expressed by a messy variation on left and right parentheses, which doesn't work well because no separator is defined between the upper and the lower bound.

Change 2:
CURRENT LANGUAGE: Only one EK_KE construction is allowed after a MEX operand. You cannot say "pa .a ke re .e ci ke'e .a ke vo .e mu" to mean "1 or (2 and 3) or (4 and 5)".
PROPOSED CHANGE: Allow more than one consecutive EK_KE construct.
RATIONALE:
1) Same as 1) for Change 1.
2) This change amounts to changing an "operand_C" to an "operand_B". The baselined version was created by incorrectly copying existing text from the pre-baseline grammar, so this change is a "bug fix".

Change 3:
CURRENT LANGUAGE: In expressing intervals with explicit end-markers, the order is BIhI GAhO GAhO, where the first GAhO is the left endpoint and the second one is the right endpoint.
PROPOSED CHANGE: Put the first GAhO before the BIhI.
RATIONALE: Make this form more consistent with the logical connectives like "na.anai", where the marker for the left connectand precedes the connector.

Change 4:
CURRENT LANGUAGE: MEX ranges are handled with GAhO operators attached to mathematical parentheses.
PROPOSED CHANGE: Remove this capability.
RATIONALE: See Change 1. This capability was never correctly specified, because only one expression can appear between parentheses, whereas ranges inherently require two expressions.

Change 5:
CURRENT LANGUAGE: It is possible to specify either NA or SE before lexemes A, JA, GIhA, or ZIhA, but they cannot both be specified unless -NAI follows.
PROPOSED CHANGE: Remove this restriction.
RATIONALE: The intent of a previous change just before the baseline was to allow both NA and SE (in that order) in all cases, not just those where -NAI followed. This ability was accidentally omitted, so this is a "bug fix".

Change 6:
CURRENT LANGUAGE: Lexeme JOI can be converted with SE and negated with NAI like the logical connectives, but the closely related lexeme BIhI cannot.
PROPOSED CHANGE: Allow conversion and negation of BIhI.
RATIONALE: Converted ranges allow "se bi'o" which means "to...from..." and negated ranges allow "bi'inai" which means "not between".

Change 7:
CURRENT LANGUAGE: KI can be used either on an origin specifier or on a time and/or space tense to reset the scope or position of the origin. KI by itself is ungrammatical.
PROPOSED CHANGE: Allow KI by itself. This returns the origin to the physical here and now.
Also allow KI after BAI to set a default aspect value; "BAI KI sumti" sets the BAI aspect to the sumti, and "BAI KI KU" resets the aspect to its default.
RATIONALE: This capability existed in the pre-baseline grammar, and was omitted in error during the tense redesign.

Change 8: *ANNULLED*

Change 9:
CURRENT LANGUAGE: GIhEK_KE constructs have lower priority than basic GIhEKs.
PROPOSED CHANGE: Place GIhEK_KE constructs at the highest priority among GIhEKs.
RATIONALE: This is the scheme used by sumti and operand connection, where EK has the lowest priority (and is left-binding), EK_BO has medium priority (and is right-binding), and EK_KE has highest priority (and is again left-binding). During the split between Institute Loglan and Lojban, sumti were changed to make EK_KE highest priority (and operands followed when MEX was redesigned) but bridi-tails were not changed.

Change 10: *ANNULLED*

Change 11:
CURRENT LANGUAGE: The grouping of PU_mods means that a free modifier at the end of a PU_mod applies to the whole PU_mod rather than just to the tense_modal at the end, whereas free modifiers embedded within the PU_mod refer only to the tense_modals they follow. So "puxipa je puxire", which should mean "past-time-t1 or past-time-t2", instead means "(past-time-t1 or some-past-time)-sub-2". As a result of the proposed change, there would be no way to subscript a conjoint tense, but it is not clear what such subscripts would mean anyhow.
PROPOSED CHANGE: Move the free modifier to tense_modal.
RATIONALE: See CURRENT LANGUAGE section.

Change 12:
CURRENT LANGUAGE: An initial FAhA cannot be followed by space offsets, but only by a space interval (or by nothing at all). Analogously for a ZI in the time system.
PROPOSED CHANGE: Allow FAhA followed by space-time-offsets and ZI followed by time offsets.
RATIONALE: This allows the currently ungrammatical "vizu'a" in the sense of "to the left of a nearby point". "Zu'avi" on the other hand means "a point not far to the left of here". This distinction is subtle, but real.
The change to the time system follows by symmetry, although initial ZI is probably not of much use, since it means "a short/medium/long time distance from now" without specifying either past or future.

Change 13:
CURRENT LANGUAGE: Utterance ordinals using MAI are currently considered indicators, and can appear after any word and get absorbed.
PROPOSED CHANGE: Shift MAI constructs to the more restrictive free modifier group.
RATIONALE: The absorber routines in the parsing program which need to remove non-initial utterance ordinals before YACC sees them have to read an arbitrary number of PA or BY tokens, determine whether the next token is a MAI, and if so absorb, but if not push back all the PA/BY stuff. This requires unbounded pushback capability in the absorber, which is to be avoided. This change was proposed earlier but never consummated. A side effect of this change is that lexer_lexeme_A now flags utterance ordinals only, and the regular indicators (UI, CAI, Y) no longer need lexer flagging. Another side effect is that FUhO, DAhO, and POhA can now be treated as indicators (and PEhA as a forethought indicator like BAhE) rather than with special magic.

Change 14:
CURRENT LANGUAGE: A tense can be prefixed with arbitrary numbers of NAhE tokens.
PROPOSED CHANGE: Allow only one NAhE token at most.
RATIONALE: The compounder needs to read past a potentially infinite number of NAhEs to decide whether what follows is a selbri (which is not compounded) or a tense. If this change is made, the compounder will always be able to decide within 2 tokens whether it has a compound or not. If multiple NAhEs are really needed, the tense can be expanded to use the predicate grammar instead.

Change 15: *ANNULLED*

Change 16: *ANNULLED*

Change 17:
CURRENT LANGUAGE: Logical operators can be represented in either forethought or afterthought (except for tenses and abstractors), as can aspectual (BAI) operators, but the non-logical operators of JOI and BIhI have no forethought versions.
PROPOSED CHANGE: Allow "[SE] JOI GI [NAI]" and "[SE] BIhI GI [NAI]" as new kinds of geks, analogous to the existing "stag GI [NAI]". Forethought would still be disallowed in tanru (no GUhEK equivalent of this) and where the GAhO endpoint markers are required.
RATIONALE: Completeness, plus the fact that natural languages such as English usually represent joiks with forethought constructs ("the union of...and...", "from...to...", etc.). Institute Loglan had only one joik, "ze" (the equivalent of "joi"), so a forethought construction was not felt necessary. The far more elaborate joiks of Lojban can easily be extended to forethought.

Change 18:
CURRENT LANGUAGE: "bu", lexeme BU, has a very restricted use. It can only appear after bare vowels (lexemes A, I, and Y) to create the lerfu for those vowels.
PROPOSED CHANGE: Allow "bu" after any (lexable) word whatever, to create something equivalent to lexeme BY. Remove the ZAI...FOI construct for change of character set, as well as the TEI construct. LAU is kept and extended to hold all lerfu prefixes, including "zai" to specify character set and "tau" to force a next-lerfu shift. Composite symbols are now represented by TEI letteral ... FOI, which has the grammar of a single letteral. All the various kinds of letterals are now also allowed in numbers (although not initially).
RATIONALE: This allows the creation of a bunch of new lerfu. The Latin and Greek alphabets can be more readily accommodated; for example, "q" could have "kybu" as its lerfu. Lerfu for the digits become possible; for example "pabu" would be the digit 1, as opposed to the number 1. The ZAI...FOI construct is meant to specify new character sets, but requires spelling out the name of the character set in lerfu, for example "zai dy ebu vy abu ny abu gy abu ry ibu foi" to enable Devanagari mode. This is ugly. Using the new flexibility of "bu", we can say "zai .devanagar. bu" instead. (The pauses are needed for morphological reasons.)
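
The effect of the Change 18 rule on a token stream can be sketched roughly as follows. This is only an illustration in Python, not the actual lexer: the function name absorb_bu, the (selma'o, text) token shape, and the handling of a stray or repeated "bu" are all assumptions made for the example.

```python
# Hypothetical sketch: fold "<word> bu" into a single BY-class token.
# Names and token shape are illustrative, not taken from the real parser.
from typing import List, Tuple

Token = Tuple[str, str]  # (selma'o, text)


def absorb_bu(tokens: List[Token]) -> List[Token]:
    """Merge each BU token into the word before it, yielding a BY token."""
    out: List[Token] = []
    for selmaho, text in tokens:
        if selmaho == "BU" and out:
            # "bu" turns the immediately preceding word into a lerfu (BY).
            _, prev_text = out.pop()
            out.append(("BY", prev_text + " bu"))
        else:
            # Any other token (or a "bu" with nothing before it, an
            # assumption of this sketch) passes through unchanged.
            out.append((selmaho, text))
    return out


# "ky bu" (a lerfu for "q") and "pa bu" (the digit 1, not the number 1)
print(absorb_bu([("BY", "ky"), ("BU", "bu"), ("PA", "pa"), ("BU", "bu")]))
```

Doing the merge in a single left-to-right pass keeps the lookahead bounded at one token, which is in the spirit of the other parser-oriented changes in this proposal.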

Change 19:
CURRENT LANGUAGE: There is a special category of predicates called "MEX relations" which have special grammar; they represent mathematical relations. MEX relations are also used to specify the precedence of MEX operators using a free-modifier construct starting with TIhO.
PROPOSED CHANGE: Assimilate MEX relations to ordinary predicates. Eliminate the special TIhO grammar, and move "ti'o" itself to selma'o SEI.
RATIONALE: MEX relations as defined cannot be logically connected, and they overlap ordinary predicates. The only MEX relation cmavo defensible on Zipfean grounds is "du", which is moved to selma'o GOhA. The grammar of TIhO was an attempt to allow a "smart" parser to understand MEX operator precedence declarations, since they override the default grammar of MEX. This grammar was not well thought out, and it seems better to allow any selbri after "ti'o", the same grammar as for SEI. Smart parsers will just have to do the best they can.

Change 20:
CURRENT LANGUAGE: ZAhOs cannot stand alone in interval modifiers in tenses. An interval modifier currently consists of ROI/TAhE constructs with interspersed ZAhOs.
PROPOSED CHANGE: Allow ZAhOs to stand alone or to come first in complex interval modifiers.
RATIONALE: Stand-alone ZAhO has a clear meaning and should be allowed. The only remaining restriction is that consecutive ROI/TAhE options are not permitted without at least one intervening ZAhO.

Change 21: *ANNULLED*

Change 22:
CURRENT LANGUAGE: CU is treated like the elidable terminators, although it does not terminate.
PROPOSED CHANGE: Make CU optional (handled directly by grammar rules) rather than elidable. This does not change the grammatical status of any text; it is simply a change to internal mechanisms in the parser.
RATIONALE: This change assists correct error recovery in the parser.

Change 23:
CURRENT LANGUAGE: FAhO is currently an elidable terminator for the end of text.
Text refers not only to an entire expression, but also to quoted (with LU/LIhU) and parenthesized (with TO/TOI) material. In addition, FAhO is allowed at the end of a very-long-scope sentence group marked with TUhE/TUhU. In all three of these cases, the FAhO is redundant to the regular elidable terminator.
PROPOSED CHANGE: Treat FAhO extra-grammatically as an overriding end of parsable input. It would no longer be allowed at the end of quoted or parenthesized text.
RATIONALE: This change assists correct error recovery in the parser. It is also closer to the original spirit of FAhO, which was intended to assist mechanical Lojban users in determining when to terminate input (similar to the RETURN or ENTER keys in more conventional programs).

Change 24: *ANNULLED*

Change 25:
CURRENT LANGUAGE: The flag "pe'o", selma'o PEhO, is currently used to mark forethought MEX operators.
PROPOSED CHANGE: Make "pe'o" optional.
RATIONALE: "pe'o" is not needed to keep the grammar unambiguous, but may still be helpful as a heuristic to avoid confusing human readers.

Change 26:
CURRENT LANGUAGE: A time interval specified with ZEhA must be preceded by a time direction specified with PU. An interval without a direction looks like an origin specification and is not allowed. Space intervals have a similar restriction.
PROPOSED CHANGE: Allow bare ZEhA as a time interval and bare VIhA, VEhA or VEhA+VIhA as a space interval, with no directions specified. Remove origin specifications from the language. If a direction is wanted for the interval, allow it after the interval word. Remove origin-size specifications as a special mechanism; origin sizes are set using KI, just like origin locations.
RATIONALE: There is no logical reason why intervals must have a direction. The sentence "mi ve'ivi'u xadni" meaning "I small-interval-ly-three-dimensionally am-a-body" is ungrammatical without this change, but is perfectly sensible.

Change 27:
CURRENT LANGUAGE: The current use of full mathematical expressions is limited to two areas: after LI to form sumti, and as quantifiers. In the latter use, parentheses must be used around any MEX other than a simple number. Simple numbers and letteral-strings can also be used in some other places: with -MOI to form selbri, with -MAI to form utterance ordinals, with -ROI to form quantified tenses, and after XI to make subscripts.
PROPOSED CHANGE: Allow richer expressions after XI and before MOI. The grammar of XI is extended to allow XI VEI mex /VEhO/ as a subscript, allowing any MEX within parentheses (same rule as for quantifiers). With -MOI, the rule has to be more complex, since simply allowing any quantifier + MOI produces conflicts. Instead, we extend the syntax of ME-conversion so that ME sumti /MEhU/ MOI is a legal kind of selbri. Typically the sumti will be either a LI construct, a "le ni" construct, or some kind of anaphora.

Change 28:
CURRENT LANGUAGE: Official doctrine states that the sumti tcita of a bridi constitute nonstandard places which are co-equal with the regular numbered places. However, there is no way to make these places the subject of a description by moving them into a numbered (specifically, the x1) place.
PROPOSED CHANGE: Add JAI+tag as the equivalent of a SE conversion. (JAI is a new selma'o.) This is usable only on selbri, not in the other places where SE is legal. The result is that the tcita sumti comes to occupy the x1 place, and the original x1-x5 places are "pushed down" to x2-x6.
RATIONALE: It is currently messy to say "the time of my going to the store"; this looks like an abstraction, but does not match any existing abstractor. It can be handled quite neatly with "le jai ca klama be mi bei le zarci". In particular, when a place is meant to be an abstract sumti, and a concrete sumti appears ("sumti raising"), these JAI-based descriptors provide sumti access to modal places as well as standard ones.
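
The place shift described for JAI+tag in Change 28 can be illustrated with a small sketch. This is a hypothetical Python model, not part of the grammar or the parser: the function name jai_convert and the English place labels for "klama" are informal glosses chosen for the example.

```python
# Hypothetical sketch of the place shift described for JAI+tag.
from typing import List


def jai_convert(places: List[str], tag_place: str) -> List[str]:
    """Return the place list after a JAI+tag conversion.

    The tagged (tcita) place becomes x1 and every original place is
    pushed down by one, so x1-x5 become x2-x6.
    """
    return [tag_place] + list(places)


# Informal glosses of the five places of "klama" (assumed labels).
klama = ["goer", "destination", "origin", "route", "vehicle"]

# "jai ca klama": the time tagged by "ca" moves into x1.
for i, label in enumerate(jai_convert(klama, "time (ca)"), start=1):
    print(f"x{i}: {label}")
```

Read against "le jai ca klama be mi bei le zarci", the description picks out the new x1 (the time), while "be mi bei le zarci" fills the pushed-down x2 (goer) and x3 (destination).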