From wisdom!uunet.uu.net!grebyn!lojbab Tue Jun 2 20:13:00 1992
Return-Path:
Received: by hourglas (/\==/\ Smail3.1.25.1 #25.1) id ; Tue, 2 Jun 92 20:12 EDT
Received: from uunet.uu.net by wisdom.bubble.org id aa00449; Mon, 1 Jun 92 19:49:46 EDT
Received: from uunet.uu.net (via LOCALHOST.UU.NET) by relay1.UU.NET with SMTP (5.61/UUNET-internet-primary) id AA27680; Mon, 1 Jun 92 04:41:21 -0400
Received: from grebyn.UUCP by uunet.uu.net with UUCP/RMAIL (queueing-rmail) id 044021.2346; Mon, 1 Jun 1992 04:40:21 EDT
Received: by grebyn.com (5.57/smail2.3/07-01-87) id AA20767; Mon, 1 Jun 92 04:16:47 -0400
Received: by daily.grebyn.com (5.57/UUCP-Project/02.16.86-kan-10.20.91) id AA08743; Mon, 1 Jun 92 04:17:39 -0400
Date: Mon, 1 Jun 92 04:17:39 -0400
From: Logical Language Group
Message-Id: <9206010817.AA08743@daily.grebyn.com>
To: erikr@hourglas.UUCP, grebyn!sgi.com!rc@uunet.uu.net

Is Lojban Scientifically Interesting?

David Pautler (pautler@ils.nwu.edu) challenged the scientific relevance of artificial languages (ALs). The following is lojbab's (Bob LeChevalier's) response.

David wrote:

> I did not say that ALs have no good use. I said there's
> nothing particularly interesting about them (from a
> scientific viewpoint ...) *because* they're artificial.
> Some interesting sociological behaviors may appear if these
> languages come into widespread use, perhaps even some
> interesting linguistic phenomena if enough spontaneous
> innovation occurs (although AL enthusiasts seem to want to
> prevent this). But there certainly doesn't appear to be
> anything interesting about them now, because AL enthusiasts
> in this group prefer to argue over which of several (truly
> arbitrary) conventions are "better".
> I am willing to admit I am wrong about all this if some of
> you AL enthusiasts can give the rest of us some good reasons
> why ALs *are* scientifically interesting.

and later added in clarification:

> I still believe that knowing the design principles of any
> system beforehand makes a scientific study of those
> principles silly...

lojbab's response:

The added comment definitely clarifies the problem, especially since it removes the loaded topic 'AL' from the question. I will answer primarily from the standpoint of Lojban, though some of my points are applicable to Esperanto and other ALs.

David is taking a very limited view of science, to presume that the design principles of a system are the only interesting thing about that system to a scientist. I can see a few other possibilities:

a) in a highly complex system (which even an AL is), the interaction of the design features displays properties that are 'more than the sum of the parts'. Thus it is possible that all language is merely a system comprised of a bunch of neurons releasing neurotransmitters. Biochemistry may eventually devise a complete explanation for the neuronic process (including genetic components), and we may then say we "know the design principles of the system".
But we won't know the system, because the complexity of those neuronic interactions is so great that knowing the pieces does not give a total understanding of the system. This indeed may be what defines the concept of 'system'.

Knowing all the prescribed rules of an AL does not tell you how that AL is used communicatively, and I don't mean in the sociological sense. A sample question: Given multiple ways of communicating the same idea, do users of the language choose particular forms over others, and why? This is similar to a question that presumably is commonly asked about natural languages. I can come up with many other sample questions of science that can be applied to the system of an AL that are not compromised by 'knowing the design', but let's move on. (Feel free to ask, though.)

b) A simpler system, which can be more fully understood, may serve as an excellent model for a less understood, more complex system. Thus the simpler system could be examined for parallels to hypotheses about the more complex system. Examination of the simpler system may suggest properties to look for in the more complex system, or it may even suggest hypotheses that can be tested in the more complex system.

A 'hot' topic in parts of the Lojban community is whether the language has, or should have, an underlying semantic theory. If one exists, it is certainly not as developed or prescribed as the syntactic design and theory. Filtering out syntactic ambiguity allows a more direct examination of semantic ambiguities, including the properties of modification and restriction, resolution of anaphora, and identification of ellipses. Any semantic theories proposed for natural language can be looked at in terms of semantic usage in the simpler Lojban system.
As a 'model of a natural language', it seems likely that any theory NOT true of Lojban is at least suspicious with regard to natural language, thus allowing partial verification of theories (not complete - I would never say that ALs should be studied to the exclusion of natural languages, but rather in relation to them). If the theory is true of natural language, then you have found evidence that Lojban is in some way unnatural. Then you try to explain which of the (fully-known) design features of Lojban causes this unnaturalness. By counterexample, that design feature is not a feature of natural languages. You've learned something about natural language by studying an artificial one.

As another example, pragmatic effects can be more easily recognized in the simpler Lojban system, and can be clearly identified as pragmatic. Thus, insights about pragmatic effects may be more visible in Lojban, insights that would then be tested in the natural languages.

c) Another aspect of a simple system is that it is easier to perform experiments on than a more complex system. There are fewer variables, and if the system is 'designed', some things that are variables in complex systems are in effect tunable constants in the simple, carefully-designed system. You can then rerun the experiment with minor changes to explore the effects of those variables. Experimental linguistics of this sort is virtually unthinkable with the natural languages.

The Sapir-Whorf Hypothesis is not really testable in the natural languages, since we can't control any variables and we don't know which features of a language might shape a culture. Sapir-Whorf may be more testable when you can reduce or even control the variables with a language like Lojban.

Let me be specific: Lojban is a predicate language, with no nouns, verbs, or adjectives. What are the linguistic (communicative) properties of such a system? The answer has been partially explored through symbolic logic.
But do people thinking linguistically in any way mimic the processes of formal logic? What effects would a formal-logic-based language have on those linguistic thinking processes? Is the resulting language susceptible to the same analysis as natural language in terms of the various formal systems that have been developed by linguists over the past few decades?

Given that natural language processing in computers usually involves converting natural language to some kind of predicate form in which deductions can be made, the validity of predicate logic as a tool for such analysis is already accepted. But how do you identify the logical deductions that a human being makes from a natural language statement? If thinking in Lojban, the human is already thinking using predicate logic structures, so that the deduction process is much more plain.

Let me pose an experiment. Take even a few children during the critical period of language learning and teach them this artificial language (at the same time as they learn their traditional language). Do they become truly bilingual? If they are as fluently communicative in the AL as they are in their natural language, then the AL is a suitable linguistic model. Then, ANY theory of language that cannot extend to cover the features of the AL is inadequate.

You could perform a series of experiments with ever more exotic artificial languages (obviously you need new speakers for each test). Sooner or later, either the model breaks and the AL is no longer acquirable by children and/or communicative as a language, or the theory breaks, and you've learned where to look for improvements in the theory.

With only natural languages, you have to devise theories based on the available data, and then go look in other natural languages for confirmation or refutation.
But this isn't the optimal kind of experimentation, because you really cannot plan the experiment or control the variables (the other language may have the same apparent feature through a totally different process that you won't recognize because you aren't looking for it).

A language like Lojban is an ideal test bed for such experimentation, because it is flexible; you can evolve slightly different versions of the language very easily by simply changing some features. Forbid a given construct in the prescription, and do not teach it to a child. Does the child develop that construct anyway by analogy to other languages known, or does the child successfully adapt to whatever other processes you've designed into the language instead of the construct? It seems that all manner of linguistic universals could be investigated in this way.

My remaining points are not necessarily specific to the 'system' nature of a language, but deal with David's original question on whether artificial languages are scientifically interesting. In general they rely on the assumption argued above that a model of a system is valuable for learning about the system.

d) I've mentioned only child learning as revealing the essential nature of language, because this is what many linguists concentrate on. But there are also the important applied-linguistics problems of teaching foreign languages. It is much easier to test a method or theory of vocabulary teaching/learning with an artificial language than with a natural language; I don't think the statement that ALs are more quickly (I didn't say easily - which is a subjective question) learned than NLs is particularly controversial; there have been experiments verifying this in the literature for decades. The pragmatic problems of language learning are alone justification for research using ALs.

But ALs may provide the solution as well as the means of testing.
It seems to be well accepted that in learning a second language and then learning a third, you learn the third MUCH more quickly than the second. The example I've heard is this: Assume that it takes 4 years to learn French and then 2 to learn German thereafter (or vice versa). Let us assume that you can learn an artificial language in 1 year to a degree comparable to that to which you can learn French. Then you can learn the AL and German in 3 years (1 + 2) instead of the 4 needed for German alone, and all three languages in 5 years (1 + 2 + 2) instead of the 6 (4 + 2) needed for French and German alone. This gains a year EVEN IF YOU NEVER AGAIN USE THE AL. I don't claim this example as a fact - it should be easily testable in a controlled experiment, and this seems much more scientific than arguments about which ALs and NLs are 'easier to learn'.

e) Lojban has one feature designed to explore a less-understood aspect of language - the expression of emotion. Lojban allows expressive communication of emotions in words without suprasegmentals (in this it is presumably unlike all natural languages, though not entirely, since many languages have a limited set of indicators of attitude in the form of interjections and some discursive function words, e.g. 'but'). Can human beings manipulate the symbols of emotion in the same way they manipulate the comparable symbols of non-emotional expression? There is a whole range of experimental questions raised by this design element, probably the most 'unnatural' element of Lojban's design.

f) The latter points to the one other aspect of a well-designed artificial language of scientific interest and value to linguistics - as a tool of analysis. I present an example, based on the 1991 Scientific American Library book The Science of Words, by George A. Miller of Princeton. In the book, a picture caption notes that Nootka (a Pacific Northwest language) has the single word:

    "inikwihl'minih'isit"

meaning the equivalent of the entire English sentence "Several small fires were burning in the house."
I won't presume to know any more about Nootka than I've just told you, but in Lojban, I can express that sentence paralleling the English:

    so'i  cmalu  fagri  puca       jelca    vine'i     le   prezda
    Many  small  fires  were-then  burning  at-within  the  person-nest.

and analytically as a single word (though not with the same structure as Nootka)

    prezdane'ikemcmafagyso'ikemprununje'a
    person-house-inside-type_of-small-fire-many_some-type_of-previous-burning

(Yes, I can say it!)

Actually, according to Miller, the Nootka breaks down as:

    inikw      -ihl          -'minih  -'is        -it
    fire/burn  in-the-house  plural   diminutive  past-tense

This order is also expressible in Lojban:

    fagykemprezdanerso'icmapru
    fire-type_of-person-nest-inside-many_some-small-past_thing/event

I don't know which of the two orders more accurately conveys how the Nootka speaker thinks of the concept expressed by the word, or whether others would be better still. The Lojban in either case more accurately tracks the semantics of the Nootka, demonstrating the inadequacy of the English - the actual word as broken out did not require two separate particles for fire and burn as did the English equivalent, and the English translation used the more complicated tense "were-burning" instead of the simpler, and presumably more accurate "burnt". (I'll plainly admit that I'm relying on the given explanations by Miller, which are in English, but it seems clear that in translating the word-sentence into English there is a considerable ambiguity introduced.)

I won't claim that Lojban can express everything in the natural form of any language. Lojban has a less-marked syntactic word-order, and expressing other orders requires marking particles that would not be found in the source language. Thus there is a tradeoff between semantic representation and syntactic representation.
Still, I think a convincing case can be made that, as a predicate language, Lojban is a much more effective tool for studying both the forms and semantics of other languages than is English, which has its own cultural, syntactic and semantic complexities to gum up the analysis. This is especially true for analysis by linguists who are not native speakers of English - if there is any place where there is a justification for an international, minimal-culture language, it is when linguists from different native language backgrounds try to perform and communicate their linguistic analyses.

g) There is also the 'other' tool aspect of an artificial language, in computer and artificial intelligence (AI) applications. I mentioned the similarity in c) above between Lojban and the internal representations used in natural language processing by computers. A predicate language like Lojban should be especially amenable to AI processes - the programmers are familiar with predicate language expression and manipulation, and often store the data in predicate form internally for manipulation. With Lojban, such storage becomes a fairly trivial process.

If Lojban is proven by experiment (per above) to have the systemic properties of a natural language, and is easier to implement in computational linguistics research problems, it serves as a tool to bridge those two disciplines, leading to more rapid and effective natural language processing. But only if it is tried. Even if it proves less than ideal, I have little doubt that study of natural language using computational linguistic techniques and a Lojban-based tool will be productive in ways not possible with any natural language. (In effect, this argument is the same as f), except that instead of two different-natural-language speakers trying to communicate about language, you have a human and a computer, who obviously speak different native languages, trying to communicate.)

h) A highly prescribed language is an ideal test bed for examining the processes of language evolution. In the case of an AL like Lojban, as the speaking community in each culture grows, you can observe how the language creolizes in contact with the natural languages of those cultures. Because of the speed of learning, artificial languages should tend to show effects more quickly (by being mastered to a communicative level more quickly). Anecdotal evidence about Esperanto supports this idea.

Does this mean that the conclusions are absolutely valid for natural language evolutionary processes? I don't claim so. But again, we are performing experiments with a model, somewhat idealized, of a natural language. Unlike a paper-theoretic model (as all linguistic theories must inherently be), this is a model that can be experimented with using live speakers. Provided that we understand the model as it evolves, that understanding comes to approximate an understanding of natural language more and more closely as time goes on.

i) The large majority of languages have some degree, more or less, of prescription. In addition, some 'natural' languages, like modern Hebrew, formal Swahili, and some standardized dialects (e.g. Mandarin, which has been noted as being related to but not identical to the Beijing dialect), are not all that far from being true artificial languages, but are much more interesting to linguists. A predominantly prescribed language would seem an especially effective tool for studying the effects of prescription on language development and use (again, I refer to linguistic and not sociological effects). Such studies may aid in first-language education as well as second-language acquisition. They may also aid in analyzing the development of different registers (usages based on social class and situation) of a single language: such registers can be interpreted as reactions to prescriptive environments that constrain language use.

None of these scientific applications of Lojban inherently requires a large fluent body of speakers, or any solely-native speaker of that tongue. If any of the less scientific applications of Lojban serve to justify it developing such a speaker base, the nature of Lojban's usefulness as a model will change. New applications, as yet not really predictable, will turn up, aided by our no doubt increased understanding of language. But the model, even if well understood, will no longer be as simple, and new Loglans and other experimental linguistic tools, all artificial languages, will be developed to take the next step.

I have hopefully given a bit of food for thought, even with only a few hours' preparation. I have also thought about this only as something of an outsider to the profession of linguistics. With a different point of view, others should be able to find many more questions of scientific interest using an AL like Lojban either as a model, an experimental test bed, or a tool. And if even a small fraction of these ideas are useful, then ALs have a valid scientific role in linguistics.