WikiDiscuss


PEG Morphology Algorithm

posts: 14214
Use this thread to discuss the PEG Morphology Algorithm page.
posts: 14214

This grammar classifies words by their morphological class (cmene, gismu, lujvo, fuhivla, cmavo, and non-lojban-word). It does not sort them into grammatical classes (CMENE, BRIVLA, A, BAI, BAhE, ..., ZOhU).

Why not? Mine certainly does.

-Robin

posts: 1912

Robin:
> Re: PEG Morphology Algorithm
> ''This grammar classifies words by their morphological class (cmene, gismu,
> lujvo, fuhivla, cmavo, and non-lojban-word). It does not sort them into
> grammatical classes (CMENE, BRIVLA, A, BAI, BAhE, ..., ZOhU).''
>
> Why not? Mine certainly does.

Mainly to save myself a lot of typing. :-)

Also, because I want to keep separate the two things which are
conceptually different. The transition to grammatical classes
could be done as follows:

CMENE <- cmene
BRIVLA <- gismu / lujvo / fuhivla
A <- &cmavo (a / e / o / u / j i)
....
ZOhU <- &cmavo z o h u
CMAVO <- !A !BAI ... !ZOhU cmavo

(I am accepting commas anywhere, things like {b,roda}, {co,i} etc.
I'm not clear on what the official comma rules outside of cmene are.)

Of course, an actual parser can jump over some steps in order to
be more efficient, but this is intended to be conceptually clear
rather than efficient.
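
The two-stage idea above can be sketched in Python. This is only an
illustration, assuming the morphological class has already been decided by
some earlier step. The names `SELMAHO` and `grammatical_class` are
hypothetical, and the table holds only the two selma'o spelled out in the
rules above; the classes elided there are left out here too.

```python
# Hypothetical lookup table: only the two selma'o spelled out in the post.
# The apostrophe is written "'" here; the PEG rules above write it as "h".
SELMAHO = {
    "A": {"a", "e", "o", "u", "ji"},
    "ZOhU": {"zo'u"},
}

BRIVLA_CLASSES = {"gismu", "lujvo", "fuhivla"}


def grammatical_class(word, morph_class):
    """Map a word with a known morphological class to a grammatical class."""
    if morph_class == "cmene":
        return "CMENE"
    if morph_class in BRIVLA_CLASSES:
        return "BRIVLA"
    if morph_class == "cmavo":
        for selmaho, members in SELMAHO.items():
            if word in members:
                return selmaho
        # Fallback, in the spirit of CMAVO <- !A !BAI ... !ZOhU cmavo
        return "CMAVO"
    return None  # non-lojban-word
```

Keeping the table separate from the morphology step mirrors the point made
above: the two layers stay conceptually distinct, at the cost of a second
pass over each word.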

mu'o mi'e xorxes






posts: 1912


I have the rule:

CVV-rafsi

posts: 1912


> Re: PEG Morphology Algorithm
>
> I have the rule:
>
> CVV-rafsi

I don't know what happened to the rest of what I had written...

Let's try again:

CVV-rafsi <- consonant vowel h? vowel

which allows for example {voe} as a rafsi. This is like
{vo'e}, of rafsi form but not actually assigned.

The alternative would be to make the rule:

CVV-rafsi <- consonant (vowel h vowel / a i / a u / e i / o i)

So for example {voebra} will then be rejected as a lujvo form,
but allowed as a fuhivla.

The choice has more drastic consequences for other words:

If {voebra} is of lujvo form, then {zvoebra} fails the slinku'i
test and so is not a valid fuhivla, otherwise, it is a valid fuhivla.

Conversely, if {voebra} is a valid lujvo, so is {tozvoebra}.
Otherwise {tozvoebra} breaks down as {to zvoebra}.

My preference at the moment is to allow these pseudo-rafsi and
pseudo-lujvo because that makes the rules simpler. Any other
opinions?
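
For comparison, the two candidate CVV-rafsi rules can be written as regular
expressions. This is a rough sketch, not the PEG itself: the pattern names
and the helper `is_cvv_rafsi` are made up for illustration, and the
apostrophe is written "'" where the rules above write "h".

```python
import re

C = "[bcdfgjklmnprstvxz]"  # Lojban consonants
V = "[aeiou]"              # vowels allowed inside a rafsi

# First rule: consonant vowel h? vowel
PERMISSIVE = re.compile(rf"{C}{V}'?{V}")
# Alternative: consonant (vowel h vowel / a i / a u / e i / o i)
STRICT = re.compile(rf"{C}(?:{V}'{V}|ai|au|ei|oi)")


def is_cvv_rafsi(text, strict):
    """Check whether text has CVV-rafsi form under the chosen rule."""
    pattern = STRICT if strict else PERMISSIVE
    return pattern.fullmatch(text) is not None
```

Under this sketch, `is_cvv_rafsi("voe", strict=False)` is true and
`is_cvv_rafsi("voe", strict=True)` is false, while {vo'e} passes both.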

mu'o mi'e xorxes








On Thursday 16 December 2004 18:51, Jorge "Llambías" wrote:
> --- wikidiscuss@lojban.org wrote:
> > Re: PEG Morphology Algorithm
> >
> > I have the rule:
> >
> > CVV-rafsi
>
> I don't know what happened to the rest of what I had written...
>
> Let's try again:
>
> CVV-rafsi <- consonant vowel h? vowel
>
> which allows for example {voe} as a rafsi. This is like
> {vo'e}, of rafsi form but not actually assigned.
>
> The alternative would be to make the rule:
>
> CVV-rafsi <- consonant (vowel h vowel / a i / a u / e i / o i)
>
> So for example {voebra} will then be rejected as a lujvo form,
> but allowed as a fuhivla.
>
> The choice has more drastic consequences for other words:
>
> If {voebra} is of lujvo form, then {zvoebra} fails the slinku'i
> test and so is not a valid fuhivla, otherwise, it is a valid fuhivla.
>
> Conversely, if {voebra} is a valid lujvo, so is {tozvoebra}.
> Otherwise {tozvoebra} breaks down as {to zvoebra}.
>
> My preference at the moment is to allow these pseudo-rafsi and
> pseudo-lujvo because that makes the rules simpler. Any other
> opinions?

valfendi does not allow {voe} as a rafsi, thus {voebra} is a fu'ivla,
{zvoebra} is also a fu'ivla, and {tozvoebra} breaks up. I put the whole table
of adjacent characters in it.

The same applies if the second of these three-letter groups has a non-lujvo
vowel pair: {kankua} is a fu'ivla, {ckankua} is a fu'ivla (it means "skunk"),
and {packankua} breaks up.

phma
--
li ze te'a ci vu'u ci bi'e te'a mu du
li ci su'i ze te'a mu bi'e vu'u ci


posts: 1912


I'm allowing a "y" after any CVC-rafsi no matter what consonant
follows. So I allow {selyma'o} as well as {selma'o}.

I'm not sure if there was a rule against this, but the restriction
is not required for unambiguity, and implementing it would
complicate the rules enormously, so I'm not doing it.

mu'o mi'e xorxes

posts: 1912


> The same applies if the second of these three-letter groups has a non-lujvo
> vowel pair: {kankua} is a fu'ivla, {ckankua} is a fu'ivla (it means "skunk"),
> and {packankua} breaks up.

OK, I'm modifying my rules to bring them in line with that.

Another question:

CLL says:

(1) "It is always legal to use the apostrophe (IPA h) sound in
pronouncing a comma."

(2) "Commas are never required: no two Lojban words differ solely
because of the presence or placement of a comma."

(3) "There exist 16 diphthongs in the Lojban language. ... Diphthongs
always constitute a single syllable."

This seems to lead to a contradiction. By (1), {kanku,a} can be
pronounced like {kanku'a}; by (2), the comma is not required, so
{kanku,a} is equivalent to {kankua}. How many different words are
there among {kanku'a}, {kanku,a} and {kankua}, and which are they?

mu'o mi'e xorxes








On Thursday 16 December 2004 19:33, Jorge "Llambías" wrote:
> Another question:
>
> CLL says:
>
> (1) "It is always legal to use the apostrophe (IPA h) sound in
> pronouncing a comma."
>
> (2) "Commas are never required: no two Lojban words differ solely
> because of the presence or placement of a comma."
>
> (3) "There exist 16 diphthongs in the Lojban language. ... Diphthongs
> always constitute a single syllable."
>
> This seems to lead to contradiction. If {kanku,a} can be
> pronounced like {kanku'a} by (1) and the comma is not required
> by (2) so it is equivalent to {kankua}. How many different
> words are among {kanku'a}, {kanku,a} and {kankua}, and which
> are they?

(1) is a leftover from Loglan, or a confusion because of the Loglan
compatibility orthography, or something like that, and is incorrect.
Diphthongs are pronounced as one syllable, unless there is a comma between
the vowels, but the comma makes no difference to whether it's a valid word,
or which one (the question of which one is moot with cmene, since they can be
polysemous). So {kankua} and {kanku,a} are the same word, though stressed on
different syllables, and {kanku'a} is different. I asked the same question
when I was developing valfendi.
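
A minimal sketch of that position, assuming that word identity is decided
purely on the comma-free spelling (the helper name `same_word` is
hypothetical and is not taken from valfendi):

```python
def same_word(a, b):
    """Compare two spellings, treating commas as irrelevant to identity."""
    return a.replace(",", "") == b.replace(",", "")
```

With this, {kankua} and {kanku,a} compare equal, while {kanku'a} differs
from both because the apostrophe is a letter, not punctuation.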

phma
--
S Fa1>+/- !TM M-- K H T-- t? AT++ SY Te- SC- FO- D P !Tz E++ L


posts: 1912


> >
> > (1) "It is always legal to use the apostrophe (IPA h) sound in
> > pronouncing a comma."
> >
> > (2) "Commas are never required: no two Lojban words differ solely
> > because of the presence or placement of a comma."
> >
> > (3) "There exist 16 diphthongs in the Lojban language. ... Diphthongs
> > always constitute a single syllable."
>
> (1) is a leftover from Loglan, or a confusion because of the Loglan
> compatibility orthography, or something like that, and is incorrect.

OK, good.

> Diphthongs are pronounced as one syllable, unless there is a comma between
> the vowels, but the comma makes no difference to whether it's a valid word,
> or which one (the question of which one is moot with cmene, since they can be
> polysemous). So {kankua} and {kanku,a} are the same word, though stressed on
> different syllables, and {kanku'a} is different.

What happens with things like {prua} vs. {pru,a}?
{prua} can't be a valid fuhivla because it has only one syllable,
but {pru,a} seems to fill all the requisites for valid fuhivlahood.

> I asked the same question
> when I was developing valfendi.

Yes, I vaguely remember, but I didn't remember the answers.

mu'o mi'e xorxes





posts: 14214

On Thu, Dec 16, 2004 at 04:12:04PM -0800, wikidiscuss@lojban.org wrote:
> Re: PEG Morphology Algorithm
>
> I'm allowing a "y" after any CVC-rafsi no matter what consonant
> follows. So I allow {selyma'o} as well as {selma'o}.
>
> I'm not sure if there was a rule against this, but the restriction
> is not required for unambiguity,

*UUHHHH*.


What you just wrote is "se ly ma'o", so far as I can tell.

-Robin


On Friday 17 December 2004 02:54, Robin Lee Powell wrote:
> On Thu, Dec 16, 2004 at 04:12:04PM -0800, wikidiscuss@lojban.org
>
> wrote:
> > Re: PEG Morphology Algorithm
> >
> > I'm allowing a "y" after any CVC-rafsi no matter what consonant
> > follows. So I allow {selyma'o} as well as {selma'o}.
> >
> > I'm not sure if there was a rule against this, but the restriction
> > is not required for unambiguity,
>
> *UUHHHH*.
>
> What you just wrote is "se ly ma'o", so far as I can tell.

No it's not. {se ly ma'o} requires a pause between {ly} and {ma'o}.

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


On Thursday 16 December 2004 21:29, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > Diphthongs are pronounced as one syllable, unless there is a comma
> > between the vowels, but the comma makes no difference to whether it's a
> > valid word, or which one (the question of which one is moot with cmene,
> > since they can be polysemous). So {kankua} and {kanku,a} are the same
> > word, though stressed on different syllables, and {kanku'a} is different.
>
> What happens with things like {prua} vs. {pru,a}.
> {prua} can't be a valid fuhivla because it has only one syllable,
> but {pru,a} seems to fill all the requisites for valid fuhivlahood.

The commas are ignored, so {pru,a} is invalid, but {prae} is valid. valfendi
currently says that {prua} is invalid but {pru,a} is valid, which is a bug.
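
The intended behaviour can be sketched as a syllable count that drops commas
first. This is an assumption-laden illustration, not valfendi's code: the
name `syllable_count` is hypothetical, diphthongs are paired greedily from
left to right, and syllabic consonants are ignored.

```python
# The 16 diphthongs mentioned in the CLL quotation earlier in the thread.
DIPHTHONGS = {
    "ai", "ei", "oi", "au",
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
    "iy", "uy",
}
VOWELS = set("aeiouy")


def syllable_count(word):
    """Count vowel nuclei after removing commas (greedy diphthong pairing)."""
    bare = word.replace(",", "")
    count = 0
    i = 0
    while i < len(bare):
        if bare[i] in VOWELS:
            count += 1
            # A diphthong is one nucleus, so skip both of its letters.
            i += 2 if bare[i:i + 2] in DIPHTHONGS else 1
        else:
            i += 1
    return count
```

Here {prua} and {pru,a} both count as one syllable, so both fail a
two-syllable minimum, while {prae} counts as two.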

phma
--
Mes règles mensuelles ont lieu une fois par an.
-Les Perles de la médecine


wikidiscuss@lojban.org scripsit:
> Re: PEG Morphology Algorithm
>
> I'm allowing a "y" after any CVC-rafsi no matter what consonant
> follows. So I allow {selyma'o} as well as {selma'o}.
>
> I'm not sure if there was a rule against this, but the restriction
> is not required for unambiguity, and implementing it would
> complicate the rules enormously, so I'm not doing it.

Currently there is no such thing as an optional y-hyphen; all hyphens
(both -y- and -n-/-r-) are either required or forbidden. Whether
this matters depends on whether your grammar is intended to be
definitional (in which case it has to get this right) or only an
implementation (in which case it is allowed to have bugs).

--
John Cowan <jcowan@reutershealth.com> http://www.ccil.org/~cowan
"One time I called in to the central system and started working on a big
thick 'sed' and 'awk' heavy duty data bashing script. One of the geologists
came by, looked over my shoulder and said 'Oh, that happens to me too.
Try hanging up and phoning in again.'" --Beverly Erlebacher


posts: 1912


> > I'm allowing a "y" after any CVC-rafsi no matter what consonant
> > follows. So I allow {selyma'o} as well as {selma'o}.
> >
> > I'm not sure if there was a rule against this, but the restriction
> > is not required for unambiguity,
>
> *UUHHHH*.
>
> What you just wrote is "se ly ma'o", so far as I can tell.

That requires a pause after ly. Otherwise, {selyli'a} would
be {se ly li'a} too. Since CVCy is obligatory with some following
consonants, it won't be ambiguous if it's allowed with any following
consonant.

mu'o mi'e xorxes






posts: 1912


> Currently there is no such thing as an optional y-hyphen; all hyphens
> (both -y- and -n-/-r-) are either required or forbidden.

That's what I thought I remembered. The rule makes some sense for
the n/r-hyphens, but I don't see the point of it for the y-hyphen.

> Whether
> this matters depends on whether your grammar is intended to be
> definitional (in which case it has to get this right) or only an
> implementation (in which case it is allowed to have bugs).

It's intended to be definitional, but not necessarily in accordance
with the current official definition. :-)

Since some things in the grammar definition are changing anyway,
it's not a big deal to adjust these small details.

One other thing that bothers me in the official definition is
the restriction for ntc/nts/ndj/ndz in lujvo but not in cmene or
fuhivla.

Currently the rules as I wrote them handle this as in the official
prescription, but this lujvo-only restriction is jarring.
If they are not pronounceable, then they should not be allowed
in cmene and fuhivla either. If they are pronounceable, they
should be allowed in lujvo. (My preference would be the latter.)

mu'o mi'e xorxes




Jorge Llambías scripsit:

> CLL says:
>
> (1) "It is always legal to use the apostrophe (IPA h) sound in
> pronouncing a comma."

The rationale for this is that "ae"/"a,e" (for example) is a sequence
that occurs only in "foreign words", and that a "native-speaker" Lojbanist
should be able to pronounce it with the nearest "native" analogue,
namely "a'e". This is not a Loglan hangover as others have speculated,
since Loglan does not have "'"; you simply have to know which Loglan
vowel-pairs are diphthongs and which are vowel sequences.

I personally would be quite content if all such "foreign" sequences
were forbidden altogether. Can someone easily check to see whether we
have used them in fu'ivla?

> (2) "Commas are never required: no two Lojban words differ solely
> because of the presence or placement of a comma."

This was done because the contrary rule (as in Loglan) led to
absurdities like ai,ai,aiaglu being different from a,ia,ia,iaglu
(in pre-Lojban Loglan, "aiaiaiaglu" was read as the former, but
in current Loglan it's the latter). This difference
seemed to us to be too subtle, and to threaten audio-visual isomorphism.

> This seems to lead to contradiction. If {kanku,a} can be
> pronounced like {kanku'a} by (1) and the comma is not required
> by (2) so it is equivalent to {kankua}. How many different
> words are among {kanku'a}, {kanku,a} and {kankua}, and which
> are they?

I take the current position to be that "kanku'a" is a lujvo, and
"kankua" and "kanku,a" are different spellings of the same fu'ivla;
I would be in favor of forbidding "kanku,a" altogether. Note that
its stress accent is very different, KANkua vs. kanKU,a, so this
is not just a matter of a glide vs. a full vowel.

--
John Cowan www.ccil.org/~cowan jcowan@reutershealth.com www.reutershealth.com
Monday we watch-a Firefly's house, but he no come out. He wasn't home.
Tuesday we go to the ball game, but he fool us. He no show up. Wednesday he
go to the ball game, and we fool him. We no show up. Thursday was a
double-header. Nobody show up. Friday it rained all day. There was no ball
game, so we stayed home and we listened to it on-a the radio. --Chicolini


Jorge Llambías scripsit:

> That's what I thought I remembered. The rule makes some sense for
> the n/r-hyphens, but I don't see the point of it for the y-hyphen.

Optional rules always complicate things for the user.

> One other thing that bothers me in the official definition is
> the restriction for ntc/nts/ndj/ndz in lujvo but not in cmene or
> fuhivla.

If that's so, it's an error in description; the ban on these should be
language-wide, like the ban on "bb" or "pg", and for the same reason:
they threaten audio-visual isomorphism. (Which is not to say that
speakers of some languages can't make clear distinctions in all these
cases.)

I suspect it's specified for lujvo because it came up in the context
of lujvo and its extension to fu'ivla and cmene wasn't thought through.
This dates back to Loglan days.

--
LEAR: Dost thou call me fool, boy? John Cowan
FOOL: All thy other titles http://www.ccil.org/~cowan
thou hast given away: jcowan@reutershealth.com
That thou wast born with. http://www.reutershealth.com


On Friday 17 December 2004 07:41, Jorge "Llambías" wrote:
> One other thing that bothers me in the official definition is
> the restriction for ntc/nts/ndj/ndz in lujvo but not in cmene or
> fuhivla.

As I understood it, they are forbidden in all words. {mk}, on the other hand,
is permitted at the beginning of a cmene, but not a brivla.

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


posts: 1912


> > (1) "It is always legal to use the apostrophe (IPA h) sound in
> > pronouncing a comma."
>
> The rationale for this is that "ae"/"a,e" (for example) is a sequence
> that occurs only in "foreign words", and that a "native-speaker" Lojbanist
> should be able to pronounce it with the nearest "native" analogue,
> namely "a'e". This is not a Loglan hangover as others have speculated,
> since Loglan does not have "'"; you simply have to know which Loglan
> vowel-pairs are diphthongs and which are vowel sequences.
>
> I personally would be quite content if all such "foreign" sequences
> were forbidden altogether. Can someone easily check to see whether we
> have used them in fu'ivla?

So the rule you would favor would be something more like:
"The vowel pairs aa, ae, ao, ea, ee, eo, eu, oa, oe, oo, ou
(with or without intervening commas) are equivalent
to a'a, a'e, a'o, e'a, e'e, e'o, e'u, o'a, o'e, o'o, o'u
respectively."

But, for example, ua = u,a is always different from u'a.

> > (2) "Commas are never required: no two Lojban words differ solely
> > because of the presence or placement of a comma."
>
> This was done because the contrary rule (as in Loglan) led to
> absurdities like ai,ai,aiaglu being different from a,ia,ia,iaglu
> (in pre-Lojban Loglan, "aiaiaiaglu" was read as the former, but
> in current Loglan it's the latter). This difference
> seemed to us to be too subtle, and to threaten audio-visual isomorphism.

{aiaiaiaglu} is not currently a valid fu'ivla, because it
doesn't have a consonant cluster in the first five letters.

That rule for fu'ivla is also quite odd. I would find it more
reasonable either to impose no restriction on the number of
letters that can precede the cluster, or to restrict it
to a maximum of two vowels before the cluster (that's what
happens with lujvo). Allowing three vowels but not a consonant
with three vowels is weird, and makes that part of the
algorithm unreasonably complicated.

> I take the current position to be that "kanku'a" is a lujvo, and
> "kankua" and "kanku,a" are different spellings of the same fu'ivla;
> I would be in favor of forbidding "kanku,a" altogether. Note that
> its stress accent is very different, KANkua vs. kanKU,a, so this
> is not just a matter of a glide vs. a full vowel.

So the rule you would want is something like "commas are not
allowed to break what otherwise would be a diphthong in brivla"?

(I think that agrees with what Pierre said re prua/pru,a)

mu'o mi'e xorxes







Jorge Llambías scripsit:

> So the rule you would favor would be something more like:
> "The vowel pairs aa, ae, ao, ea, ee, eo, eu, oa, oe, oo, ou
> (with or without intervening commas) are equivalent
> to a'a, a'e, a'o, e'a, e'e, e'o, e'u, o'a, o'e, o'o, o'u
> respectively."

The rule I'd favor is that all of these (including the ones with
y, like ay and yo) are erroneous, period.

> {aiaiaiaglu} is not currently a valid fu'ivla, because it
> doesn't have a consonant cluster in the first five letters.

Right, but that was not a rule in (at least some versions of) Loglan.

> That rule for fu'ivla is also quite odd. I would find more
> reasonable to either impose no restriction on the number of
> letters that can precede the cluster, or make the restriction
> to be a maximum of two vowels before the cluster (that's what
> happens with lujvo).

I have no idea what the motivation was for this rule; it was already
a given. I would favor no restriction.

> So the rule you would want is something like "commas are not
> allowed to break what otherwise would be a diphthong in brivla"?

I'd favor "Commas are garbage and shouldn't be allowed"; but that
may be too radical. How about "Commas are used to clarify
pronunciation, not to change it."

--
John Cowan jcowan@reutershealth.com http://www.ccil.org/~cowan
O beautiful for patriot's dream that sees beyond the years
Thine alabaster cities gleam undimmed by human tears!
America! America! God mend thine every flaw,
Confirm thy soul in self-control, thy liberty in law!
— one of the verses not usually taught in U.S. schools


posts: 1912


> > That's what I thought I remembered. The rule makes some sense for
> > the n/r-hyphens, but I don't see the point of it for the y-hyphen.
>
> Optional rules always complicate things for the user.

Not always. My impression is that simple algorithms
in general produce easier to learn rules. The algorithm
to forbid some y-hyphens would be rather complicated.
A user should have no trouble in understanding what
{selyma'o} means, and there is no need for them to
ever produce that form. It's more complicated to learn
to recognize {selyma'o} as an error.

> > One other thing that bothers me in the official definition is
> > the restriction for ntc/nts/ndj/ndz in lujvo but not in cmene or
> > fuhivla.
>
> If that's so, it's an error in description; the ban on these should be
> language-wide, like the ban on "bb" or "pg", and for the same reason:
> they threaten audio-visual isomorphism. (Which is not to say that
> speakers of some languages can't make clear distinctions in all these
> cases.)

I probably misinterpreted this:

"Lojbanized names can begin or end with any permissible consonant
pair, not just the 48 initial consonant pairs listed above, and
can have consonant triples in any location, as long as the pairs
making up those triples are permissible."

That would mean the name {santcos} I used in the Quixote translation
ages ago is wrong.

mu'o mi'e xorxes






Jorge Llambías scripsit:

> "Lojbanized names can begin or end with any permissible consonant
> pair, not just the 48 initial consonant pairs listed above, and
> can have consonant triples in any location, as long as the pairs
> making up those triples are permissible."

Yes, this should certainly say that the forbidden triples are forbidden
in names as well.

--
Henry S. Thompson said, / "Syntactic, structural, John Cowan
Value constraints we / Express on the fly." jcowan@reutershealth.com
Simon St. Laurent: "Your / Incomprehensible http://www.reutershealth.com
Abracadabralike / schemas must die!" http://www.ccil.org/~cowan


posts: 1912


> > "aa, ae, ao, ea, ee, eo, eu, oa, oe, oo, ou"
>
> The rule I'd favor is that all of these (including the ones with
> y, like ay and yo) are erroneous, period.

I think I will implement that.

> > That rule for fu'ivla is also quite odd. I would find more
> > reasonable to either impose no restriction on the number of
> > letters that can precede the cluster, or make the restriction
> > to be a maximum of two vowels before the cluster (that's what
> > happens with lujvo).
>
> I have no idea what the motivation was for this rule; it was already
> a given. I would favor no restriction.

I will implement that too. It simplifies things.

> > So the rule you would want is something like "commas are not
> > allowed to break what otherwise would be a diphthong in brivla"?
>
> I'd favor "Commas are garbage and shouldn't be allowed"; but that
> may be too radical.

That would be great.

> How about "Commas are used to clarify
> pronunciation, not to change it."

Sounds good.

Should something like {co,i} return an error, or should it be
allowed but parsed just like {coi}?

mu'o mi'e xorxes






posts: 2388

Two minor questions:

Why is {ou} not allowed? I suspect the answer is
that some people can't distinguish between it and
{o} but that doesn't seem to be a very good
reason, since it suggests that some people
mispronounce {o} (as I know that Lojbab with his
diminished vowel set does).

Does the restriction on {ntc/nts/ndj/ndz} mean
that NONE of them can occur? This seems
excessive; at most the recognition patterns
would require that not all of them occur, that
some distinctions are neutralized at this point.
It would seem that, for example, both {ntc} and
{ndz} (except perhaps before {i}) could both
occur.



On Friday 17 December 2004 07:45, John Cowan wrote:
> I personally would be quite content if all such "foreign" sequences
> were forbidden altogether. Can someone easily check to see whether we
> have used them in fu'ivla?

Looking at the fu'ivla in jbovlaste, and not checking whether these were
actually used anywhere, I find the following:
cipnrxuazine
io'imbe
mandioka
spatrleoxari
spatrxapio
stagrleoxari (I have used this one in a recipe)

This last one is pronounced {stagrle,oxari}, which jbovlaste considers to be a
different word.

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


On Friday 17 December 2004 09:53, John E Clifford wrote:
> Does the restriction on {ntc/nts/ndj/ndz} mean
> that NONE of them can occur? This seems
> excessive; at most the recognition patterns
> would require that not all of them occur, that
> some distinctions are neutralized at this point.
> It would seem that, for example, both {ntc} and
> {ndz} (except perhaps before {i}) could both
> occur.

They are forbidden because they can be confused with {nc/ns/nj/nz}. Spend a
year in a southern city and you'll find out why.

phma
--
GCS/M d- s-: a+ C++ UL++++$ P+ L+++ E- W+++ N+ o? K? w-- O? M- V- Y++
PGP++ t- 5? X? R- !tv b++ DI !D G e++ h+>---- r- y>+++


posts: 1912


> On Friday 17 December 2004 07:45, John Cowan wrote:
> > I personally would be quite content if all such "foreign" sequences
> > were forbidden altogether. Can someone easily check to see whether we
> > have used them in fu'ivla?
>
> Looking at the fu'ivla in jbovlaste, and not checking whether these were
> actually used anywhere, I find the following:
> cipnrxuazine
> io'imbe
> mandioka
> spatrleoxari
> spatrxapio
> stagrleoxari (I have used this one in a recipe)

iV and uV would still be allowed, because they occur in cmavo.
The only ones affected would be spatrleoxari and stagrleoxari,
which could become -lexari, -loxari, -le'oxari, -lioxari or
something else.
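
The check behind that count can be sketched as follows. It is a naive scan
for the eleven vowel pairs listed earlier in the thread; the names
`FOREIGN_PAIRS` and `foreign_pairs` are hypothetical, the pairs involving y
are not covered because the thread does not enumerate them, and runs of
three or more vowels are not treated specially.

```python
# Vowel pairs that occur only in "foreign" sequences, as listed in the thread.
FOREIGN_PAIRS = {
    "aa", "ae", "ao", "ea", "ee", "eo", "eu", "oa", "oe", "oo", "ou",
}


def foreign_pairs(word):
    """Return every adjacent letter pair in word that is a foreign pair."""
    bare = word.replace(",", "")
    return [
        bare[i:i + 2]
        for i in range(len(bare) - 1)
        if bare[i:i + 2] in FOREIGN_PAIRS
    ]


JBOVLASTE_SAMPLE = [
    "cipnrxuazine", "io'imbe", "mandioka",
    "spatrleoxari", "spatrxapio", "stagrleoxari",
]
affected = [w for w in JBOVLASTE_SAMPLE if foreign_pairs(w)]
```

On the six words quoted above, `affected` holds only {spatrleoxari} and
{stagrleoxari}, both because of "eo".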

mu'o mi'e xorxes








John E Clifford scripsit:

> Why is {ou} not allowed? I suspect the answer is
> that some people can't distinguish between it and
> {o} but that doesn't seem to be a very good
> reason, since it suggests that some people
> mispronounce {o} (as I know that Lojbab with his
> diminished vowel set does).

Essentially all Americans pronounce long "o" as in "so" as ou, so we ban it.
(In British English it's @u, more or less "yu" in Lojban orthography.)

> Does the restriction on {ntc/nts/ndj/ndz} mean
> that NONE of them can occur?

Correct. The point is that nc and ntc, ns and nts, ndj and nj, ndz and nz
are too easily confused by anglophones, so we ban the first of each pair.
We probably should have added mps to this list, as illustrated by words
like "Hampshire", "Thompson", "glimpse"; mpz isn't a problem because mz
is already banned for idiosyncratic JCB reasons.
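
A language-wide version of this ban is easy to sketch as a substring test.
The names below are hypothetical, and the list holds only the four triples
named in this thread (mps is left out, since it was only floated as a
possible addition):

```python
FORBIDDEN_TRIPLES = ("ntc", "nts", "ndj", "ndz")


def forbidden_triples(word):
    """Return the forbidden consonant triples that occur in word."""
    return [triple for triple in FORBIDDEN_TRIPLES if triple in word]
```

Applied to the name {santcos} mentioned earlier, this returns `["ntc"]`, so
the name would be rejected if the ban covers cmene as well as lujvo.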

--
Henry S. Thompson said, / "Syntactic, structural, John Cowan
Value constraints we / Express on the fly." jcowan@reutershealth.com
Simon St. Laurent: "Your / Incomprehensible http://www.reutershealth.com
Abracadabralike / schemas must die!" http://www.ccil.org/~cowan


posts: 2388

Yeah; I heard that when I actually said them --
and I am not even in a southern environment
(well, St. Louis is borderline).



> On Friday 17 December 2004 09:53, John E Clifford wrote:
> > Does the restriction on {ntc/nts/ndj/ndz} mean
> > that NONE of them can occur? This seems
> > excessive; at most the recognition patterns
> > would require that not all of them occur, that
> > some distinctions are neutralized at this point.
> > It would seem that, for example, both {ntc} and
> > {ndz} (except perhaps before {i}) could both
> > occur.
>
> They are forbidden because they can be confused
> with {nc/ns/nj/nz}. Spend a year in a southern city
> and you'll find out why.
>
> phma
> --
> GCS/M d- s-: a+ C++ UL++++$ P+ L+++ E- W+++ N+ o? K? w-- O? M- V- Y++
> PGP++ t- 5? X? R- !tv b++ DI !D G e++ h+>---- r- y>+++



Jorge Llambías scripsit:

> Should something like {co,i} return an error, or should it be
> allowed but parsed just like {coi}?

An error, I'd say; it's an attempt to change pronunciation.

--
John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
I am he that buries his friends alive and drowns them and draws them
alive again from the water. I came from the end of a bag, but no bag
went over me. I am the friend of bears and the guest of eagles. I am
Ringwinner and Luckwearer; and I am Barrel-rider. --Bilbo to Smaug


posts: 2388



> John E Clifford scripsit:
>
> > Why is {ou} not allowed? I suspect the answer is
> > that some people can't distinguish between it and
> > {o} but that doesn't seem to be a very good
> > reason, since it suggests that some people
> > mispronounce {o} (as I know that Lojbab with his
> > diminished vowel set does).
>
> Essentially all Americans pronounce long "o" as in "so" as ou, so we ban it.
> (In British English it's @u, more or less "yu" in Lojban orthography.)

Relevance? Lojban {o} is supposedly the
"Italian," "pure," form. Since most Lojbanists
are native speakers of American English (which
doesn't differentiate much on this issue) who
cannot hit that tone, the best solution was and
is to match the corresponding solution for {e},
using the lower, "short," form. I see that CLL
doesn't do that, creating yet another asymmetry
in the phonology and so allowing {ei} but not
{ou}. (Of course, there is a matching asymmetry in
allowing {oi} but not {eu}, and I wouldn't want to
do away with that, even though native speakers
of AE can, and do, produce this in
paralinguistic contexts, disgust being the
typical case.) Strictly speaking, as a practical
matter rather than a theoretical one, {ei} ought to
be disallowed as well, since as a matter of fact
it and simple {e} are often confused even in
fairly clear contexts (the fact that we have a
number of word pairs Ce-Cei that often occur in the
same places doesn't help, of course). But the
root is again regular mispronunciation of the
vowel (this time in spite of the CLL prescription).


posts: 1912


> Jorge Llambías scripsit:
>
> > Should something like {co,i} return an error, or should it be
> > allowed but parsed just like {coi}?
>
> An error, I'd say; it's an attempt to change pronunciation.

Possible comma rules would be:

(1) Allow commas anywhere at all. They don't affect anything.

(2) Allow commas anywhere except in the middle of a diphthong.
So for example the name {i,ain} is illegal, because
{iain} must parse as {ia,in}.

(3) Allow commas anywhere in names, but nowhere else.

(4) Allow commas only at permissible syllable boundaries.

(5) ...

The problem with (4) is that then we have to define what the
permissible syllable boundaries are. (4) might even be the same
as (2). Is {brod,a} allowed? Is {b,roda} allowed? Is {b,r,o,d,a}
allowed?

Also, should multiple commas be allowed, as in
{b,,r,,o,,d,,a}? They could be used to show
slow and careful pronunciation, for example.

I'm inclined to go with (1) at the moment, as I don't see any
reasonable restriction rule. I don't think we want the comma
rules to be the main part of the morphology algorithm, which
is what would happen if we imposed complicated (and quite
unnecessary) syllable rules.
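
To make the difference between rules (1) and (2) concrete, here is a rough Python sketch. It is only an illustration, not part of the PEG: the function names are made up, and it assumes that vowel strings pair off from the left and that the diphthongs are ai, au, ei, oi, iV and uV.

```python
import re

VOWELS = "aeiou"
# Assumed diphthong inventory: ai, au, ei, oi, plus i/u followed by any vowel.
DIPHTHONGS = {"ai", "au", "ei", "oi"} | {g + v for g in "iu" for v in VOWELS}

def strip_commas(word):
    """Rule (1): commas may appear anywhere and carry no information."""
    return word.replace(",", "")

def comma_splits_diphthong(word):
    """Rule (2) check: True if some comma falls inside a diphthong.

    Vowel runs are paired off from the left, so {iain} pairs as ia + i.
    """
    plain = strip_commas(word)
    # Indices into the comma-free string that had a comma written before them.
    comma_before = set()
    pos = 0
    for ch in word:
        if ch == ",":
            comma_before.add(pos)
        else:
            pos += 1
    for run in re.finditer(r"[aeiou]+", plain):
        i = run.start()
        while i + 1 < run.end():
            if plain[i:i + 2] in DIPHTHONGS:
                if i + 1 in comma_before:
                    return True
                i += 2
            else:
                i += 1
    return False
```

Under these assumptions {i,ain} and {co,i} are flagged, while {ia,in}, {b,roda} and {b,,r,,o,,d,,a} pass. Rule (1) needs only strip_commas, which is the argument for it: the comma handling stays a one-liner instead of becoming a syllabifier.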

mu'o mi'e xorxes






On Friday 17 December 2004 12:31, John E Clifford wrote:
> Yeah; I heard that when I actually said them --
> and I am not even in a southern environment
> (well, St. Louis is borderline).

I meant southern Lojbangug ;) actually I meant to say it in Lojban (le nu ko
zvati lo nantca cu nanca) but it came out in English.

phma
--
Now I need a magnifier to find my eyeglasses!
-Les Perles de la médecine


John E Clifford scripsit:

> Relevance? Lojban {o} is supposedly the
> "Italian," "pure," form. Since most Lojbanists
> are native speakers of American English (which
> doesn't differentiate much on this issue) who
> cannot hit that tone, the best solution was and
> is to match the corresponding solution for {e},
> using the lower, "short," form. I see that CLL
> doesn't do that, creating yet another asymmetry
> in the phonology

CLL definitely does permit the Italian open "o", also used as the Polish "o".
However, this does not help Americans that much, since the short version
of this (as in "hot", "pot", "top") has been basically eliminated throughout
the U.S. (it persists in Canada), and the long version (as in "awl", "law")
survives only in those born east of a certain line and before a certain date.
The best option for most Americans is to use their native long "o" sound,
and disallow "ou" in Lojban.

--
John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
"It's the old, old story. Droid meets droid. Droid becomes chameleon.
Droid loses chameleon, chameleon becomes blob, droid gets blob back
again. It's a classic tale." --Kryten, Red Dwarf


posts: 1912


I'm implementing stress marking as follows:

1- Commas are ignored always. So for example BRAli,e is identical to BRAlie and is a valid fuhivla.

2- Case of all consonants is ignored. BRoDa = broda

3- Case is ignored in both cmene and cmavo, because stress is irrelevant for them. {PApiPEtis} is a valid cmene and {la'E'Au} is a valid cmavo form.

4- iV, uV, ai, au, ei, oi are the only vowel pairs allowed. Other sequences give "non-lojban-word". Strings like aiaueiaii are allowed as long as every adjacent pair in them is allowed.

5- Vowel strings are broken in pairs from the left for purposes of counting syllables: ai-au-ei-ai-i has five syllables.

6- Stress on a diphthong is shown by capitalizing the first vowel in ai, au, ei, oi, and the second vowel in iV, uV. The other member of the diphthong is treated as a consonant, i.e. its case is ignored. {Ia} is considered an unstressed syllable, just like {Ba}. {iA} is stressed, like {bA}.

7- Words with wrong stress patterns such as {broDA} or {brIvlA} produce "non-lojban-word".
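
Rules 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration under my reading of the list above, not the PEG itself; the function names are invented, and it treats only the vowel part of each syllable.

```python
VOWELS = "aeiou"
FALLING = {"ai", "au", "ei", "oi"}               # stress is marked on the first vowel
RISING = {g + v for g in "iu" for v in VOWELS}   # iV, uV: stress is marked on the second vowel
ALLOWED_PAIRS = FALLING | RISING

def vowel_string_ok(vowels):
    """Rule 4: every adjacent pair of vowels must be an allowed pair."""
    s = vowels.replace(",", "").lower()
    return all(s[i:i + 2] in ALLOWED_PAIRS for i in range(len(s) - 1))

def split_pairs(vowels):
    """Rule 5: break a vowel string into pairs from the left."""
    s = vowels.replace(",", "")
    return [s[i:i + 2] for i in range(0, len(s), 2)]

def is_stressed(nucleus):
    """Rule 6: decide stress from the case of the vowel that carries it."""
    low = nucleus.lower()
    if len(low) == 1:
        return nucleus.isupper()
    if low in FALLING:
        return nucleus[0].isupper()
    return nucleus[1].isupper()   # rising pair: the glide's case is ignored
```

With these definitions the string aiaueiaii passes rule 4 and splits into ai-au-ei-ai-i, five syllables; {Ia} counts as unstressed and {iA} as stressed.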

Comments?

mu'o mi'e xorxes

posts: 14214

On Fri, Dec 17, 2004 at 04:47:52PM -0800, wikidiscuss@lojban.org wrote:
> Comments?

I have only one, and it's a meta-comment.

I am pleased beyond words that someone other than me is doing this,
and I'm going to take this opportunity to largely ignore the entire
proceedings. I really don't want to go anywhere near the morphology
if I can help it.

Having said that, I'm going to try to set up the program that builds
my parser to snarf the morphology from that page, so it will get at
least partially tested.

-Robin


I think the defining document for the morphology should be the algorithm, not
the PEG code. (The algorithm is in the valfendi tarball and needs editing for
clarity.) In an algorithm, and in the C code that implements the algorithm,
one can make a copy of a string, modify it in some way, run a test on it,
make another copy, modify it in a different way, and run another test. This
is not so easy to do in PEG. Thus making the parser simultaneously check that
all the y's in a lujvo are valid and that the stress is in the right place
given where the commas are makes it a lot bigger than checking each one
separately.

I'm talking without much knowledge of PEG, so if there is a way to do two or
three tests without multiplying complexity, please let me know.

phma
--
li ze te'a ci vu'u ci bi'e te'a mu du
li ci su'i ze te'a mu bi'e vu'u ci


On Fri, Dec 17, 2004 at 09:08:02PM -0500, Pierre Abbat wrote:
> I think the defining document for the morphology should be the
> algorithm, not the PEG code. (The algorithm is in the valfendi
> tarball and needs editing for clarity.)

No. No. No.

That is unacceptable.

Totally unacceptable.

The English description of the morphology is a cute toy. Not a
formalism. It needs to die. Quickly. The sooner the better. The
world will be a better place the instant every extant copy is
expunged.

> In an algorithm, and in the C code that implements the algorithm,
> one can make a copy of a string, modify it in some way, run a test
> on it, make another copy, modify it in a different way, and run
> another test. This is not so easy to do in PEG. Thus making the
> parser simultaneously check that all the y's in a lujvo are valid
> and that the stress is in the right place given where the commas are
> makes it a lot bigger than checking each one separately.
>
> I'm talking without much knowledge of PEG, so if there is a way to
> do two or three tests without multiplying complexity, please let me
> know.

Parsing Expression Grammars are just another formalism like Context
Free Grammars. Once the grammar is written, there won't be any reason
to write your own morphology parser. You'll take a parser generator
for the desired language, and you'll run it on the grammar. It will
produce code you don't have to test.

If you're considering writing your own parsing code to duplicate the
effect of what is described by the grammar, well, you can entertain
yourself however you please. But don't ask the rest of us to suffer
for it.

--
Jay Kominek <jkominek@miranda.org>


On Friday 17 December 2004 21:22, jkominek@miranda.org wrote:
> Parsing Expression Grammars are just another formalism like Context
> Free Grammars. Once the grammar is written, there won't be any reason
> to write your own morphology parser. You'll take a parser generator
> for the desired language, and you'll run it on the grammar. It will
> produce code you don't have to test.
>
> If you're considering writing your own parsing code to duplicate the
> effect of what is described by the grammar, well, you can entertain
> yourself however you please. But don't ask the rest of us to suffer
> for it.

I've already written valfendi, and I did not do it to entertain myself.

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


On Fri, Dec 17, 2004 at 09:41:57PM -0500, Pierre Abbat wrote:
> On Friday 17 December 2004 21:22, jkominek@miranda.org wrote:
> > Parsing Expression Grammars are just another formalism like Context
> > Free Grammars. Once the grammar is written, there won't be any reason
> > to write your own morphology parser. You'll take a parser generator
> > for the desired language, and you'll run it on the grammar. It will
> > produce code you don't have to test.
> >
> > If you're considering writing your own parsing code to duplicate the
> > effect of what is described by the grammar, well, you can entertain
> > yourself however you please. But don't ask the rest of us to suffer
> > for it.

That didn't come out quite the way I wanted, but, shrug.

> I've already written valfendi, and I did not do it to entertain myself.

No, not many people write parsers by hand any more, for fun, or
otherwise. Let's stick with describing the morphology in a fashion
readily fed into parser generators, so that developers can get
guaranteed correctness easily. An English algorithm description
condemns them to duplicating effort for potentially dubious and
difficult to verify results.

(As an aside, I'm already working on a PEG parser generator which
produces C. It mostly works already, and if there is interest, I can
also produce a parser generator/parser combo more suited to
interactive debugging of the grammar.)

--
Jay Kominek <jkominek@miranda.org>


Here's the comment of one of valfendi's functions, isslinkuhi, which is used
in finding the beginning of a brivla (or rejecting a string as not containing
a valid brivla) and in checking for a valid fu'ivla rafsi (according to my
rule, which allows many arbitrarily long fu'ivla to have rafsi). Below is my
attempt at a PEG translation. How is the translation?

phma


/* A slinku'i, as far as word breaking is concerned, is anything that matches
the regex

Craf3*(gim?$|raf4?y)

but does not match the regex

raf3*(gim?$|raf4?y)

where
C matches any consonant
raf3 matches any 3-letter rafsi
raf4 matches any 4-letter rafsi
gim matches any gismu.
Anything after the first 'y' is ignored. It has no effect on where to break
the word, only on whether the word is valid. */

slinkuhi <- !(3-letter-rafsi* (gismu? space / long-rafsi? y)) consonant
3-letter-rafsi* (gismu? space / long-rafsi? y)

3-letter-rafsi <- CVV-rafsi / CVC-rafsi / CCV-rafsi

--
..i le babzba ba zbasu
lo jbazbabu lo babjba


posts: 14214

On Fri, Dec 17, 2004 at 09:08:02PM -0500, Pierre Abbat wrote:
> I think the defining document for the morphology should be the
> algorithm, not the PEG code.

Over my rotting corpse.

If you want something that is definably unambiguous, but not PEG,
that's negotiable, but the day that the Lojban community votes to
define Lojban with a giant English algorithm description instead of
something provably unambiguous is the day I find something else to
do with my spare time.

-Robin

--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/


On Saturday 18 December 2004 00:53, Robin Lee Powell wrote:
> If you want something that is definably unambiguous, but not PEG,
> that's negotiable, but the day that the Lojban community votes to
> define Lojban with a giant English algorithm description instead of
> something provably unambiguous is the day I find something else to
> do with my spare time.

Is a C program sufficiently unambiguous?

xorxes's grammar tells which kind of word a word is, but it requires the word
to be already delimited with spaces or periods. valfendi does not require
this (except in some places such as the end of cmevla), as long as the stress
is indicated in brivla. Like BRKWORDS.TXT, it is designed to take a speech
stream and break it into words.

The problem I see with implementing this in PEG is that valfendi bites off a
piece by counting syllables after the stress, then checks whether, among
other things, the hyphens are in the right place. Is there a way to check one
PE against the part of a string that matched another PE?

phma
--
Now I need a magnifier to find my eyeglasses!
-Les Perles de la médecine


posts: 1912



> Here's the comment of one of valfendi's functions, isslinkuhi, which is used
> in finding the beginning of a brivla (or rejecting a string as not containing
> a valid brivla) and in checking for a valid fu'ivla rafsi (according to my
> rule, which allows many arbitrarily long fu'ivla to have rafsi).

I have not incorporated the concept of fu'ivla rafsi yet in my PEG, but
I will try to do so once I understand it well. The idea is that a
fu'ivla rafsi can be inserted into a lujvo as long as it can be
separated with y hyphens: {other-rafsi y fu'ivla-rafsi y other-rafsi}
without ambiguities, right?

> Below is my
> attempt at a PEG translation. How is the translation?
>
> phma
> ---
> /* A slinku'i, as far as word breaking is concerned, is anything that matches
> the regex
>
> Craf3*(gim?$|raf4?y)
>
> but does not match the regex
>
> raf3*(gim?$|raf4?y)
>
> where
> C matches any consonant
> raf3 matches any 3-letter rafsi
> raf4 matches any 4-letter rafsi
> gim matches any gismu.
> Anything after the first 'y' is ignored. It has no effect on where to break
> the word, only on whether the word is valid. */
>
> slinkuhi <- !(3-letter-rafsi* (gismu? space / long-rafsi? y)) consonant
> 3-letter-rafsi* (gismu? space / long-rafsi? y)
>
> 3-letter-rafsi <- CVV-rafsi / CVC-rafsi / CCV-rafsi

I can't really tell if they are equivalent because I'm not very
familiar with C, but it sounds basically right. This is how I handle
slinku'i in my PEG:

fuhivla <- !(consonant lujvo) !(consonant final-rafsi) initial-cluster syllable
fuhivla-tail

Any lujvo have already been absorbed, otherwise you can just add !lujvo
at the beginning.

"!(consonant lujvo) !(consonant final-rafsi)" will reject any string
that consists of a consonant+lujvo or a consonant+final-rafsi
(e.g. {slinku'i}, {spe'a} or {zbroda}).

mu'o mi'e xorxes









posts: 381

In a message dated 2004-12-17 11:42:05 AM Eastern Standard Time,
jcowan@reutershealth.com writes:


> Correct. The point is that nc and ntc, ns and nts, ndj and nj, ndz and nz
> are too easily confused by anglophones, so we ban the first of each pair.
> We probably should have added mps to this list, as illustrated by words
> like "Hampshire", "Thompson", "glimpse"; mpz isn't a problem because mz
> is already banned for idiosyncratic JCB reasons.
>

I suspect this should be "mbz", not "mpz", which is disallowed because of the
different voicing of 'p' and 'z'. "mz" should be allowed, but isn't.

stevo





posts: 1912


> xorxes's grammar tells which kind of word a word is, but it requires the word
> to be already delimited with spaces or periods.

That was the first version. The current version already handles
stress marking with caps.

> The problem I see with implementing this in PEG is that valfendi bites off a
> piece by counting syllables after the stress, then checks whether, among
> other things, the hyphens are in the right place. Is there a way to check one
> PE against the part of a string that matched another PE?

Yes, with "&" and "!".

exp <- &exp1 exp2

will succeed only if exp2 starts with or is the start of exp1

exp <- !exp1 exp2

will succeed only if exp2 doesn't start with nor is the start of exp1
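
For anyone who prefers code to notation, the two predicates can be mimicked with a few hand-rolled combinators in Python. This is only a sketch of the general PEG idea, with made-up helper names (lit, seq, and_pred, not_pred); it is not taken from any actual parser generator.

```python
def lit(s):
    """Match the literal string s; return the new position or None."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Match each parser in turn, threading the position through."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def and_pred(p):
    """&p: succeed if p matches here, but consume nothing."""
    def parse(text, pos):
        return pos if p(text, pos) is not None else None
    return parse

def not_pred(p):
    """!p: succeed if p does NOT match here, and consume nothing."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse

# exp <- &exp1 exp2, with exp1 = "bro" and exp2 = "broda"
exp_and = seq(and_pred(lit("bro")), lit("broda"))
# exp <- !exp1 exp2 with the same pieces
exp_not = seq(not_pred(lit("bro")), lit("broda"))
```

On the input "broda", exp_and consumes all five letters and exp_not fails. Note that the predicate and the consuming expression both start at the same position but need not stop at the same place: seq(and_pred(lit("broda")), lit("br")) succeeds and consumes only "br".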

mu'o mi'e xorxes






On Saturday 18 December 2004 08:39, Jorge "Llambías" wrote:
> I have not incorporated the concept of fu'ivla rafsi yet in my PEG, but
> I will try to do so once I understand it well. The idea is that a
> fu'ivla rafsi can be inserted into a lujvo as long as it can be
> separated with y hyphens: {other-rafsi y fu'ivla-rafsi y other-rafsi}
> without ambiguities, right?

That is correct. It also has to be unambiguous at the beginning of a word,
which is what shot down {skalduna}: {le skaldunynai} lexes as
{les-kal-dun-y-nai}.

phma
--
..i le babzba ba zbasu
lo jbazbabu lo babjba


On Saturday 18 December 2004 08:45, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > The problem I see with implementing this in PEG is that valfendi bites
> > off a piece by counting syllables after the stress, then checks whether,
> > among other things, the hyphens are in the right place. Is there a way to
> > check one PE against the part of a string that matched another PE?
>
> Yes, with "&" and "!".
>
> exp <- &exp1 exp2
>
> will succeed only if exp2 starts with or is the start of exp1
>
> exp <- !exp1 exp2
>
> will succeed only if exp2 doesn't start with nor is the start of exp1

But how do you check that exp1 and exp2 are identical?

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


posts: 2388

Do you really mean to say that only a relatively
restricted group of Americans say "law" with a
low mid back rounded vowel? Aside from a few
people in a narrow band across the upper south
who add an "r" and a few (I'm not quite sure what
the line is) who collapse "aw" with "ah" — but
generally say it more like "aw" — I can't
remember hearing anyone fail to get this sound
right (and even those cases get it right but
either add to it or change its role in the
overall scheme).



> John E Clifford scripsit:
>
> > Relevance? Lojban {o} is supposedly the
> > "Italian," "pure," form. Since most Lojbanists
> > are native speakers of American English (which
> > doesn't differentiate much on this issue) who
> > cannot hit that tone, the best solution was and
> > is to match the corresponding solution for {e},
> > using the lower, "short," form. I see that CLL
> > doesn't do that, creating yet another asymmetry
> > in the phonology
>
> CLL definitely does permit the Italian open "o", also used as the Polish "o".
> However, this does not help Americans that much, since the short version
> of this (as in "hot", "pot", "top") has been basically eliminated throughout
> the U.S. (it persists in Canada), and the long version (as in "awl", "law")
> survives only in those born east of a certain line and before a certain date.
> The best option for most Americans is to use their native long "o" sound,
> and disallow "ou" in Lojban.
>
> --
> John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
> "It's the old, old story. Droid meets droid. Droid becomes chameleon.
> Droid loses chameleon, chameleon becomes blob, droid gets blob back
> again. It's a classic tale." --Kryten, Red Dwarf



John E Clifford scripsit:
> Do you really mean to say that only a relatively
> restricted group of Americans say "law" with a
> low mid back rounded vowel?

Yes, indeed. Westerners and youngsters have merged /O:/ and /A:/.

> and a few (I'm not quite sure what
> the line is) who collapse "aw" with "ah" — but
> generally say it more like "aw" — I can't
> remember hearing anyone fail to get this sound
> right (and even those cases get it right but
> either add to it or change its role in the
> overall scheme).

By no means "a few", but the growing majority.

--
LEAR: Dost thou call me fool, boy? John Cowan
FOOL: All thy other titles http://www.ccil.org/~cowan
thou hast given away: jcowan@reutershealth.com
That thou wast born with. http://www.reutershealth.com


posts: 2388

I know that Lojbab has this feature but I can't
find anyone else with it, including a fairly
large array of youngsters — from 3 up — and
Arizonians of all ages, ditto New Mexicans,
Californians and Oregonians. What is the source
of your claim?



> John E Clifford scripsit:
> > Do you really mean to say that only a relatively
> > restricted group of Americans say "law" with a
> > low mid back rounded vowel?
>
> Yes, indeed. Westerners and youngsters have merged /O:/ and /A:/.
>
> > and a few (I'm not quite sure what
> > the line is) who collapse "aw" with "ah" -- but
> > generally say it more like "aw" -- I can't
> > remember hearing anyone fail to get this sound
> > right (and even those cases get it right but
> > either add to it or change its role in the
> > overall scheme).
>
> By no means "a few", but the growing majority.
>
> --
> LEAR: Dost thou call me fool, boy? John Cowan
> FOOL: All thy other titles http://www.ccil.org/~cowan
> thou hast given away: jcowan@reutershealth.com
> That thou wast born with. http://www.reutershealth.com



posts: 14214

On Sat, Dec 18, 2004 at 08:13:24AM -0500, Pierre Abbat wrote:
> On Saturday 18 December 2004 00:53, Robin Lee Powell wrote:
> > If you want something that is definably unambiguous, but not
> > PEG, that's negotiable, but the day that the Lojban community
> > votes to define Lojban with a giant English algorithm
> > description instead of something provably unambiguous is the day
> > I find something else to do with my spare time.
>
> Is a C program sufficiently unambiguous?

Absolutely not. It's not a formalism, it's a piece of code.

-Robin


On Saturday 18 December 2004 18:52, Robin Lee Powell wrote:
> On Sat, Dec 18, 2004 at 08:13:24AM -0500, Pierre Abbat wrote:
> > Is a C program sufficiently unambiguous?
>
> Absolutely not. It's not a formalism, it's a piece of code.

Is there a formalism that I can translate valfendi into?

The problem appears to be that you and I think differently. When I wrote
valfendi, I separated, as well as I could, the operation of splitting a
stream of phonemes into words from the operation of determining whether those
words are valid. This is easier for me to understand. To do that, I have to
find the string that matches one regular expression (or parsing expression or
whatever) at the beginning of the remaining text, then check whether that
string, no more and no less, matches another expression. For instance, if the
text is /dAmymlongEnavau/, the first expression matches /dAmymlongEna/, which
the second does not match, although it does match /dAmymlo/. I could write a
PEG with two expressions, called lex-brivla and valid-brivla, and then write
"brivla <- &lex-brivla valid-brivla", but that would match /dAmymlongEnavau/
and consume /dAmymlo/, even though lex-brivla matched /dAmymlongEna/. Trying
to do both checks at once is confusing to me, though you seem to understand
it.
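
The two-step check described here can be sketched in Python with regular expressions standing in for the two expressions. The patterns below are toy stand-ins chosen only so that they behave like the /dAmymlongEnavau/ example; they are not valfendi's real rules, and match_exact_span is a made-up name.

```python
import re

# Toy lexer: everything up to and including the first vowel after the last
# capitalized (stressed) vowel. NOT the real brivla-breaking rule.
LEX_BRIVLA = re.compile(r".*[AEIOU][a-z]*?[aeiou]")
# Toy validator: accepts exactly "d", four letters, then "lo".
VALID_BRIVLA = re.compile(r"d[A-Za-z]{4}lo")

def match_exact_span(lexer, validator, text, pos=0):
    """Bite off what `lexer` matches at pos, then require `validator`
    to match that piece exactly, no more and no less."""
    m = lexer.match(text, pos)
    if m is None:
        return None
    piece = m.group(0)
    return piece if validator.fullmatch(piece) else None

text = "dAmymlongEnavau"
piece = LEX_BRIVLA.match(text).group(0)      # what the lexer bites off
prefix = VALID_BRIVLA.match(text).group(0)   # what a prefix-only check accepts
result = match_exact_span(LEX_BRIVLA, VALID_BRIVLA, text)
```

Here piece is "dAmymlongEna" and prefix is "dAmymlo", so a prefix-style check (the analogue of "&lex-brivla valid-brivla") would accept the text, while match_exact_span rejects it because the validator does not cover the whole piece.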

So, is there something sufficiently programming-language-like that I can check
that it's doing the same as valfendi, and sufficiently formal that you can
check that it's doing the same as the PEG?

phma
--
My monthly periods happen once per year.
-Les Perles de la médecine


posts: 14214

On Sat, Dec 18, 2004 at 03:52:24PM -0800, Robin Lee Powell wrote:
> On Sat, Dec 18, 2004 at 08:13:24AM -0500, Pierre Abbat wrote:
> > On Saturday 18 December 2004 00:53, Robin Lee Powell wrote:
> > > If you want something that is definably unambiguous, but not
> > > PEG, that's negotiable, but the day that the Lojban community
> > > votes to define Lojban with a giant English algorithm
> > > description instead of something provably unambiguous is the
> > > day I find something else to do with my spare time.
> >
> > Is a C program sufficiently unambiguous?
>
> Absolutely not. It's not a formalism, it's a piece of code.

This is slightly inaccurate: a C program written to *very* strictly
act like a PDA (push down automaton) or FSM (finite state machine)
would be acceptable, in that I could turn it into a real formalism
easily.

-Robin


posts: 14214

On Sat, Dec 18, 2004 at 08:01:57PM -0500, Pierre Abbat wrote:
> On Saturday 18 December 2004 18:52, Robin Lee Powell wrote:
> > On Sat, Dec 18, 2004 at 08:13:24AM -0500, Pierre Abbat wrote:
> > > Is a C program sufficiently unambiguous?
> >
> > Absolutely not. It's not a formalism, it's a piece of code.
>
> Is there a formalism that I can translate valfendi into?

I don't know what formalisms you know, so I can't really answer
that.

> The problem appears to be that you and I think differently.

Yes. I tried to explain this to you when I was asking you questions
about the valfendi algorithm some months ago. I gave up after a
while; I couldn't figure out what you were talking about.

> When I wrote valfendi, I separated, as well as I could, the
> operation of splitting a stream of phonemes into words from the
> operation of determining whether those words are valid. This is
> easier for me to understand. To do that, I have to find the string
> that matches one regular expression (or parsing expression or
> whatever) at the beginning of the remaining text, then check
> whether that string, no more and no less, matches another
> expression.

That is a fundamentally algorithmic way of thinking, yes.

> For instance, if the text is /dAmymlongEnavau/, the first
> expression matches /dAmymlongEna/, which the second does not
> match, although it does match /dAmymlo/. I could write a PEG with
> two expressions, called lex-brivla and valid-brivla, and then
> write "brivla <- &lex-brivla valid-brivla", but that would match
> /dAmymlongEnavau/ and consume /dAmymlo/, even though lex-brivla
> matched /dAmymlongEna/. Trying to do both checks at once is
> confusing to me, though you seem to understand it.

You've taken brivla out of context, so it's hard to go from what you
just said to something useful, but I assume that the top level is
something like:

morphology <- word*

word <- cmavo / brivla / cmene

Then with your brivla, it only consumes dAmymlo, but that's fine
because the next run of "word" will start at "ng" (which will
probably cause breakage, but that's fine I hope). I have no idea if
this helps you; as with the algorithm itself, I'm not certain I
actually have any real idea what you're asking.

One way to do what you want, I suppose, would be to add another
stage to the parser. Right now, it's got a morphology stage and a
grammar stage. You could break morphology into word-grouping and
word-recognition stages. I have not the slightest idea why this
approach is useful however.

> So, is there something sufficiently programming-language-like that
> I can check that it's doing the same as valfendi,

See, umm, that's impossible. There is no way, in general, to
compare two non-trivial programs to show that they are doing the same
thing. This would be the whole reason I like formalisms.

-Robin


posts: 14214

On Sat, Dec 18, 2004 at 08:53:57AM -0500, Pierre Abbat wrote:
> On Saturday 18 December 2004 08:45, Jorge "Llambías" wrote:
> > --- Pierre Abbat wrote:
> > > The problem I see with implementing this in PEG is that
> > > valfendi bites off a piece by counting syllables after the
> > > stress, then checks whether, among other things, the hyphens
> > > are in the right place. Is there a way to check one PE against
> > > the part of a string that matched another PE?
> >
> > Yes, with "&" and "!".
> >
> > exp <- &exp1 exp2
> >
> > will succeed only if exp2 starts with or is the start of exp1
> >
> > exp <- !exp1 exp2
> >
> > will succeed only if exp2 doesn't start with nor is the start of
> > exp1
>
> But how do you check that exp1 and exp2 are identical?

By "identical" I assume you mean "consume exactly the same input".

Within the strict formalism, I'm not sure you can. I will ponder
this. You can add code to do it, but that rather defeats the
purpose (the only place this trick is used in the current grammar is
with zoi, where it is unavoidable).

I suggested a workaround in another mail, however.

-Robin


posts: 14214

On Sat, Dec 18, 2004 at 05:39:07AM -0800, Jorge Llambías wrote:
> --- Pierre Abbat wrote:
> > Below is my
> > attempt at a PEG translation. How is the translation?
>
> I can't really tell if they are equivalent because I'm not very
> familiar with C,

Just for the record, there was no C there, just some regular
expressions.

-Robin


John E Clifford scripsit:
> I know that Lojbab has this feature but I can't
> find anyone else with it, including a fairly
> large array of youngsters — from 3 up — and
> Arizonians of all ages, ditto New Mexicans,
> Californians and Oregonians. What is the source
> of your claim?

It's a well-known fact. http://cla.calpoly.edu/~jrubba/phon/ipafaq.html
is one source picked at random;
http://itre.cis.upenn.edu/~myl/languagelog/archives/000836.html is another.

--
Ambassador Trentino: I've said enough. I'm a man of few words.
Rufus T. Firefly: I'm a man of one word: scram!
--Duck Soup John Cowan <jcowan@reutershealth.com>


posts: 2388

Very interesting though hardly evidence that most
people or even most westerners or youths lack
"aw." In fact, it seems that most have it, even
if only paralinguistically. Oddly — from the
point of view of the claims given — the only
place where I have heard the collapse regularly
is in the extreme Northeast, Maine, where for
example, "John" (general American "jahn") is
"jawn" or even "jawuhn" (the latter to give
length to a normally short vowel, I suspect). Of
course, here the collapse goes the opposite way
from the American norm, presumably influenced by
the Canadian pattern — which does seem to be
pretty general in Ontario and the Maritimes
(though not in BC and the flyover provinces).
None of this seems to me a good case for ignoring
"aw" as a preferred pronunciation for Lojban {o},
which was the point here.



> John E Clifford scripsit:
> > I know that Lojbab has this feature but I can't
> > find anyone else with it, including a fairly
> > large array of youngsters -- from 3 up -- and
> > Arizonians of all ages, ditto New Mexicans,
> > Californians and Oregonians. What is the source
> > of your claim?
>
> It's a well-known fact. http://cla.calpoly.edu/~jrubba/phon/ipafaq.html
> is one source picked at random;
> http://itre.cis.upenn.edu/~myl/languagelog/archives/000836.html is another.
>
> --
> Ambassador Trentino: I've said enough. I'm a man of few words.
> Rufus T. Firefly: I'm a man of one word: scram!
> --Duck Soup John Cowan <jcowan@reutershealth.com>



posts: 1912


I have now added handling of rafsi fuhivla, so except for some minor adjustments I will probably have to make, the morphology PEG is basically ready. Does anyone want to test it?

mu'o mi'e xorxes

posts: 14214

On Sun, Dec 19, 2004 at 02:33:25PM -0800, wikidiscuss@lojban.org
wrote:
> Re: PEG Morphology Algorithm
>
> I have now added handling of rafsi fuhivla, so except for some
> minor adjustments I will probably have to do, the morphology PEG
> is basically ready. Anyone wants to test it?

I'll see what I can do. Pierre, can you point me to your latest
test suite again?

Which reminds me: I want to make it clear that I'm very happy about
Pierre's work on valfendi. I don't think it's the right approach
for language definitional purposes (and I told him that right when
he started), but that doesn't mean it's worthless, and I don't want
to give the impression that I think that.

Pierre's work on valfendi serves two important purposes: it built up
a reservoir of expertise in him on the topic, and it got us
something to test *against*. Having valfendi's output to compare my
own to means we can debug both methods better.

So, Pierre, thank you. And hook me up with those test cases,
please.

-Robin

--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/


On Sunday 19 December 2004 20:01, Robin Lee Powell wrote:
> On Sun, Dec 19, 2004 at 02:33:25PM -0800, wikidiscuss@lojban.org
>
> wrote:
> > Re: PEG Morphology Algorithm
> >
> > I have now added handling of rafsi fuhivla, so except for some
> > minor adjustments I will probably have to do, the morphology PEG
> > is basically ready. Anyone wants to test it?
>
> I'll see what I can do. Pierre, can you point me to your latest
> test suite again?

http://phma.hn.org/Language/valfendi.html
There are currently two known bugs: it calls {pru,a} valid but rejects {prua}
(I say they're both invalid), and checking of 'y' in lujvo is broken if the
"-r" option is specified. I've fixed the second and found what to fix in the
first, and I've added a few test cases to the file in the tarball. The new
version should be out in a few days.

> Which reminds me: I want to make it clear that I'm very happy about
> Pierre's work on valfendi. I don't think it's the right approach
> for language definitional purposes (and I told him that right when
> he started), but that doesn't mean it's worthless, and I don't want
> to give the impression that I think that.
>
> Pierre's work on valfendi serves two important purposes: it built up
> a reservoir of expertise in him on the topic, and it got us
> something to test *against*. Having valfendi's output to compare my
> own to means we can debug both methods better.
>
> So, Pierre, thank you. And hook me up with those test cases,
> please.

Glad to hear that. I'll change the blurb at the top of the algorithm.

phma
--
My monthly periods happen once per year.
-Les Perles de la médecine


posts: 14214

On Sun, Dec 19, 2004 at 08:23:02PM -0500, Pierre Abbat wrote:
> On Sunday 19 December 2004 20:01, Robin Lee Powell wrote:
> > On Sun, Dec 19, 2004 at 02:33:25PM -0800, wikidiscuss@lojban.org
> >
> > wrote:
> > > Re: PEG Morphology Algorithm
> > >
> > > I have now added handling of rafsi fuhivla, so except for some
> > > minor adjustments I will probably have to do, the morphology
> > > PEG is basically ready. Anyone wants to test it?
> >
> > I'll see what I can do. Pierre, can you point me to your latest
> > test suite again?
>
> http://phma.hn.org/Language/valfendi.html

OK, there's a testdata.txt file there, but no indications as to how
the various lines should come out (i.e. which ones are valid and
which aren't and so on). Is that data around somewhere?

-Robin


On Friday 17 December 2004 07:45, John Cowan wrote:
> I personally would be quite content if all such "foreign" sequences
> were forbidden altogether. Can someone easily check to see whether we
> have used them in fu'ivla?

Checking the IRC log, I find the following:
{cafnee}: typo for {cafne}.
{skamrmouse}: I'd go with the pronunciation and change it to {skamrmause} if I
used that fu'ivla.
{xukrteobromino}: Turkeys are not made of chocolate, so this should be
{xumrteobromino}, which still has "eo" in it.
Checking jboske, I find nothing.

In case anyone else wants to check more data, the command I used is
tr \ \\n |sort -u |egrep -i '(aa|ae|ao|ea|ee|eo|eu|oa|oe|oo|ou)'|less
followed by the same with commas inserted.

phma
--
Maintenant, j'ai besoin d'une loupe pour trouver mes lunettes!
-Les Perles de la médecine
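
A minimal Python sketch of the same search, for anyone without the shell tools at hand. It is an illustration under stated assumptions, not the command Pierre ran: the function names are hypothetical, and the `with_commas` switch is one reading of "the same with commas inserted" (a comma between the two vowels).

```python
import re

# The eleven vowel pairs from the egrep pattern in the post above.
PAIRS = ["aa", "ae", "ao", "ea", "ee", "eo", "eu", "oa", "oe", "oo", "ou"]


def build_pattern(with_commas=False):
    """Compile a case-insensitive pattern for the pairs.

    Assumption: "with commas inserted" means a comma between the two vowels.
    """
    sep = "," if with_commas else ""
    return re.compile("|".join(re.escape(p[0] + sep + p[1]) for p in PAIRS),
                      re.IGNORECASE)


def words_with_foreign_pairs(text, with_commas=False):
    """Return the sorted, de-duplicated words that contain any of the pairs."""
    pattern = build_pattern(with_commas)
    # split() plays the role of `tr`, set() and sorted() the role of `sort -u`.
    return sorted(w for w in set(text.split()) if pattern.search(w))
```

Calling `words_with_foreign_pairs(log_text)` and then `words_with_foreign_pairs(log_text, with_commas=True)` covers both passes described in the post.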


On Sunday 19 December 2004 20:32, Robin Lee Powell wrote:
> OK, there's a testdata.txt file there, but no indications as to how
> the various lines should come out (i.e. which ones are valid and
> which aren't and so on). Is that data around somewhere?

No, because that depends on the options. I'll send you the output with the
options "-als", which I think corresponds to what xorxes is doing.

phma
--
Ils pensent que j'ai un cancer du thé russe...
-Les Perles de la médecine


posts: 14214

On Sun, Dec 19, 2004 at 09:06:13PM -0500, Pierre Abbat wrote:
> On Sunday 19 December 2004 20:32, Robin Lee Powell wrote:
> > OK, there's a testdata.txt file there, but no indications as to
> > how the various lines should come out (i.e. which ones are valid
> > and which aren't and so on). Is that data around somewhere?
>
> No, because that depends on the options. I'll send you the output
> with the options "-als", which I think corresponds to what xorxes
> is doing.

I can generate the *output* myself, but the fact that the output is
a certain way doesn't mean that it's *right*. A test suite should
be marked up to indicate that a human has reviewed it and that
result X is expected.

-Robin



posts: 14214

One thing that will definitely need to get changed is that "y"-as-space and "y bu" both need to get handled in the morphology section. I'll be putting up a new version soon, though.

-Robin

On Sunday 19 December 2004 21:10, Robin Lee Powell wrote:
> I can generate the *output* myself, but the fact that the output is
> a certain way doesn't mean that it's *right*. A test suite should
> be marked up to indicate that a human has reviewed it and that
> result X is expected.

I review it before each release. Should I have someone else review it too? Is
Nora available for this?

phma
--
le xruki le ginxre xrixruba xu xrula cu xrani?


posts: 14214

On Sun, Dec 19, 2004 at 09:23:38PM -0500, Pierre Abbat wrote:
> On Sunday 19 December 2004 21:10, Robin Lee Powell wrote:
> > I can generate the *output* myself, but the fact that the output
> > is a certain way doesn't mean that it's *right*. A test suite
> > should be marked up to indicate that a human has reviewed it and
> > that result X is expected.
>
> I review it before each release.

...

You seem to be missing my point.

The fact that you reviewed it does not provide me with said review
unless you wrote it down somewhere.

I'm trying to generate automated tests here. I can't do that
without knowing what a human thinks the expected results should be.

See
http://en.wikipedia.org/wiki/Automated_testing and
http://en.wikipedia.org/wiki/Regression_testing.

When you said you had a bunch of test cases, I assumed that you had
the (human-) expected results too, otherwise it's not of much use.

-Robin
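
What Robin is asking for can be sketched as a table of human-reviewed expectations plus a loop that compares a parser's output against it. This is a sketch only: `run_regression` and `stub_classifier` are hypothetical names, not part of valfendi or the PEG parser, and the two sample entries reflect verdicts reached later in this thread.

```python
# Human-reviewed expectations: word -> morphological class.
# Only two sample entries; a real suite would cover every reviewed test line.
EXPECTED = {
    "tci'ile": "fuhivla",
    "pravda": "non-lojban-word",
}


def run_regression(classify, expected):
    """Return (word, expected_class, actual_class) for every mismatch."""
    failures = []
    for word, want in sorted(expected.items()):
        got = classify(word)
        if got != want:
            failures.append((word, want, got))
    return failures


def stub_classifier(word):
    """Hypothetical stand-in for a real parser; it rejects every word."""
    return "non-lojban-word"
```

An empty result means the parser agrees with the human reviewer on every listed word; anything else is a regression or a disputed case worth discussing.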



Pierre Abbat scripsit:

> {xukrteobromino}: Turkeys are not made of chocolate, so this should be
> {xumrteobromino}, which still has "eo" in it.

Well, I don't know the context, but there are chocolate candies in
the shape of a turkey: a hollow chocolate shell covered with aluminum
foil colored to look like a turkey.

--
John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
Rather than making ill-conceived suggestions for improvement based on
uninformed guesses about established conventions in a field of study with
which familiarity is limited, it is sometimes better to stick to merely
observing the usage and listening to the explanations offered, inserting
only questions as needed to fill in gaps in understanding. --Peter Constable


posts: 14214

Woohoo!

The morphology on this page is now auto-snarfed and compiled in with the rest of my grammar, and (after I tweaked some things) actually works.

It's a beauty to see it run. Which reminds me: xorxes, why haven't you installed it? It's just a jar file, should run on any computer made in the last 10 years.

First bug: it thinks "mim" is "mi" (cmavo) + "m" (cmene).

Oh, and "y" handling isn't fixed.

-Robin

posts: 14214

Bug number 2:

Morphology pass: text=( CMAVO=( SU=( s=( s ) u=( u ) ) ) nonMorphLojbanMorphWord=( 'i ) )

-Robin

posts: 1912

Robin:
> It's a beauty to see it run. Which reminds me: xorxes, why haven't you
> installed it? It's just a jar file, should run on any computer made in the
> last 10 years.

It's probably trivial, but I wouldn't know where to begin.
I don't really know what "jar file" means.

> First bug: it thinks "mim" is "mi" (cmavo) + "m" (cmene).

Probably fixed now. I changed cmavo to:

cmavo <- !cmene !gismu !lujvo !fuhivla consonant? vowels

I didn't need the restrictions without the selmaho sorting, because
cmavo was tested last:

word <- cmene / gismu / lujvo / fuhivla / cmavo / non-lojban-word

> Oh, and "y" handling isn't fixed.

I think:

BY <- Y spaces? BU / &cmavo (j o h o / r u h o / ...

Or we don't allow spaces in {ybu}?

mu'o mi'e xorxes





posts: 1912


> Re: PEG Morphology Algorithm
> Bug number 2:
>
> Morphology pass: text=( CMAVO=( SU=( s=( s ) u=( u ) ) )
> nonMorphLojbanMorphWord=( 'i ) )

That should be fixed now.

I added "&(space / consonant)" at the end of the cmavo rule.

mu'o mi'e xorxes






posts: 14214

This:

cmavo

posts: 14214

On Mon, Dec 20, 2004 at 12:55:54PM -0800, wikidiscuss@lojban.org wrote:
> Re: PEG Morphology Algorithm
> This:
>
> cmavo

Boy, that sure failed spectacularly.

This:

cmavo <- !cmene !gismu !lujvo !fuhivla consonant? vowels &(spaces / consonant)

can't work, because at least one of those ! productions has stuff
out front that calls cmavo (or !cmavo, or whatever). One of them
I've found, which is the lujvo !tosmabru test, but there's at least
one other. Having cmavo call cmavo to match cmavo is left
recursion, and is bad.

-Robin


posts: 1912



> This:
>
> cmavo <- !cmene !gismu !lujvo !fuhivla consonant? vowels &(spaces /
> consonant)
>
> can't work, because at least one of those ! productions has stuff
> out front that calls cmavo (or !cmavo, or whatever). One of them
> I've found, which is the lujvo !tosmabru test, but there's at least
> one other.

fuhivla-head started with &cmavo

> Having cmavo call cmavo to match cmavo is left
> recursion, and is bad.

Changed to:

cmavo <- !cmene !gismu !lujvo !fuhivla cmavo-form

cmavo-form <- consonant? vowels &(spaces / consonant)

lujvo and fuhivla now use cmavo-form.

mi'e xorxes
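
The left-recursion problem and the cmavo-form fix can be illustrated with a small reachability check. This is a sketch under loud assumptions: the two dictionaries are a hand-simplified picture of which rules each rule may try at its own starting position (including inside `!` and `&` predicates), built only from what is said in this thread, and `calls_itself_first` is a hypothetical helper.

```python
def calls_itself_first(rule, first_calls):
    """True if `rule` can reach itself via rules tried at the same input position."""
    seen = set()
    stack = list(first_calls.get(rule, []))
    while stack:
        current = stack.pop()
        if current == rule:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(first_calls.get(current, []))
    return False


# Simplified "tried first" graph before the change (assumption, see lead-in).
BEFORE = {
    "cmavo": ["cmene", "gismu", "lujvo", "fuhivla"],
    "lujvo": ["cmavo"],            # via the tosmabru test
    "fuhivla": ["fuhivla-head"],
    "fuhivla-head": ["cmavo"],     # started with &cmavo
}

# After the change: lujvo and fuhivla-head use cmavo-form instead of cmavo.
AFTER = {
    "cmavo": ["cmene", "gismu", "lujvo", "fuhivla", "cmavo-form"],
    "lujvo": ["cmavo-form"],
    "fuhivla": ["fuhivla-head"],
    "fuhivla-head": ["cmavo-form"],
}
```

With `BEFORE`, cmavo reaches itself through lujvo (and through fuhivla-head); with `AFTER`, cmavo-form is a leaf that no longer leads back to cmavo, so the cycle is gone.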








posts: 14214

On Mon, Dec 20, 2004 at 04:49:20AM -0800, Jorge Llambías wrote:
> Robin:
> > It's a beauty to see it run. Which reminds me: xorxes, why
> > haven't you installed it? It's just a jar file, should run on
> > any computer made in the last 10 years.
>
> It's probably trivial, but I wouldn't know where to begin. I don't
> really know what "jar file" means.

"Java Archive".

I want to mention in passing that talking to someone who could write
that much of a grammar without testing it is like talking to, I
dunno, *Einstein* or something. If the singularity comes, I may one
day be as smart as you, but not otherwise.

Anyways, assuming you're on Windows, Start -> Run "cmd". Enter
"java". If it says command not found, go to
http://www.java.com/en/download/manual.jsp

If it simply hangs there until you hit ^c, we're good to go. Run
something like:

java -jar lojban_peg_parser.jar test.txt

to process the stuff in test.txt

> > First bug: it thinks "mim" is "mi" (cmavo) + "m" (cmene).
>
> Probably fixed now.

Indeed.

Currently, zoi is broken and using zei causes a crash (!).

Working on it.

-Robin


posts: 14214

On Mon, Dec 20, 2004 at 01:33:34PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > This:
> >
> > cmavo <- !cmene !gismu !lujvo !fuhivla consonant? vowels
> > &(spaces / consonant)
> >
> > can't work, because at least one of those ! productions has
> > stuff out front that calls cmavo (or !cmavo, or whatever). One
> > of them I've found, which is the lujvo !tosmabru test, but
> > there's at least one other.
>
> fuhivla-head started with &cmavo
>
> > Having cmavo call cmavo to match cmavo is left recursion, and is
> > bad.
>
> Changed to:
>
> cmavo <- !cmene !gismu !lujvo !fuhivla cmavo-form
>
> cmavo-form <- consonant? vowels &(spaces / consonant)
>
> lujvo and fuhivla now use cmavo-form.

A couple more places needed to use cmavo-form, but it seems to work
now.

-Robin


posts: 1912


> I want to mention in passing that talking to someone who could write
> that much of a grammar without testing it is like talking to, I
> dunno, *Einstein* or something. If the singularity comes, I may one
> day be as smart as you, but not otherwise.

u'i ki'e

> java -jar lojban_peg_parser.jar test.txt
>
> to process the stuff in test.txt

That worked, thank you. I found a bug for the lujvo rule with
fuhivla rafsi: it accepted fuhivla that start with a vowel as
non-initial rafsi, I think that's fixed now. Is it too
complicated for me to generate lojban_peg_parser.jar with
a modified grammar?

mu'o mi'e xorxes








posts: 14214

On Mon, Dec 20, 2004 at 05:14:17PM -0800, Jorge Llambías wrote:
> --- Robin Lee Powell wrote:
> > java -jar lojban_peg_parser.jar test.txt
> >
> > to process the stuff in test.txt
>
> That worked, thank you.

w00t

> I found a bug for the lujvo rule with fuhivla rafsi: it accepted
> fuhivla that start with a vowel as non-initial rafsi, I think
> that's fixed now. Is it too complicated for me to generate
> lojban_peg_parser.jar with a modified grammar?

Very, very much too complicated. It's actually just one command,
but you need to be me to run it. :-)

We stomped on each other, but I believe I've fixed it, and a new
version is up and seems to be working excellently. I've put in the
y+ mod, btw.

Check that your stuff got in.

-Robin



posts: 14214

Bug?:

Morphology pass: text=( nonLojbanWord=( KREFU ) )

I'm assuming that caps are always equivalent, as that's what the CLL
seems to say, so that should just be {krefu}, yes?

-Robin


posts: 14214

It Would Be Nice for digits to be treated as PA, but that's not
currently happening and I don't want to fix it right now.

Morphology pass: text=( nonLojbanWord=( 123 ) )

-Robin


posts: 14214

Everybody (i.e. the other two parsers) but us likes:

tci'ile
and

tci'ilykemcantutra

I'm marking this NOT SURE in test_sentences.txt. If xorxes and/or
Pierre could review the NOT SURE lines in that file for what they
should actually be, that would be good.

-Robin


On Tuesday 21 December 2004 03:53, Robin Lee Powell wrote:
> Everybody (i.e. the other two parsers) but us likes:
>
> tci'ile
> and
>
> tci'ilykemcantutra
>
> I'm marking this NOT SURE in test_sentences.txt. If xorxes and/or
> Pierre could review the NOT SURE lines in that file for what they
> should actually be, that would be good.

Both of those are valid words. Where do I find test_sentences.txt?

phma
--
li fi'u vu'u fi'u fi'u du li pa


posts: 1912



> Bug?:
>
> Morphology pass: text=( nonLojbanWord=( KREFU ) )
>
> I'm assuming that caps are always equivalent, as that's what the CLL
> seems to say, so that should just be {krefu}, yes?

{krefu} can't have the last syllable stressed, so {KREFU}
or {krEfU} is not a lojban word. Any variant without U
should be accepted: {KREFu}, {KREfu}, {KReFu }, {KrEFu},
{kREFu}, {KRefu }, {KrEfu}, {kREfu}, {KreFu }, {kReFu },
{krEFu}, {Krefu }, {kRefu }, {krEfu}, {kreFu }, {krefu }.

I suppose we could make a rule that if only caps are used
throughout the text, they are treated as lower case. Not
sure how to implement it though.

mu'o mi'e xorxes
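
The sixteen accepted spellings listed above are every capitalization of the first four letters with the final vowel left in lower case. A short sketch that reproduces that enumeration (the function name is hypothetical, and it only mirrors the list for a word shaped like {krefu}; it is not a general stress checker):

```python
from itertools import product


def stress_safe_variants(word="krefu"):
    """All capitalizations of `word` that leave its final letter in lower case."""
    head, last = word[:-1], word[-1]
    variants = []
    # Each mask says, letter by letter, whether that letter is upper-cased.
    for mask in product((True, False), repeat=len(head)):
        cased = "".join(c.upper() if up else c for c, up in zip(head, mask))
        variants.append(cased + last)
    return variants
```

For a five-letter word this gives 2^4 = 16 spellings, running from {KREFu} to {krefu}; {KREFU} and {krEfU} never appear because the last letter is never capitalized.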






posts: 1912



> It Would Be Nice for digits to be treated as PA, but that's not
> currently happening and I don't want to fix it right now.
>
> Morphology pass: text=( nonLojbanWord=( 123 ) )

I think this is all it takes:

cmavo <- !cmene !gismu !lujvo !fuhivla cmavo-form / digit

I made the change, and added comma* in front of the digit
definition, so that commas are allowed everywhere without
disrupting anything.

I thought about replacing:

stressed <- comma* AEIOU

with:

stressed <- comma* áéíóú

and making the corresponding changes in the letter definitions,
and adding:

letter <- comma* A-Z

BY <- Y space-chars* BU / &cmavo ( j o h o / ... / x y / z y / letter )
&(spaces / consonant)

but I guess that would be too revolutionary for some people.

mu'o mi'e xorxes






posts: 1912



> Everybody (i.e. the other two parsers) but us likes:
>
> tci'ile

I don't understand why this is not accepted.

fuhivla <- !cmene !gismu !lujvo (stressed-fuhivla-head cluster fuhivla-tail /
fuhivla-head cluster stressed-fuhivla-tail)

!cmene !gismu !lujvo should be satisfied.

stressed-fuhivla-head starts with &cmavo-form, so that is not satisfied.

fuhivla-head <- !slinkuhi &initial-cluster / &cmavo-form syllable (!consonant
syllable)* &non-initial-cluster

slinkuhi <- consonant medial-rafsi* final-rafsi

!slinkuhi is satisfied, because {ci'i} is a medial-rafsi but {le}
is not a final-rafsi.
&initial-cluster is satisfied.

Then an empty fuhivla-head should be acceptable. Is this a problem?

cluster absorbs {tc}.

stressed-fuhivla-tail <- syllable syllable+ &spaces / syllable*
stressed-syllable syllable &(spaces / consonant)

syllable absorbs {i}
syllable+ absorbs {'i} and {le}
&spaces is satisfied.

So why is {tci'ile} not accepted?

mu'o mi'e xorxes






posts: 14214

On Tue, Dec 21, 2004 at 04:41:17AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:
>
> > It Would Be Nice for digits to be treated as PA, but that's not
> > currently happening and I don't want to fix it right now.
> >
> > Morphology pass: text=( nonLojbanWord=( 123 ) )
>
> I think this is all it takes:
>
> cmavo <- !cmene !gismu !lujvo !fuhivla cmavo-form / digit

No, they need to be PA, not just cmavo:

Morphology pass: text=( CMAVO=( cmavo=( digit=( 1 ) ) )
CMAVO=( cmavo=( digit=( 2 ) ) ) CMAVO=( PA=( digit=( 3 ) )
) )

> I thought about replacing:
>
> stressed <- comma* AEIOU
>
> with:
>
> stressed <- comma* ?????
>
> and making the corresponding changes in the letter definitions,
> and adding:
>
> letter <- comma* A-Z
>
> BY <- Y space-chars* BU / &cmavo ( j o h o / ... / x y / z y / letter )
> &(spaces / consonant)
>
> but I guess that would be too revolutionary for some people.

I don't understand what this would do?

-Robin


posts: 1912


> On Tue, Dec 21, 2004 at 04:41:17AM -0800, Jorge Llambías wrote:
> > I think this is all it takes:
> >
> > cmavo <- !cmene !gismu !lujvo !fuhivla cmavo-form / digit
>
> No, they need to be PA, not just cmavo:
>
> Morphology pass: text=( CMAVO=( cmavo=( digit=( 1 ) ) )
> CMAVO=( cmavo=( digit=( 2 ) ) ) CMAVO=( PA=( digit=( 3 ) )
> ) )

Right, the problem is that PA is followed by space or consonant,
not by digit, so PA should end &(space / consonant / digit).
Probably every &(space / consonant) should be changed to that.

post-word <- &(space / consonant / digit)

> > I thought about replacing:
> >
> > stressed <- comma* AEIOU
> >
> > with:
> >
> > stressed <- comma* ?????
> >
> > and making the corresponding changes in the letter definitions,
> > and adding:
> >
> > letter <- comma* A-Z
> >
> > BY <- Y space-chars* BU / &cmavo ( j o h o / ... / x y / z y / letter )
> > &(spaces / consonant)
> >
> > but I guess that would be too revolutionary for some people.
>
> I don't understand what this would do?

Use caps to represent lerfu, and use an acute mark
on vowels to represent stress.

mu'o mi'e xorxes






posts: 1912


I changed cmene-syllaboid to:

cmene-syllaboid

posts: 1912



> Re: PEG Morphology Algorithm
>
> I changed cmene-syllaboid to:
>
> cmene-syllaboid
>
>

(Posting from the discussion forum doesn't like the "<-".)

I changed cmene-syllaboid to:

cmene-syllaboid <- !doi-la-lai-lahi consonant* vowels / digit

so that things like {la 2005nan.} are allowed.

mu'o mi'e xorxes











posts: 14214

On Tue, Dec 21, 2004 at 08:11:07AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
> > On Tue, Dec 21, 2004 at 04:41:17AM -0800, Jorge Llambías wrote:
> > > I think this is all it takes:
> > >
> > > cmavo <- !cmene !gismu !lujvo !fuhivla cmavo-form / digit
> >
> > No, they need to be PA, not just cmavo:
> >
> > Morphology pass: text=( CMAVO=( cmavo=( digit=( 1 ) ) )
> > CMAVO=( cmavo=( digit=( 2 ) ) ) CMAVO=( PA=( digit=( 3 )
> > )) )
>
> Right, the problem is that PA is followed by space or consonant,
> not by digit, so PA should end &(space / consonant / digit).
> Probably every &(space / consonant) should be changed to that.
>
> post-word <- &(space / consonant / digit)

Can't do that, sorry. A non-terminal must contain at least one
non-& and non-! element. Removed the & from post-word, changed all
calls to it to be &post-word.

> > > I thought about replacing:
> > >
> > > stressed <- comma* AEIOU
> > >
> > > with:
> > >
> > > stressed <- comma* ?????
> > >
> > > and making the corresponding changes in the letter
> > > definitions, and adding:
> > >
> > > letter <- comma* A-Z
> > >
> > > BY <- Y space-chars* BU / &cmavo ( j o h o / ... / x y / z y
> > > / letter ) &(spaces / consonant)
> > >
> > > but I guess that would be too revolutionary for some people.
> >
> > I don't understand what this would do?
>
> Use caps to represent lerfu, and use an acute mark on vowels to
> represent stress.

Aaah. The acute marks come through as ?, which should explain well
enough why I oppose this. :-)

I'm sure I could figure out how to view them properly, but that's
not the point: until nothing but X (where X is probably Unicode) is
the sole accepted option for all computer-based text, we need to
stick to ascii.

-Robin


posts: 14214

On Tue, Dec 21, 2004 at 04:22:32AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > Bug?:
> >
> > Morphology pass: text=( nonLojbanWord=( KREFU ) )
> >
> > I'm assuming that caps are always equivalent, as that's what the
> > CLL seems to say, so that should just be {krefu}, yes?
>
> {krefu} can't have the last syllable stressed, so {KREFU} or
> {krEfU} is not a lojban word. Any variant without U should be
> accepted: {KREFu}, {KREfu}, {KReFu }, {KrEFu}, {kREFu}, {KRefu },
> {KrEfu}, {kREfu}, {KreFu }, {kReFu }, {krEFu}, {Krefu }, {kRefu },
> {krEfu}, {kreFu }, {krefu }.

text
selbri3
|- BRIVLA
| gismu: KREFu
|- BRIVLA
| gismu: KREfu
|- BRIVLA
| gismu: KReFu
|- BRIVLA
| gismu: KrEFu
|- BRIVLA
| gismu: kREFu
|- BRIVLA
| gismu: KRefu
|- BRIVLA
| gismu: KrEfu
|- BRIVLA
| gismu: kREfu
|- BRIVLA
| gismu: KreFu
|- BRIVLA
| gismu: kReFu
|- BRIVLA
| gismu: krEFu
|- BRIVLA
| gismu: Krefu
|- BRIVLA
| gismu: kRefu
|- BRIVLA
| gismu: krEfu
|- BRIVLA
| gismu: kreFu
|- BRIVLA
gismu: krefu

.u'i sai

> I suppose we could make a rule that if only caps are used
> throughout the text, they are treated as lower case. Not sure how
> to implement it though.

Nah, let's not bother. (although it is from Alice, IIRC, so you may
want to look at fixing it).

-Robin


posts: 14214

On Tue, Dec 21, 2004 at 05:32:55AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > Everybody (i.e. the other two parsers) but us likes:
> >
> > tci'ile
>
> I don't understand why this is not accepted.
>
> fuhivla <- !cmene !gismu !lujvo (stressed-fuhivla-head cluster
> fuhivla-tail / fuhivla-head cluster stressed-fuhivla-tail)
>
> !cmene !gismu !lujvo should be satisfied. stressed-fuhivla-head
> starts with &cmavo-form, so that is not satisfied.
>
> fuhivla-head <- !slinkuhi &initial-cluster / &cmavo-form syllable
> (!consonant syllable)* &non-initial-cluster
>
> slinkuhi <- consonant medial-rafsi* final-rafsi
>
> !slinkuhi is satisfied, because {ci'i} is a medial-rafsi but {le}
> is not a final-rafsi.

I have checked thus far and believe I agree.

> &initial-cluster is satisfied.

Heh. No, it's not.

initial-cluster <- initial-consonant+ consonant !consonant

{t} and {c} are both eaten by initial-consonant, leaving nothing for
consonant.

You just got bitten by greedy absorption. Welcome to my world. :-)

Fixed:

text
BRIVLA
fuhivla
|- cluster
| |- consonant
| | unvoicedConsonant
| | t
| | t
| |- consonant
| unvoicedConsonant
| c
| c
|- stressedFuhivlaTail
|- syllable
| syllableCore
| vowel
| vowelY
| i
| i
|- syllable
| |- h: '
| |- syllableCore
| vowel
| vowelY
| i
| i
|- syllable
|- consonant
| l
| l
|- syllableCore
vowel
vowelY
e
e

Isn't it pretty?

The solution is:

initial-cluster <- (initial-consonant &consonant)+ consonant !consonant

-Robin
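
The greedy-absorption trap can be reproduced with two hand-written matchers. This is a sketch, not the parser's code: the function names are hypothetical, and it assumes (as a simplification) that every consonant also counts as an initial-consonant, which is enough to show the effect on {tci'ile}.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def consonant(s, i):
    """Match one consonant at position i; return the new position or None."""
    return i + 1 if i < len(s) and s[i] in CONSONANTS else None


# Assumption: any consonant may act as an initial-consonant in this sketch.
initial_consonant = consonant


def greedy_initial_cluster(s, i=0):
    """initial-cluster <- initial-consonant+ consonant !consonant"""
    j = initial_consonant(s, i)
    if j is None:
        return None
    k = initial_consonant(s, j)
    while k is not None:          # the + loop takes all it can, and a PEG
        j = k                     # never backtracks into it afterwards
        k = initial_consonant(s, j)
    j = consonant(s, j)           # nothing is left for this consonant
    if j is None or consonant(s, j) is not None:
        return None
    return j


def fixed_initial_cluster(s, i=0):
    """initial-cluster <- (initial-consonant &consonant)+ consonant !consonant"""
    j, count = i, 0
    while True:
        k = initial_consonant(s, j)
        # &consonant: only consume if another consonant still follows.
        if k is None or consonant(s, k) is None:
            break
        j, count = k, count + 1
    if count == 0:
        return None
    j = consonant(s, j)
    if j is None or consonant(s, j) is not None:
        return None
    return j
```

On {tci'ile} the greedy version fails because {t} and {c} are both consumed by the loop, while the fixed version stops after {t}, lets consonant take {c}, and reports a cluster of length 2.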


posts: 14214

On Tue, Dec 21, 2004 at 05:53:01AM -0500, Pierre Abbat wrote:
> On Tuesday 21 December 2004 03:53, Robin Lee Powell wrote:
> > Everybody (i.e. the other two parsers) but us likes:
> >
> > tci'ile and
> >
> > tci'ilykemcantutra
> >
> > I'm marking this NOT SURE in test_sentences.txt. If xorxes
> > and/or Pierre could review the NOT SURE lines in that file for
> > what they should actually be, that would be good.
>
> Both of those are valid words.

And both are now recognized as such.

> Where do I find test_sentences.txt?

http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/test_sentences.txt

You already went through some of it at one point, but I don't have
it at hand. It would probably be best if you posted here, so we could
fight about it. However, note that I haven't finished marking it up
yet. I'll post here when I do.

Oh, and I need to roll valfendi into my testing.

-Robin


posts: 1912



> You just got bitten by greedy absorption. Welcome to my world. :-)

Aaahhrgghhh!!

> The solution is:
>
> initial-cluster <- (initial-consonant &consonant)+ consonant !consonant

Or how about:

initial-cluster <- initial-consonant+ consonant? !consonant

mi'e xorxes






posts: 14214

On Tue, Dec 21, 2004 at 10:08:31AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > You just got bitten by greedy absorption. Welcome to my world.
> > :-)
>
> Aaahhrgghhh!!
>
> > The solution is:
> >
> > initial-cluster <- (initial-consonant &consonant)+ consonant
> > !consonant
>
> Or how about:
>
> initial-cluster <- initial-consonant+ consonant? !consonant

Nope. That will match {t} alone. What you're trying to do there
is:
initial-cluster <- initial-consonant initial-consonant+
consonant? !consonant / initial-consonant+
consonant !consonant

which seems needlessly complicated.

-Robin


posts: 1912


> On Tue, Dec 21, 2004 at 10:08:31AM -0800, Jorge Llambías wrote:
> > Or how about:
> >
> > initial-cluster <- initial-consonant+ consonant? !consonant
>
> Nope. That will match {t} alone.

How come? Both {t} and {c} are valid initial-consonant so they
should both be grabbed by initial-consonant+
No consonant is found next, so consonant? grabs nothing.
And finally !consonant is satisfied because what follows is {i}.

mu'o mi'e xorxes






posts: 14214

Our parser (can't call it "my parser" anymore, really) is the only
one that believes that {neal} and {daed} are not valid cmene. Bug
or feature?

-Robin



posts: 14214

On Tue, Dec 21, 2004 at 10:17:12AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
> > On Tue, Dec 21, 2004 at 10:08:31AM -0800, Jorge Llambías wrote:
> > > Or how about:
> > >
> > > initial-cluster <- initial-consonant+ consonant? !consonant
> >
> > Nope. That will match {t} alone.
>
> How come? Both {t} and {c} are valid initial-consonant so they
> should both be grabbed by initial-consonant+
> No consonant is found next, so consonant? grabs nothing.
> And finally !consonant is satisfied because what follows is {i}.

You misunderstand. That will just as easily match {t} in {ti'ile}
as {tc} in {tci'ile}. {t} is "one or more" initial consonants.

-Robin
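
Robin's objection can be checked mechanically. The sketch below hand-codes the proposed rule with an optional trailing consonant; the function name is hypothetical, and it again assumes, for simplicity, that every consonant counts as an initial-consonant.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def consonant(s, i):
    """Match one consonant at position i; return the new position or None."""
    return i + 1 if i < len(s) and s[i] in CONSONANTS else None


# Assumption: any consonant may act as an initial-consonant in this sketch.
initial_consonant = consonant


def loose_initial_cluster(s, i=0):
    """initial-cluster <- initial-consonant+ consonant? !consonant"""
    j = initial_consonant(s, i)
    if j is None:
        return None
    k = initial_consonant(s, j)
    while k is not None:          # greedy one-or-more
        j = k
        k = initial_consonant(s, j)
    k = consonant(s, j)           # consonant? may match nothing
    if k is not None:
        j = k
    if consonant(s, j) is not None:   # !consonant
        return None
    return j
```

The rule accepts {tc} in {tci'ile}, but it equally accepts the lone {t} in {ti'ile}, since a single consonant already satisfies "one or more"; a cluster rule should reject that case.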


posts: 1912



> Our parser (can't call it "my parser" anymore, really) is the only
> one that believes that {neal} and {daed} are not valid cmene. Bug
> or feature?

It was done on purpose: no non-permissible consonant or vowel
pairs are allowed anywhere in lojban words, including cmene.
The apostrophe is only allowed between vowels.

I guess those rules could be relaxed, but if we do, we should
do it for cmavo and for fuhivla as well as for cmene.

Some restrictions might still be missing for digits, though.
I think {1'an} will be accepted.

mu'o mi'e xorxes






posts: 1912



> You misunderstand. That will just as easily match {t} in {ti'ile}
> as {tc} in {tci'ile}. {t} is "one or more" initial consonants.

Oops, right.

mu'o mi'e xorxes







posts: 14214

On Tue, Dec 21, 2004 at 10:27:48AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:
>
> > Our parser (can't call it "my parser" anymore, really) is the
> > only one that believes that {neal} and {daed} are not valid
> > cmene. Bug or feature?
>
> It was done on purpose:

That's all I needed.

> no non-permissible consonant or vowel pairs are allowed anywhere
> in lojban words,

I didn't realize {ae} and {ea} counted, hence my asking.

> I guess those rules could be relaxed,

No, that's fine; this is all English contamination in IRC anyways.

> Some restrictions might still be missing for digits, though.
> I think {1'an} will be accepted.

Nope:

Morphology pass: text=( CMAVO=( cmavo=( digit=( 1 ) ) )
nonLojbanWord=( 'an ) )

-Robin


posts: 14214

Morphology pass: text=( nonLojbanWord=( pravda ) )

Bug or feature?

-Robin


posts: 14214

The "commas anywhere" thing allows the camxes parser to accept
things like "2, by tirno", which seems to maybe not be what was
intended. :-)

Not sure if this should be fixed, but thought it was worth
mentioning.

-Robin


posts: 1912



> Morphology pass: text=( nonLojbanWord=( pravda ) )
>
> Bug or feature?

Feature: it fails slinku'i
{le pravda} could be the lujvo lep-ravda.


mi'e xorxes





posts: 14214

On Tue, Dec 21, 2004 at 10:51:08AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > Morphology pass: text=( nonLojbanWord=( pravda ) )
> >
> > Bug or feature?
>
> Feature: it fails slinku'i {le pravda} could be the lujvo
> lep-ravda.

Even if there's a pause in there? Seems the slinku'i test is a bit
overzealous.

-Robin


posts: 1912



> > Feature: it fails slinku'i {le pravda} could be the lujvo
> > lep-ravda.
>
> Even if there's a pause in there? Seems the slinku'i test is a bit
> overzealous.

We could allow {.slinku'i} as a fu'ivla, but then it would
be the only type of brivla that must begin with a pause.

mu'o mi'e xorxes








posts: 14214

On Tue, Dec 21, 2004 at 11:04:25AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > > Feature: it fails slinku'i {le pravda} could be the lujvo
> > > lep-ravda.
> >
> > Even if there's a pause in there? Seems the slinku'i test is a
> > bit overzealous.
>
> We could allow {.slinku'i} as a fu'ivla, but then it would be the
> only type of brivla that must begin with a pause.

Ewww. Nevermind, forget I said anything.

-Robin


posts: 953

On Tue, 21 Dec 2004, Robin Lee Powell wrote:

>>> Morphology pass: text=( nonLojbanWord=( pravda ) )
>>
>> Feature: it fails slinku'i {le pravda} could be the lujvo
>> lep-ravda.
>
> Even if there's a pause in there? Seems the slinku'i test is a bit
> overzealous.

The slinku'i test AFAIK applies to single words, not to strings of words.

If we were to allow this, we would have to *enforce* a pause in front of
it, which we never otherwise do for consonant-initial brivla.

Isn't that too much of a privilege to give to a lowly fu'ivla?

--
Arnt Richard Johansen http://arj.nvg.org/
The problem is, witchcraft is not fantasy; it is a sinful reality in
our world. --christiananswers.net


posts: 1912


Every fu'ivla that starts with a consonant can be used
as the final rafsi of a lujvo.

Given that {i} is permissible after any vowel, and that {iy}
is a valid vowel pair, we could give every fuhivla that starts
with a consonant a medial rafsi if we use -iy- as the hyphen.
(This could be in addition to the privileged fuhivla that
have shorter rafsi, so {tci'ile} for example would have both
tci'ily- and tci'ileiy- as rafsi, just as {valsi} has valsy- and
val-.) This would be easy to implement.

fuhivla that start with a vowel still need to start with a pause,
so they can form lujvo non-initially only with zei.

mu'o mi'e xorxes

Robin Lee Powell scripsit:
> Our parser (can't call it "my parser" anymore, really) is the only
> one that believes that {neal} and {daed} are not valid cmene. Bug
> or feature?

Feature, he said firmly. The old (can't really call it "official" any more)
parser is completely clueless about morphology: all it can do is break
up compound cmavo.

--
John Cowan jcowan@reutershealth.com
http://www.ccil.org/~cowan http://www.reutershealth.com
"No, John. I want formats that are actually useful, rather than
over-featured megaliths that address all questions by piling on
ridiculous internal links in forms which are hideously over-complex."
--Simon St. Laurent on xml-dev


posts: 953

On Tue, 21 Dec 2004 wikidiscuss@lojban.org wrote:

> Re: PEG Morphology Algorithm
>
> Every fu'ivla that starts with a consonant can be used
> as the final rafsi of a lujvo.

I suppose this is a proposed change.

Apart from my general reluctance to fix something that is not broken
(which all of you are probably aware of by now) a question:

How can you tell the difference between a lujvo with a final rafsi
fu'ivla, and a stage-4 fu'ivla that just happens to have a lujvolike form
in the source language?

--
Arnt Richard Johansen http://arj.nvg.org/
<Nixon> XP kjennes ... sprengt.
<Nixon> Som om noe har eksplodert der.


On Tuesday 21 December 2004 13:31, Robin Lee Powell wrote:
> On Tue, Dec 21, 2004 at 10:27:48AM -0800, Jorge Llambías wrote:
> > --- Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:
> > > Our parser (can't call it "my parser" anymore, really) is the
> > > only one that believes that {neal} and {daed} are not valid
> > > cmene. Bug or feature?
> >
> > It was done on purpose:
>
> That's all I needed.
>
> > no non-permissible consonant or vowel pairs are allowed anywhere
> > in lojban words,
>
> I didn't realize {ae} and {ea} counted, hence my asking.
>
> > I guess those rules could be relaxed,
>
> No, that's fine; this is all English contamination in IRC anyways.

For fu'ivla, the Book has the example {kuln,r,kore,a}, and I've found
{xumrte,obromino} in IRC and used {stagrle,oxari} in a recipe. Cmene can also
have "iy" and "uy". If you want to insist that non-diphthong vowel pairs have
commas, that's fine, but they are allowed.

phma

--
..i le babzba ba zbasu
lo jbazbabu lo babjba


posts: 1912


> >
> > Every fu'ivla that starts with a consonant can be used
> > as the final rafsi of a lujvo.
>
> I suppose this is a proposed change.

It's in CLL, at least in embryonic form. Pierre worked it out in
more detail.

> Apart from my general reluctance to fix something that is not broken
> (which all of you are probably aware of by now) a question:
>
> How can you tell the difference between a lujvo with a final rafsi
> fu'ivla, and a stage-4 fu'ivla that just happens to have a lujvolike form
> in the source language?

A fu'ivla-rafsi must always be separated with "y" from any other rafsi.

If the thing between two y's (or between the start of the word and y,
or between y and the end of the word) is a string of normal rafsi, then
it has to be a string of normal rafsi and cannot be a fu'ivla-rafsi.

If the thing between y's is not a string of normal rafsi, and when
adding a vowel is a fu'ivla, then it is a fu'ivla rafsi.

If the thing between y's is neither a string of rafsi nor a
fu'ivla-rafsi, then we have a non-lojban-word.
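
A minimal Python sketch of the three cases above. The two predicates are
placeholders supplied by the caller (they are not taken from any existing
parser), and treating the last chunk as a fu'ivla without an added vowel is
an assumption about how the rule applies at the end of the word.

```python
VOWELS = "aeiou"

def classify_chunks(word, is_rafsi_string, is_fuhivla):
    """Label each y-separated chunk of a candidate lujvo.

    is_rafsi_string and is_fuhivla are caller-supplied predicates
    (hypothetical stand-ins for the real morphology rules).
    Returns a list of labels, or None for a non-lojban-word.
    """
    chunks = word.split("y")
    labels = []
    for position, chunk in enumerate(chunks):
        is_last = position == len(chunks) - 1
        if is_rafsi_string(chunk):
            # A string of normal rafsi is never read as a fu'ivla-rafsi.
            labels.append("rafsi")
        elif is_last and is_fuhivla(chunk):
            # Assumption: the final chunk already carries its own vowel.
            labels.append("fu'ivla-rafsi")
        elif not is_last and any(is_fuhivla(chunk + v) for v in VOWELS):
            # Restoring the dropped final vowel yields a fu'ivla.
            labels.append("fu'ivla-rafsi")
        else:
            return None
    return labels


# Toy predicates for illustration only; real ones need the full grammar.
toy_rafsi = {"nal"}.__contains__
toy_fuhivla = {"tci'ile"}.__contains__
```

With these toy predicates, {nalytci'ile} splits into a normal rafsi followed
by a fu'ivla-rafsi, while a chunk that is neither makes the whole string a
non-lojban-word.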

mu'o mi'e xorxes






On Tuesday 21 December 2004 15:40, wikidiscuss@lojban.org wrote:
> Re: PEG Morphology Algorithm
>
> Every fu'ivla that starts with a consonant can be used
> as the final rafsi of a lujvo.
>
> Given that {i} is permissible after any vowel, and that {iy}
> is a valid vowel pair, we could give every fuhivla that starts
> with a consonant a medial rafsi if we use -iy- as the hyphen.
> (This could be in addition to the privileged fuhivla that
> have shorter rafsi, so {tci'ile} for example would have both
> tci'ily- and tci'ileiy- as rafsi, just as {valsi} has valsy- and
> val-.) This would be easy to implement.
>
> fuhivla that start with a vowel still need to start with a pause,
> so they can form lujvo non-initially only with zei.

The way I set up valfendi, there is one class of fu'ivla that have rafsi that
can be used anywhere in a lujvo, and all others cannot be used in a lujvo
except with zei. "-iy-" was tried and discarded a long time ago.

phma
--
S Fa1>+/- !TM M-- K H T-- t? AT++ SY Te- SC- FO- D P !Tz E++ L


On Tuesday 21 December 2004 16:33, Arnt Richard Johansen wrote:
> On Tue, 21 Dec 2004 wikidiscuss@lojban.org wrote:
> > Re: PEG Morphology Algorithm
> >
> > Every fu'ivla that starts with a consonant can be used
> > as the final rafsi of a lujvo.
>
> I suppose this is a proposed change.
>
> Apart from my general reluctance to fix something that is not broken
> (which all of you are probably aware of by now) a question:
>
> How can you tell the difference between a lujvo with a final rafsi
> fu'ivla, and a stage-4 fu'ivla that just happens to have a lujvolike form
> in the source language?

A lujvo that ends with a rafsi fu'ivla always has 'y' before it (e.g.
{nalytci'ile}); a fu'ivla never has 'y' in it.

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


posts: 1912


> For fu'ivla, the Book has the example {kuln,r,kore,a}, and I've found
> {xumrte,obromino} in IRC and used {stagrle,oxari} in a recipe. Cmene can also
> have "iy" and "uy". If you want to insist that non-diphthong vowel pairs have
> commas, that's fine, but they are allowed.

I don't have a problem with allowing non-diphthong vowel pairs in cmene
as long as they are allowed in cmavo and fu'ivla as well. All we need to
do in order to allow them is eliminate the !a !e !o !u !y at the end
of the vowel rules. The syllable counting rules required for fu'ivla
are not affected by this.

The current PEG allows iy and uy in cmene and cmavo.

mu'o mi'e xorxes








posts: 1912


> The way I set up valfendi, there is one class of fu'ivla that have rafsi that
> can be used anywhere in a lujvo, and all others cannot be used in a lujvo
> except with zei.

Why do you disallow fu'ivla that start with a consonant as final rafsi?

> "-iy-" was tried and discarded a long time ago.

I remember reading something about it. How was it tried, and why
was it discarded?

mu'o mi'e xorxes







posts: 10

Jorge, this is just marvelous work — I'm in awe. (I'm also envious of the amount of free time you appear to have. :-) However, I have a concern about the overall approach you're taking — the high-level design, as it were.

The grammar in its current state does four separable things:
1. It partitions the input stream into words.
2. It validates the words, rejecting invalid vowel and consonant patterns.
3. It determines the selma'o of a cmavo.
4. It categorizes brivla into gismu, lujvo and fu'ivla.

As a result, the grammar is fearsomely complex in spots. (OK, the part that recognizes selma'o isn't complex; it's just huge.) And it could be argued that categorizing brivla really belongs to semantic analysis, not parsing.

For the sake of modularity and reducing point-complexity, I think it would be worth considering splitting the job into its components, and writing separate grammars:
1. A partitioning grammar that considers an input string, and accepts a word (cmene, brivla, cmavo or non-Lojban) from its head.
2. A validating grammar that considers a Lojban word, and rejects it (re-categorizing it as non-Lojban?) if it has invalid vowel or consonant patterns.
3. Selma'o determination might be more easily described as a symbol table lookup than as a parsing problem.
4. A grammar that considers a valid Lojban brivla, and categorizes it.
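
Item 3 could be sketched as a plain table lookup, shown here in Python. The
table holds only the few cmavo spelled out earlier in this thread; the
function name and the fallback to a generic CMAVO class for unlisted words
are assumptions for illustration.

```python
# Partial table: only entries quoted earlier in the thread.
SELMAHO = {
    "a": "A", "e": "A", "o": "A", "u": "A", "ji": "A",
    "zo'u": "ZOhU",
}

def selmaho_of(cmavo):
    """Return the selma'o of an already-validated cmavo."""
    # Unlisted cmavo fall back to the generic class (assumed behaviour).
    return SELMAHO.get(cmavo, "CMAVO")
```

The lookup assumes that partitioning and validation have already happened,
so it never has to re-check word boundaries.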

Of course this scheme depends on being able to combine multiple PEG-generated parsers into a single program. But if the parser generator takes parameters which can be used to name the input and parser functions, that shouldn't be hard.

Or is there already a consensus that the requirement is for a single grand grammar covering every relevant aspect of the language?

Clark Nelson

posts: 14214

On Tue, Dec 21, 2004 at 04:27:01PM -0800, wikidiscuss@lojban.org
wrote:
> Jorge, this is just marvelous work — I'm in awe. (I'm also
> envious of the amount of free time you appear to have. :-)
> However, I have a concern about the overall approach you're taking
> — the high-level design, as it were.

Actually, the high-level design is mine, not his. See:

http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/

> The grammar in its current state does four separable things:

Just because they *can* be separated, doesn't mean they should be.

> 1. It partitions the input stream into words.
>
> 2. It validates the words, rejecting invalid vowel and consonant patterns.
>
> 3. It determines the selma'o of a cmavo.
>
> 4. It categorizes brivla into gismu, lujvo and fu'ivla.

In fact, these are not separate actions, so far as I know, in either
jbofihe or the current official parser.

I don't consider step 2 to be distinct from step 4, by the way.

> As a result, the grammar is fearsomely complex in spots. (OK, the
> part that recognizes selma'o isn't complex; it's just huge.)

Yup. You should see the version in the main grammar.

> And it could be argued that categorizing brivla really belongs to
> semantic analysis, not parsing.

Umm, what?

> For the sake of modularity and reducing point-complexity, I think
> it would be worth considering splitting the job into its
> components, and writing separate grammars:

The problem with this is that we could argue for hours over where
the separations lie. I was vehemently opposed to separating out the
morphology from the rest of the grammar in the first place, in fact.

> Of course this scheme depends on being able to combine multiple
> PEG-generated parsers into a single program.

Already done. What you're describing might result in a noticeable
slowdown in processing, but I can't be sure.

> But if the parser generator takes parameters which can be used to
> name the input and parser functions, that shouldn't be hard.

It's a pain in the ass, but it's not hard.

> Or is there already a consensus that the requirement is for a
> single grand grammar covering every relevant aspect of the
> language?

As I said, the grammar is already in two parts: morphology and
syntax. The only reason I agreed to that, however, is that it was
pointed out that other, completely different, morphologies might
want to be used, and that that should be allowed for.

-Robin


posts: 14214

Morphology pass: text=( CMAVO=( cmavo=( cuu ) ) )

I assume this is a bug.

-Robin


On Tuesday 21 December 2004 19:05, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > The way I set up valfendi, there is one class of fu'ivla that have rafsi
> > that can be used anywhere in a lujvo, and all others cannot be used in a
> > lujvo except with zei.
>
> Why do you disallow fu'ivla that start with a consonant as final rafsi?

I don't. Why do you think I do? {nalytci'ile} is valid but {nalyskalduna} is
not.

> > "-iy-" was tried and discarded a long time ago.
>
> I remember reading something about it. How was it tried, and why
> was it discarded?

That was before my time.

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


posts: 10

>> The grammar in its current state does four separable things:
>
> Just because they *can* be separated, doesn't mean they should be.

No, of course not.

> In fact, these are not separate actions, so far as I know, in either
> jbofihe or the current official parser.

And just because they have heretofore been unified, doesn't mean that they
should be, either.

>> And it could be argued that categorizing brivla really belongs to
>> semantic analysis, not parsing.
>
> Umm, what?

Simple: for parsing purposes, a brivla is a brivla is a brivla. It's only
when you get around to trying to figure out the meaning of a sentence that
it begins to matter how it was formed, from which one can determine what it
means.

>> For the sake of modularity and reducing point-complexity, I think
>> it would be worth considering splitting the job into its
>> components, and writing separate grammars:
>
> The problem with this is that we could argue for hours over where
> the separations lie. I was vehemently opposed to separating out the
> morphology from the rest of the grammar in the first place, in fact.

Well, of course if one (very influential) participant is "vehemently opposed"
to any separation, then any proposal for separation would necessarily either
be rejected immediately, or result in hours of argument. :-)

>> Of course this scheme depends on being able to combine multiple
>> PEG-generated parsers into a single program.
>
> Already done. What you're describing might result in a noticeable
> slowdown in processing, but I can't be sure.

It might also result in a noticeable speedup. Just for example, with the
current grammar for determining selma'o, validation would be done twice:
once when &cmavo is evaluated, and again when each of the letters is
scanned, because of all the lookahead involved in all the single-letter
rules.

>> Or is there already a consensus that the requirement is for a
>> single grand grammar covering every relevant aspect of the
>> language?
>
> As I said, the grammar is already in two parts: morphology and
> syntax. The only reason I agreed to that, however, is that it was
> pointed out that other, completely different, morphologies might
> want to be used, and that that should be allowed for.

Like I say, I believe that partitioning, validation and characterization are
probably simpler considered separately than together. It takes a genius of
Jorge's caliber to write or understand a parser that does all three
simultaneously. I strongly suspect that if separate grammars were used to
solve pieces of the whole problem, each would be simple enough that many,
many more people would be able to understand them. Ideally, they would be
simple enough that it would be feasible to see whether the grammar(s) do
what the prose description says.

Clark



On Tuesday 21 December 2004 19:47, Robin Lee Powell wrote:
> On Tue, Dec 21, 2004 at 04:27:01PM -0800, wikidiscuss@lojban.org
> wrote:
> > Jorge, this is just marvelous work — I'm in awe. (I'm also
> > envious of the amount of free time you appear to have. :-)
> > However, I have a concern about the overall approach you're taking
> > — the high-level design, as it were.
>
> Actually, the high-level design is mine, not his. See:
>
> http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/
>
> > The grammar in its current state does four separable things:
>
> Just because they *can* be separated, doesn't mean they should be.

They are separated in valfendi (except it doesn't do 3).

> > 1. It partitions the input stream into words.
> >
> > 2. It validates the words, rejecting invalid vowel and consonant
> > patterns.
> >
> > 3. It determines the selma'o of a cmavo.
> >
> > 4. It categorizes brivla into gismu, lujvo and fu'ivla.
>
> > For the sake of modularity and reducing point-complexity, I think
> > it would be worth considering splitting the job into its
> > components, and writing separate grammars:
>
> The problem with this is that we could argue for hours over where
> the separations lie. I was vehemently opposed to separating out the
> morphology from the rest of the grammar in the first place, in fact.

The problem with doing it in PEG is that it appears to be impossible to check
that a string matches two different PEs with the same number of characters
matched for both. That's why every selma'o PE ends with checking for a space
or consonant, even though "cmavo" already checked for that.
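
The effect can be seen in a small Python sketch that uses regular
expressions in place of parsing expressions. The toy cmavo pattern and the
function names are assumptions for illustration, not rules from the actual
grammar. A lookahead succeeds without consuming input, so the expression
that follows it can match a shorter prefix unless it checks the word
boundary itself.

```python
import re

C = "bcdfgjklmnprstvxz"
# Toy cmavo: optional consonant, vowels with optional apostrophes,
# then a boundary (whitespace, a consonant, or end of input).
CMAVO = re.compile(rf"[{C}]?[aeiou]+(?:'[aeiou]+)*(?=[\s{C}]|$)")

def naive_zo(text):
    """Like `&cmavo z o`: lookahead, then match the letters only."""
    return CMAVO.match(text) is not None and text.startswith("zo")

def checked_zo(text):
    """Same, but the letters must also be followed by a boundary."""
    return (CMAVO.match(text) is not None
            and re.match(rf"zo(?=[\s{C}]|$)", text) is not None)
```

Given the input {zo'e}, naive_zo accepts it as {zo} because the lookahead
matched the longer word, while checked_zo rejects it; both accept {zo}
followed by a space.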

phma
--
li fi'u vu'u fi'u fi'u du li pa


posts: 14214

On Tue, Dec 21, 2004 at 08:36:31PM -0500, Pierre Abbat wrote:
> The problem with doing it in PEG is that it appears to be
> impossible to check that a string matches two different PEs with
> the same number of characters matched for both. That's why every
> selma'o PE ends with checking for a space or consonant, even
> though "cmavo" already checked for that.

1. I have no idea what that has to do with the rest of the
conversation.

2. So what? It works quite well, you'll notice.

-Robin


posts: 10

> The problem with doing it in PEG is that it appears to be impossible to
> check that a string matches two different PEs with the same number of
> characters matched for both. That's why every selma'o PE ends with
> checking for a space or consonant, even though "cmavo" already checked
> for that.

Actually, that's the problem with doing it *in a single PEG grammar*, which
is what I'm suggesting that we not do.

Clark



posts: 953

On Tue, 21 Dec 2004 wikidiscuss@lojban.org wrote:

> Re: PEG Morphology Algorithm — design
> The grammar in its current state does four separable things:

> 1. It partitions the input stream into words.
...
> 4. It categorizes brivla into gismu, lujvo and fu'ivla.

I believe that it is possible that these two tasks are not separable. In
any case, the current approach of the morphology part of the grammar does
it in a way consistent with the traditional (not fully operationalized)
method of determining which words are of what kind.

Basically, a fu'ivla is any word that fits the definition of a brivla
(consonant cluster in first five letters, not counting y or '), but is not
either a gismu or a lujvo. So a fu'ivla is a very open-ended set of
words. When cmavo are preceding a fu'ivla, there are some potential
ambiguities that we have to handle. This is done via the so-called
"slinku'i test", which is explained at:

http://www.lojban.org/tiki/tiki-index.php?page=slinku%27i

In order to do the slinku'i test, we have to know what a lujvo is like. To
know what a lujvo is like, we have to know what a rafsi is like. Final
rafsi can be gismu, so we have to match against that, too. So, only to
separate words consistently in the face of fu'ivla, we have to implement
all of these concepts. So I believe further modularization is not
possible.

--
Arnt Richard Johansen http://arj.nvg.org/
«Når jeg kommer til kloakken, er det for å rense opp - når Zola besøker det
samme sted, er det for å bade!» --Henrik Ibsen


posts: 1912


> The grammar in its current state does four separable things:
> 1. It partitions the input stream into words.
> 2. It validates the words, rejecting invalid vowel and consonant patterns.
> 3. It determines the selma'o of a cmavo.
> 4. It categorizes brivla into gismu, lujvo and fu'ivla.
>
> As a result, the grammar is fearsomely complex in spots.

Yes. Unfortunately, this is unavoidable. Lojban morphology is an
ugly monster, that's a fact.

It was me who asked Robin to separate the morphology from the main syntax
part of the grammar. The determination of selmaho is not part of what I
did, and I agree it belongs in a separate module, but the way it is written
now, you can ignore the selmaho part and it works with just "words" at the
highest level.

1, 2 and 4 are inextricably linked. You can't do one without the other.

> (OK, the part that
> recognizes selma'o isn't complex; it's just huge.) And it could be argued
> that categorizing brivla really belongs to semantic analysis, not parsing.

You can't detect valid brivla without categorizing it. Brivla is a
collection of gismu, lujvo and fuhivla rather than these being a
partition of an initial class brivla, as it were.

> For the sake of modularity and reducing point-complexity, I think it would be
> worth considering splitting the job into its components, and writing separate
> grammars:
> 1. A partitioning grammar that considers an input string, and accepts a word
> (cmene, brivla, cmavo or non-Lojban) from its head.
> 2. A validating grammar that considers a Lojban word, and rejects it
> (re-categorizing it as non-Lojban?) if it has invalid vowel or consonant
> patterns.
> 3. Selma'o determination might be more easily described as a symbol table
> lookup than as a parsing problem.
> 4. A grammar that considers a valid Lojban brivla, and categorizes it.

I tried to make the morphology as modular as possible. Validation of
consonant and vowel pairs is done at the lowest level.

Then each word class has its own module. You can't put all brivla in
a single module. You could say that a brivla is any string that ends
in a vowel and whose second consonant is part of a cluster, but then
you'd be letting in some cmavo+brivla combinations and also some
invalid stuff. It doesn't really advance you much.

> Of course this scheme depends on being able to combine multiple PEG-generated
> parsers into a single program. But if the parser generator takes parameters
> which can be used to name the input and parser functions, that shouldn't be
> hard.

I wouldn't know anything about that. The separation can be done within
a single grammar, by making a section take the output of a lower section
as its "pseudo-terminals". That's not the problem. The problem is the
inherent complexity of the grammar itself. (Indeed, when I asked Robin
to separate the morphology part this is all I had in mind.)

> Or is there already a consensus that the requirement is for a single grand
> grammar covering every relevant aspect of the language?

Not from my part. I want as much modularity as possible.

mu'o mi'e xorxes






posts: 1912



> Morphology pass: text=( CMAVO=( cmavo=( cuu ) ) )
>
> I assume this is a bug.

Any valid vowel pair is accepted in cmavo:
{.uau}, {miau}, {cuu}, {kiy}, etc.

I don't want to restrict them in cmavo unless they are equally restricted
in cmene and fu'ivla.

mu'o mi'e xorxes






posts: 1912



> On Tuesday 21 December 2004 19:05, Jorge "Llambías" wrote:
> > --- Pierre Abbat wrote:
> > > The way I set up valfendi, there is one class of fu'ivla that have rafsi
> > > that can be used anywhere in a lujvo, and all others cannot be used in a
> > > lujvo except with zei.
> >
> > Why do you disallow fu'ivla that start with a consonant as final rafsi?
>
> I don't. Why do you think I do? {nalytci'ile} is valid but {nalyskalduna} is
> not.

I meant: Why do you disallow *some* fu'ivla that start with a consonant
as final rafsi. Any fu'ivla that starts with a consonant can be used
equally unambiguously as a final rafsi.

> > > "-iy-" was tried and discarded a long time ago.
> >
> > I remember reading something about it. How was it tried, and why
> > was it discarded?
>
> That was before my time.

Well, mine too. Now is our time, so let's consider it. {iy} puts every
fu'ivla (at least those that start with a consonant) on an equal footing.
Just as with gismu, some are privileged with shorter rafsi, but all
have at least the long ones. Why not extend the same benefit to fu'ivla?

mu'o mi'e xorxes






posts: 1912


> Like I say, I believe that partitioning, validation and characterization are
> probably simpler considered separately than together. It takes a genius of
> Jorge's caliber to write or understand a parser that does all three
> simultaneously.

Thank you for the compliment, but in fact what I tried to do is
to separate them as much as I could.

> I strongly suspect that if separate grammars were used to
> solve pieces of the whole problem, each would be simple enough that many,
> many more people would be able to understand them.

The cmene, gismu and cmavo rules are very easy to understand, I would say.

The lujvo rule is somewhat complicated by the stress rules. I plan
to do a separate parallel grammar that does not handle capital
letters when this one is done, which I believe is much easier to
read. Other than that, the lujvo section is long but straightforward.

fu'ivla is probably the trickiest part to figure out.

> Ideally, they would be
> simple enough that it would be feasible to see whether the grammar(s) do
> what the prose description says.

Indeed, that's the goal. That's one reason for using terms
like "slinkuhi" and "tosmabru" for rule names for example, because
they perform the slinkuhi and tosmabru tests respectively.

mu'o mi'e xorxes








posts: 1912


> Basically, a fu'ivla is any word that fits the definition of a brivla
> (consonant cluster in first five letters, not counting y or '),

(With John Cowan's consent) I extended the definition of brivla
to "second consonant belongs to a cluster" rather than "cluster
in first five letters". The restriction in the number of leading
vowels is not well motivated. The number five just comes from the
particular restrictions on the length of vowel strings of gismu
and lujvo, but fu'ivla need not be so restricted.

> In order to do the slinku'i test, we have to know what a lujvo is like. To
> know what a lujvo is like, we have to know what a rafsi is like. Final
> rafsi can be gismu, so we have to match against that, too. So, only to
> separate words consistently in the face of fu'ivla, we have to implement
> all of these concepts. So I believe further modularization is not
> possible.

Indeed.

mu'o mi'e xorxes






On Tuesday 21 December 2004 20:46, Robin Lee Powell wrote:
> On Tue, Dec 21, 2004 at 08:36:31PM -0500, Pierre Abbat wrote:
> > The problem with doing it in PEG is that it appears to be
> > impossible to check that a string matches two different PEs with
> > the same number of characters matched for both. That's why every
> > selma'o PE ends with checking for a space or consonant, even
> > though "cmavo" already checked for that.
>
> 1. I have no idea what that has to do with the rest of the
> conversation.
>
> 2. So what? It works quite well, you'll notice.

In cmavo, yes. But when you consider lujvo and fu'ivla, especially lujvo with
fu'ivla rafsi in them, you have to check whether the y-hyphens are necessary
and whether something can be decomposed into rafsi with or without a
consonant at the front removed, and the end of the brivla is marked, not by
something simple such as a consonant, but by the stress, with an unstressed
syllable (or two, if there's a 'y' present) after the stressed one.

phma
--
Now I need a magnifier to find my eyeglasses!
-Les Perles de la médecine


posts: 10

From: "Jorge Llambías" <jjllambias2000@yahoo.com.ar>
> --- Arnt Richard Johansen wrote:
>> Basically, a fu'ivla is any word that fits the definition of a brivla
>> (consonant cluster in first five letters, not counting y or '),
>
> (With John Cowan's consent) I extended the definition of brivla
> to "second consonant belongs to a cluster" rather than "cluster
> in first five letters". The restriction in the number of leading
> vowels is not well motivated. The number five just comes from the
> particular restrictions on the length of vowel strings of gismu
> and lujvo, but fu'ivla need not be so restricted.
>
>> In order to do the slinku'i test, we have to know what a lujvo is like. To
>> know what a lujvo is like, we have to know what a rafsi is like. Final
>> rafsi can be gismu, so we have to match against that, too. So, only to
>> separate words consistently in the face of fu'ivla, we have to implement
>> all of these concepts. So I believe further modularization is not
>> possible.
>
> Indeed.

Hmm. Consider the following passage from CLL (4.3):

All brivla have the following properties:
1) always end in a vowel;
2) always contain a consonant pair in the first five letters, where "y" and
apostrophe are not counted as letters for this purpose;
3) always are stressed on the next-to-last (penultimate) syllable; this
implies that they have two or more syllables.

I always assumed this to be definitive, rather than descriptive: that any
word having all these characteristics is defined as being a brivla; not that
this happens to be true as a consequence of other rules. I also believed
that any brivla that didn't match the pattern of a gismu or lujvo was
defined to be a fu'ivla.

Are these things true or not?

Clark



On Wednesday 22 December 2004 00:24, Clark & Janiece Nelson wrote:
> Hmm. Consider the following passage from CLL (4.3):
>
> All brivla have the following properties:
> 1) always end in a vowel;
> 2) always contain a consonant pair in the first five letters, where "y" and
> apostrophe are not counted as letters for this purpose;
> 3) always are stressed on the next-to-last (penultimate) syllable; this
> implies that they have two or more syllables.
>
> I always assumed this to be definitive, rather than descriptive: that any
> word having all these characteristics is defined as being a brivla; not
> that this happens to be true as a consequence of other rules. I also
> believed that any brivla that didn't match the pattern of a gismu or lujvo
> was defined to be a fu'ivla.
>
> Are these things true or not?

There are three kinds of lerpoi that satisfy those properties but aren't
brivla: tosmabru, slinku'i, and invalid lujvo. A lerpoi beginning with a
consonant cluster and the result of prepending a cmavo to it cannot both be
brivla; either the shorter is a slinku'i, or the longer is a tosmabru (or the
cmavo has more than three letters). (I am using "tosmabru" loosely.) The set
of strings that satisfy those properties and are not tosmabru or slinku'i, I
call greater brivla space. Strings in greater brivla space that aren't brivla
are invalid lujvo. There are two kinds: errors of hyphenation (lekymoi) and
errors of rafsi (lekybumoi).

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


posts: 14214

On Tue, Dec 21, 2004 at 06:51:58PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > Morphology pass: text=( CMAVO=( cmavo=( cuu ) ) )
> >
> > I assume this is a bug.
>
> Any valid vowel pair is accepted in cmavo: {.uau}, {miau}, {cuu},
> {kiy}, etc.
>
> I don't want to restrict them in cmavo unless they are equally
> restricted in cmene and fu'ivla.

OK; nevermind then.

-Robin


posts: 14214

On Tue, Dec 21, 2004 at 06:12:38PM -0800, Clark & Janiece Nelson
wrote:
> >The problem with doing it in PEG is that it appears to be
> >impossible to check that a string matches two different PEs with
> >the same number of characters matched for both. That's why every
> >selma'o PE ends with checking for a space or consonant, even
> >though "cmavo" already checked for that.
>
> Actually, that's the problem with doing it *in a single PEG
> grammar*, which is what I'm suggesting that we not do.

It's not a problem in *either* case. Pierre is stuck on it because
he, apparently, can't think in terms of string-wise decomposition
rather than algorithmics. I've got 6600 lines of comparison output
between his parser and the PEG one, and I see no evidence that this
"problem" is a problem.

-Robin


posts: 14214

On Tue, Dec 21, 2004 at 10:27:54PM -0500, Pierre Abbat wrote:
> On Tuesday 21 December 2004 20:46, Robin Lee Powell wrote:
> > On Tue, Dec 21, 2004 at 08:36:31PM -0500, Pierre Abbat wrote:
> > > The problem with doing it in PEG is that it appears to be
> > > impossible to check that a string matches two different PEs
> > > with the same number of characters matched for both. That's
> > > why every selma'o PE ends with checking for a space or
> > > consonant, even though "cmavo" already checked for that.
> >
> > 1. I have no idea what that has to do with the rest of the
> > conversation.
> >
> > 2. So what? It works quite well, you'll notice.
>
> In cmavo, yes. But when you consider lujvo and fu'ivla, especially
> lujvo with fu'ivla rafsi in them, you have to check whether the
> y-hyphens are necessary and whether something can be decomposed
> into rafsi with or without a consonant at the front removed, and
> the end of the brivla is marked, not by something simple such as a
> consonant, but by the stress, with an unstressed syllable (or two,
> if there's a 'y' present) after the stressed one.

Once again, I can't follow what you're saying. Clearly, you and I
think in different ways when it comes to this sort of thing.

I see no evidence that the current PEG grammar is doing anything in
a substantially incorrect fashion. If such evidence comes along,
I'll let you know.

-Robin


posts: 1912


> Hmm. Consider the following passage from CLL (4.3):
>
> All brivla have the following properties:
> 1) always end in a vowel;
> 2) always contain a consonant pair in the first five letters, where "y" and
> apostrophe are not counted as letters for this purpose;
> 3) always are stressed on the next-to-last (penultimate) syllable; this
> implies that they have two or more syllables.
>
> I always assumed this to be definitive, rather than descriptive:

That is true (I changed 2 slightly to "its second consonant is always
part of a cluster", but the point is the same). Those are properties
that all brivla have, but not everything with those properties is
a brivla.

> that any
> word having all these characteristics is defined as being a brivla; not that
> this happens to be true as a consequence of other rules.

No, that's not the case. For instance: {tosmabru}, {slinku'i}, {kniku},
{loertu}, {bytumi} all have those properties but are not brivla.
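
(A minimal sketch, in Python, of why those properties are necessary but not sufficient. The helper name is hypothetical, and it checks only properties 1 and 2 from the CLL passage; stress, property 3, is ignored. All five words above pass it, yet none of them is accepted as a brivla by the argument here.)

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
VOWELS = set("aeiou")

def has_brivla_properties(word):
    """Check CLL 4.3 properties 1 and 2 only (hypothetical helper).

    A True result does NOT mean the word is a brivla; it only means the
    word is not ruled out by these two properties.
    """
    w = word.lower().replace(",", "")
    # Property 1: the word ends in a vowel.
    if not w or w[-1] not in VOWELS:
        return False
    # Property 2: a consonant pair within the first five letters,
    # where "y" and apostrophe are not counted as letters.
    head = [c for c in w if c not in "y'"][:5]
    return any(a in CONSONANTS and b in CONSONANTS
               for a, b in zip(head, head[1:]))

for w in ["tosmabru", "slinku'i", "kniku", "loertu", "bytumi"]:
    print(w, has_brivla_properties(w))
```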

> I also believed
> that any brivla that didn't match the pattern of a gismu or lujvo was
> defined to be a fu'ivla.

That's true, but only because gismu, lujvo and fu'ivla are all the
kind of brivla there are. You first need to find if it's a gismu,
a lujvo or a fu'ivla, and only then can you conclude that it's
a brivla. You can't start with brivla without detecting the others
first.

mu'o mi'e xorxes







On Wednesday 22 December 2004 07:11, Jorge "Llambías" wrote:
> No, that's not the case. For instance: {tosmabru}, {slinku'i}, {kniku},
> {loertu}, {bytumi} all have those properties but are not brivla.

I say {loertu} is a brivla (properly written {lo,ertu}), and the others
aren't.

phma
--
Mes règles mensuelles ont lieu une fois par an.
-Les Perles de la médecine


posts: 1912



> On Wednesday 22 December 2004 07:11, Jorge "Llambías" wrote:
> > No, that's not the case. For instance: {tosmabru}, {slinku'i}, {kniku},
> > {loertu}, {bytumi} all have those properties but are not brivla.
>
> I say {loertu} is a brivla (properly written {lo,ertu}), and the others
> aren't.

Yes, that's still a dubious case. I'm willing to go either way,
as long as cmene, cmavo and fu'ivla are all given the same
treatment.

mu'o mi'e xorxes







posts: 1912


Humanly readable algorithm for identifying fu'ivla.

A "syllable" is any permissible consonant cluster, or an apostrophe, or nothing, followed by a diphthong or by a single vowel.

Given a string of characters:

1. Check that it does not start with a cmene, a gismu or a lujvo.

2. Check whether it starts with a fu'ivla-head. A fu'ivla-head is something that looks like a cmavo without any y's. If there is no fu'ivla-head, go straight to 3.

A. If the fu'ivla-head is not followed by a consonant cluster, there is no fu'ivla (the head will fall off as a cmavo).

B. If the fu'ivla-head is followed by a non-initial cluster and one or more syllables, we have a fu'ivla. If one of the syllables is stressed, the fu'ivla ends with the next syllable, otherwise it ends after the final syllable.

C. If the fu'ivla-head is followed by a permissible cluster, it may fall off. There is one case where it is saved: if only a single syllable follows the cluster, or if the head has a final stress so that it will accept only one more syllable. In those cases we have a fu'ivla.

3. If there is no fu'ivla-head, that means we have a cluster. If it is not an initial cluster, we don't have a valid word. If it is an initial cluster, it has to be followed by at least two syllables, and you need to check that adding {le} in front (or any other CV cmavo) does not convert it into a lujvo. If that doesn't happen, we have a fu'ivla.

In summary, we have just three types of possible fu'ivla:

1- Head fu'ivla with non-initial cluster plus tail.
2- Head fu'ivla with initial-cluster plus a single syllable ("short-tail")
3- Headless fu'ivla that pass the slinku'i test

mu'o mi'e xorxes

Jorge Llambías scripsit:

> Any valid vowel pair is accepted in cmavo:
> {.uau}, {miau}, {cuu}, {kiy}, etc.
>
> I don't want to restrict them in cmavo unless they are equally restricted
> in cmene and fu'ivla.

The iy and uy sequences have *always* been restricted to use in cmene,
because cmene are a wastebasket category that can't threaten the morphology:
the only reason to restrict things in cmene is to preserve audio-visual
isomorphism. Other than that, they are reserved for something so important
and overriding that we absolutely need them. Using iy as fu'ivla-rafsi glue
falls in that category, but allowing them in random user-constructed
fu'ivla definitely does not. As for cmavo, we have more than enough long
cmavo capability without any need to allow iy and uy there.

It's my considered opinion that vowel glides beyond the standard diphthongs
shouldn't exist in Lojban at all, for the same reason that the forbidden
consonant clusters are forbidden: glides too threaten audio-visual isomorphism.
It's very hard to reliably distinguish between {u,au} and {u,uau}, or between
{ua,u} and {ua,uu}. In fact, uu and ii are only tolerable IMHO because they
are always preceded by ".".

It's time to tighten up. Sixteen diphthongs only, and of those, "iy" and "uy"
in cmene only, unless we can prove that "iy" is really fully usable for
fu'ivla-rafsi. The few corpus words that conflict with this should be
replaced one way or another.

--
But that, he realized, was a foolish John Cowan
thought; as no one knew better than he jcowan@reutershealth.com
that the Wall had no other side. http://www.ccil.org/~cowan
--Arthur C. Clarke, "The Wall of Darkness"


Jorge Llambías scripsit:

> Well, mine too. Now is our time, so let's consider it. {iy} puts every
> fu'ivla (at least those that start with a consonant) on an equal footing.
> Just as with gismu, some are privileged with shorter rafsi, but all
> have at least the long ones. Why not extend the same benefit to fu'ivla?

I'm not sure if Nora was able to prove that iy didn't work, or if she was
simply unable to prove that it did, but she's the person to ask.

--
When I'm stuck in something boring John Cowan
where reading would be impossible or (who loves Asimov too)
rude, I often set up math problems for jcowan@reutershealth.com
myself and solve them as a way to pass http://www.ccil.org/~cowan
the time. --John Jenkins http://www.reutershealth.com


posts: 1912


> The iy and uy sequences have *always* been restricted to use in cmene,
> because cmene are a wastebasket category that can't threaten the morphology:
> the only reason to restrict things in cmene is to preserve audio-visual
> isomorphism. Other than that, they are reserved for something so important
> and overriding that we absolutely need them. Using iy as fu'ivla-rafsi glue
> falls in that category, but allowing them in random user-constructed
> fu'ivla definitely does not. As for cmavo, we have more than enough long
> cmavo capability without any need to allow iy and uy there.

I agree "y" should not be allowed in fu'ivla other than as a hyphen
for rafsi. I don't see how allowing anything in cmavo is more or
less threatening to audio-visual isomorphism than allowing them in
cmene, especially since new cmavo will be extremely rare anyway,
whereas new cmene crop up all the time.

> It's my considered opinion that vowel glides beyond the standard diphthongs
> shouldn't exist in Lojban at all, for the same reason that the forbidden
> consonant clusters are forbidden: glides too threaten audio-visual
> isomorphism.

That's how the PEG is set to work now:

a can never be followed by a, e, o or y
e can never be followed by a, e, o, u or y
o can never be followed by a, e, o, u or y
y can never be followed by a, e, i, o, or u

Those restrictions are absolute, no matter if there are
intervening commas. (An intervening apostrophe allows
any pair, the vowels are not adjacent then.)

In gismu: Only a, e, i, o, u. No vowel can be followed by another vowel.
In lujvo: ai, au, ei, oi are the only pairs allowed. y allowed as hyphen.
In fu'ivla: (i/u)(a/e/i/o/u) are added. Possibly iy as hyphen in lujvo.
In cmene and lujvo: iy, uy and yy are added.

fu'ivla, cmene and lujvo allow longer strings of vowels as
long as each adjacent pair is allowed.
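
(A minimal sketch of the absolute adjacency restrictions listed above, in Python. This is not the PEG itself; the table and function names are hypothetical, and the per-word-class additions for gismu, lujvo, fu'ivla and cmene are left out. Commas are dropped before checking, since they do not lift the restriction, while an apostrophe stays in place and so separates the vowels.)

```python
# For each vowel, the vowels that may never follow it directly.
FORBIDDEN_AFTER = {
    "a": "aeoy",
    "e": "aeouy",
    "o": "aeouy",
    "y": "aeiou",
}

def adjacent_vowels_ok(word):
    """Return True if no absolutely forbidden vowel pair occurs."""
    # Commas do not separate vowels for this purpose; apostrophes do.
    letters = word.lower().replace(",", "")
    for first, second in zip(letters, letters[1:]):
        if second in FORBIDDEN_AFTER.get(first, ""):
            return False
    return True

for w in ["miau", "kiy", "vo'e", "lo,ertu"]:
    print(w, adjacent_vowels_ok(w))
```

Under these assumptions {lo,ertu} fails on the pair "oe" even with the comma, which is the dubious case discussed earlier in the thread.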

> It's very hard to reliably distinguish between {u,au} and {u,uau}, or between
> {ua,u} and {ua,uu}. In fact, uu and ii are only tolerable IMHO because they
> are always preceded by ".".

It would be relatively easy to forbid vowel triples everywhere. We just
add "!(vowel-y vowel-y)" at the end of the vowel rules.

> It's time to tighten up. Sixteen diphthongs only, and of those, "iy" and
> "uy"
> in cmene only, unless we can prove that "iy" is really fully usable for
> fu'ivla-rafsi. The few corpus words that conflict with this should be
> replaced one way or another.

I don't see any reason to make cmene different from cmavo as far
as vowels are concerned. If it's pronounceable in Lojban, it can
be allowed in both places without any ambiguity. If it's not
pronounceable in Lojban, then obviously it is not pronounceable
anywhere. Arbitrarily restricting one but not the other in the
absence of ambiguity doesn't seem right.

mu'o mi'e xorxes









posts: 14214

On Wed, Dec 22, 2004 at 12:09:30PM -0500, John Cowan wrote:
> Jorge Llambías scripsit:
>
> > Any valid vowel pair is accepted in cmavo: {.uau}, {miau},
> > {cuu}, {kiy}, etc.
> >
> > I don't want to restrict them in cmavo unless they are equally
> > restricted in cmene and fu'ivla.
>
> The iy and uy sequences have *always* been restricted to use in
> cmene, because cmene are a wastebasket category that can't
> threaten the morphology:

As I've said before, I'm not reading any of the hardcore morphology
discussion. I *hope*, though, that someone is writing up anything
that might be controversial, for when it comes time to vote, just as
I did with the main grammar.

-Robin


posts: 14214

On Tue, Dec 21, 2004 at 06:49:01PM -0800, Jorge Llambías wrote:
>
> > The grammar in its current state does four separable things:
> > 1. It partitions the input stream into words.
> > 2. It validates the words, rejecting invalid vowel and consonant patterns.
> > 3. It determines the selma'o of a cmavo.
> > 4. It categorizes brivla into gismu, lujvo and fu'ivla.
> >
> > As a result, the grammar is fearsomely complex in spots.
>
> Yes. Unfortunately, this is unavoidable. Lojban morphology is an
> ugly monster, that's a fact.
>
> It was me who asked Robin to separate the morphology from the main
> syntax part of the grammar. The determination of selmaho is not
> part of what I did, and I agree it belongs in a separate module,
> but the way it is written now, you can ignore the selmaho part and
> it works with just "words" at the highest level.

I could move the selma'o determination to the main grammar, and may
very well do so, but it was easier at the time to add it to the
morphology.

-Robin


posts: 1912


> I'm not sure if Nora was able to prove that iy didn't work, or if she was
> simply unable to prove that it did, but she's the person to ask.

It all depends on how it was supposed to work. fu'ivla-rafsi
can't be combined with all normal rafsi: the immediately preceding
one has to be CVCy-, CCVCy-, CVCCy- or another fu'ivla-rafsi.
This is always achievable because every gismu has a four-letter
rafsi. Given that, iy clearly does work: it can be added unambiguously
after any fu'ivla and it could never be confused with a normal
rafsi. The assumption here is that "i" can be added to any final
vowel string. If we forbid vowel triples this won't work.

mu'o mi'e xorxes






posts: 1912


> I could move the selma'o determination to the main grammar, and may
> very well do so, but it was easier at the time to add it to the
> morphology.

I don't have a problem with it as it stands. In fact, I wouldn't
want selmaho sorting to be a part of the main grammar. I see
three stages: (1) A character string split into words by their form,
(2) words sorted into selmaho (3) a string of selmaho parsed as
sentences. Each stage isolated from the others as much as possible
(not necessarily in different files, just in clearly delimited
sections).

Stage (2) is longwinded but trivial: three word forms collapse into
one selma'o (BRIVLA), one word form is moved directly into
its own selma'o (CMENE) and one word form is split into a hundred
and some selmaho (A, BAI, ... ZOhU).
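
(A minimal sketch of stage (2) in Python, assuming stage (1) has already labelled each word with its form. The function name and the tiny cmavo table are hypothetical; the table holds only a few entries that appear in this thread, where the real one would cover every selma'o. Unlisted cmavo fall through to a catch-all CMAVO class.)

```python
# A deliberately tiny sample table; the real list has over a hundred selma'o.
SAMPLE_CMAVO = {
    "a": "A", "e": "A", "o": "A", "u": "A", "ji": "A",
    "mu": "PA",
    "la": "LA",
    "zo'u": "ZOhU",
}

def to_selmaho(form, word):
    """Map a (word form, word) pair from stage 1 to a selma'o name."""
    if form in ("gismu", "lujvo", "fu'ivla"):
        return "BRIVLA"          # three word forms collapse into one class
    if form == "cmene":
        return "CMENE"           # moved directly into its own class
    if form == "cmavo":
        return SAMPLE_CMAVO.get(word, "CMAVO")  # split by lookup
    raise ValueError("not a Lojban word form: %r" % form)

print(to_selmaho("gismu", "stela"))   # BRIVLA
print(to_selmaho("cmavo", "zo'u"))    # ZOhU
```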

mu'o mi'e xorxes







posts: 14214

Cage match between camxes and valfendi, round one:

$ echo "muSTEl,aVIson" | valfendi -a -l -s
>muSTE< -l,a VIson.

(which means "muSTE" is a non-Lojban word, "l,a" is a cmavo, and
"VIson" is a cmene)

$ echo "muSTEl,aVIson" | myparser -m
text
|- CMAVO
| PA: mu
|- BRIVLA
| gismu: STEl,a
|- CMENE
cmene: VIson

Seems to me like valfendi's bad here, but I'll let you guys fight it
out.

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 10:58:13AM -0800, Robin Lee Powell wrote:
> Cage match between camxes and valfendi, round one:
>
> $ echo "muSTEl,aVIson" | valfendi -a -l -s
> >muSTE< -l,a VIson.
>
> (which means "muSTE" is a non-Lojban word, "l,a" is a cmavo, and
> "VIson" is a cmene)
>
> $ echo "muSTEl,aVIson" | myparser -m
> text
> |- CMAVO
> | PA: mu
> |- BRIVLA
> | gismu: STEl,a
> |- CMENE
> cmene: VIson
>
> Seems to me like valfendi's bad here, but I'll let you guys fight it
> out.

Heh.

*** Sentence: muSTElaVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu (STEla) VIson.

*** Sentence: muSTEla.VIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu (STEla) VIson.

*** Sentence: muSTE.laVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu (STEla) VIson.

*** Sentence: muSTE.la.VIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu (STEla) VIson.

*** Sentence: muSTEl,aVIson 1
MISMATCH!
valfendi: >muSTE< -l,a VIson.
pegbased: -mu (STEl,a) VIson.

These have been normalized to valfendi's format, which is:

>nonLojbanWord< (brivla) -ma'o cmen.

valfendi *really* doesn't want to take that mu.
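
(A minimal sketch of that normalization, in Python. It is not the actual test script, which is not shown in this thread; the function name and the (kind, text) token representation are assumptions made for illustration.)

```python
def to_valfendi_format(tokens):
    """Render (kind, text) pairs in valfendi's one-line notation.

    kind is one of "nonlojban", "brivla", "cmavo", "cmene".
    """
    rendered = []
    for kind, text in tokens:
        if kind == "nonlojban":
            rendered.append(">" + text + "<")
        elif kind == "brivla":
            rendered.append("(" + text + ")")
        elif kind == "cmavo":
            rendered.append("-" + text)
        elif kind == "cmene":
            rendered.append(text + ".")
        else:
            raise ValueError("unknown token kind: %r" % kind)
    return " ".join(rendered)

# The PEG parse of muSTElaVIson shown above: mu / STEla / VIson
print(to_valfendi_format([("cmavo", "mu"), ("brivla", "STEla"), ("cmene", "VIson")]))
```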

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 11:36:40AM -0800, Robin Lee Powell wrote:
> On Wed, Dec 22, 2004 at 10:58:13AM -0800, Robin Lee Powell wrote:
> > Cage match between camxes and valfendi, round one:
> >
> > $ echo "muSTEl,aVIson" | valfendi -a -l -s
> > >muSTE< -l,a VIson.
> >
> > (which means "muSTE" is a non-Lojban word, "l,a" is a cmavo, and
> > "VIson" is a cmene)
> >
> > $ echo "muSTEl,aVIson" | myparser -m
> > text
> > |- CMAVO
> > | PA: mu
> > |- BRIVLA
> > | gismu: STEl,a
> > |- CMENE
> > cmene: VIson
> >
> > Seems to me like valfendi's bad here, but I'll let you guys fight it
> > out.
>
> Heh.
>
> *** Sentence: muSTElaVIson 1
> MISMATCH!
> valfendi: >muSTE< -la VIson.
> pegbased: -mu (STEla) VIson.

Despite my insistence to not get involved, we figured this out in
#lojban. They're both full of shit.

camxes is allowing cmene without a preceding pause *or* {la}, which
is especially insane if it's supposed to allow cmene with la/lai/doi
in them, which modification *requires* that there be a pause before
*all* cmene.

valfendi, OTOH, is invalidating potentially valid words on the
*left* (mu stela) in favour of valid words on the *right* (la
vison), which is so much unlike how a human listener would deal with
the issue that I'm quite stunned.

-Robin



posts: 1912



> *** Sentence: muSTElaVIson 1
> MISMATCH!
> valfendi: >muSTE< -la VIson.
> pegbased: -mu (STEla) VIson.

Hmmm... This is arguable both ways, because names are
not supposed to be allowed without an initial pause
unless they follow doi/la/lai/la'i. We'd have to decide
whether the stress-syllable rule for brivla or the
rule for cmene has priority. (I go with left-to-right
priority.)

> *** Sentence: muSTE.laVIson 1
> MISMATCH!
> valfendi: >muSTE< -la VIson.
> pegbased: -mu (STEla) VIson.

Oh dear, does PEG really do that?
Which rule allows it to absorb that dot?

mu'o mi'e xorxes







posts: 14214

On Wed, Dec 22, 2004 at 12:03:57PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > *** Sentence: muSTElaVIson 1
> > MISMATCH!
> > valfendi: >muSTE< -la VIson.
> > pegbased: -mu (STEla) VIson.
>
> Hmmm... This is arguable both ways, because names are not supposed
> to be allowed without an initial pause unless they follow
> doi/la/lai/la'i. We'd have to decide whether the stress-syllable
> rule for brivla or the rule for cmene has priority. (I go with
> left-to-right priority.)

I would as well.

> > *** Sentence: muSTE.laVIson 1
> > MISMATCH!
> > valfendi: >muSTE< -la VIson.
> > pegbased: -mu (STEla) VIson.
>
> Oh dear, does PEG really do that?

As it turns out, no:

text
|- CMAVO
| PA: mu
|- nonLojbanWord: STE
|- spaces: .
|- CMAVO
| LA: la
|- CMENE
cmene: VIson

AKA -mu >STE.< -la VIson.

Let me figure out where my test script is borked.

-Robin


posts: 1912



> > *** Sentence: muSTElaVIson 1
>
> camxes is allowing cmene without a preceding pause *or* {la}, which
> is especially insane if it's supposed to allow cmene with la/lai/doi
> in them, which modification *requires* that there be a pause before
> *all* cmene.

We don't require a pause before doi/la/lai/la'i. If we did,
there would be no point to the !doi-la-lai-la'i restriction.
Both camxes and valfendi allow doi/la/lai/la'i when preceded by
a consonant or followed by a vowel, that's all, because in those
cases there is no ambiguity.

I would vote to abolish the doi-la-lai-la'i restriction on
cmene altogether, and always require pauses at both ends of
cmene. Then this sentence would not be problematic, it's
just a cmene.

mu'o mi'e xorxes







posts: 14214

> > > *** Sentence: muSTE.laVIson 1
> > > MISMATCH!
> > > valfendi: >muSTE< -la VIson.
> > > pegbased: -mu (STEla) VIson.
> >
> > Oh dear, does PEG really do that?
>
> As it turns out, no:
[snip]
> Let me figure out where my test script is borked.

Fixed:

*** Sentence: muSTE.laVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu >STE.< -la VIson.

The new mismatch list for these words is:

*** Sentence: muSTElaVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu (STEla) VIson.

*** Sentence: MUstela.VIson 1
MISMATCH!
valfendi: (MUste) -la VIson.
pegbased: (MUste) -la. VIson.

*** Sentence: muSTEla.VIson 1
MISMATCH!
valfendi: -mu (STEla) VIson.
pegbased: -mu (STEla.) VIson.

*** Sentence: MUste.laVIson 1
MISMATCH!
valfendi: (MUste) -la VIson.
pegbased: (MUste.) -la VIson.

*** Sentence: muSTE.laVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu >STE.< -la VIson.

*** Sentence: MUste.la.VIson 1
MISMATCH!
valfendi: (MUste) -la VIson.
pegbased: (MUste.) -la. VIson.

*** Sentence: muSTE.la.VIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu >STE.< -la. VIson.

*** Sentence: muSTEl,aVIson 1
MISMATCH!
valfendi: >muSTE< -l,a VIson.
pegbased: -mu (STEl,a) VIson.

Unfortunately, valfendi drops . and my parser doesn't, so these
don't compare properly in several cases. I'll have to hack my test
script more.
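
(One way such a hack could look, as a minimal Python sketch. The actual change to the test script is not shown in this thread, so the function name and approach are assumptions. Pause dots are dropped from cmavo, brivla and non-Lojban tokens, which start with "-", "(" or ">", while the trailing dot that marks a cmene in valfendi's format is kept.)

```python
def drop_pause_dots(line):
    """Remove '.' from every token except cmene, which keep their final dot."""
    cleaned = []
    for token in line.split():
        # Tokens starting with -, ( or > are cmavo, brivla or non-Lojban words.
        if token[0] in "-(>":
            token = token.replace(".", "")
        cleaned.append(token)
    return " ".join(cleaned)

print(drop_pause_dots("-mu >STE.< -la. VIson."))
print(drop_pause_dots("(MUste.) -la VIson."))
```

With this applied to the PEG output, a pair such as "(MUste) -la VIson." against "(MUste.) -la. VIson." would compare equal, leaving only the mismatches that differ in the words themselves.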

<sigh>

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 12:24:52PM -0800, Robin Lee Powell wrote:
> Unfortunately, valfendi drops . and my parser doesn't, so these
> don't compare properly in several cases. I'll have to hack my
> test script more.

Fixed, new list:

*** Sentence: muSTElaVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu (STEla) VIson.

*** Sentence: muSTE.laVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu >STE< -la VIson.

*** Sentence: muSTE.la.VIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu >STE< -la VIson.

*** Sentence: muSTEl,aVIson 1
MISMATCH!
valfendi: >muSTE< -l,a VIson.
pegbased: -mu (STEl,a) VIson.

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 12:23:19PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
>
> > > *** Sentence: muSTElaVIson 1
> >
> > camxes is allowing cmene without a preceding pause *or* {la},
> > which is especially insane if it's supposed to allow cmene with
> > la/lai/doi in them, which modification *requires* that there be
> > a pause before *all* cmene.
>
> We don't require a pause before doi/la/lai/la'i. If we did, there
> would be no point to the !doi-la-lai-la'i restriction. Both camxes
> and valfendi allow doi/la/lai/la'i when preceded by a consonant or
> followed by a vowel, that's all, because in those cases there is
> no ambiguity.
>
> I would vote to abolish the doi-la-lai-la'i restriction on cmene
> altogether, and always require pauses at both ends of cmene.

Umm, I thought that you were already doing this? Oh, I seem to have
misunderstood valfendi's -a option:

-a cmevla can contain {la'u}, {doie}, etc.

That's a separate issue, isn't it?

> Then this sentence would not be problematic, it's just a cmene.

Right.

-Robin


Jorge Llambías scripsit:

> Hmmm... This is arguable both ways, because names are
> not supposed to be allowed without an initial pause
> unless they follow doi/la/lai/la'i. We'd have to decide
> whether the stress-syllable rule for brivla or the
> rule for cmene has priority. (I go with left-to-right
> priority.)

The traditional understanding (as in the 1988 morphology) is that
the cmene rule has absolute priority: a cmene is a maximal string
of letters ending with consonant+pause and not containing la, lai,
or doi (unless preceded by a consonant).
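
(A minimal sketch of that rule in Python, under simplifying assumptions: the input is the run of letters up to a pause, stress capitals are folded to lower case, and only la, lai and doi are treated as markers, as stated above. The names are hypothetical; this is neither the 1988 algorithm's code nor what camxes or valfendi actually do.)

```python
import re

CONSONANTS = "bcdfgjklmnprstvxz"
# la, lai or doi that is NOT immediately preceded by a consonant.
MARKER = re.compile(r"(?<![%s])(?:lai|la|doi)" % CONSONANTS)

def cmene_before_pause(text):
    """Return the maximal cmene at the end of text, or None.

    text is the letter string that precedes a pause. The cmene is the
    longest tail that ends in a consonant and contains no la/lai/doi
    marker (markers preceded by a consonant do not count).
    """
    s = text.lower()
    if not s or s[-1] not in CONSONANTS:
        return None              # a cmene must end in a consonant
    start = 0
    for match in MARKER.finditer(s):
        start = match.end()      # the cmene begins after the last marker
    return s[start:]

print(cmene_before_pause("muSTElaVIson"))
```

On {muSTElaVIson} this cuts at the {la} after "e", leaving {vison}; in a string like "karlas" the "la" follows a consonant, so the whole string stays one cmene.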

--
At the end of the Metatarsal Age, the dinosaurs John Cowan
abruptly vanished. The theory that a single jcowan@reutershealth.com
catastrophic event may have been responsible www.reutershealth.com
has been strengthened by the recent discovery of www.ccil.org/~cowan
a worldwide layer of whipped cream marking the
Creosote-Tutelary boundary. --Science Made Stupid


Jorge Llambías scripsit:

> I would vote to abolish the doi-la-lai-la'i restriction on
> cmene altogether, and always require pauses at both ends of
> cmene. Then this sentence would not be problematic, it's
> just a cmene.

Arrgh. Pauses are obnoxious, and the more of them, the worse.

--
Andrew Watt on Microsoft: John Cowan
"Never in the field of human computing jcowan@reutershealth.com
has so much been paid by so many http://www.ccil.org/~cowan
to so few!" (pace Winston Churchill) http://www.reutershealth.com


posts: 1912



> The traditional understanding (as in the 1988 morphology) is that
> the cmene rule has absolute priority: a cmene is a maximal string
> of letters ending with consonant+pause and not containing la, lai,
> or doi (unless preceded by a consonant).

Yes, we all agree with that part.

muSTElaVIson

produces {VIson} as a cmene. That's the maximal string
of letters ending with consonant+pause and not containing
la, lai, or doi

The question is: What about cmene that follow a syllable
{la} that is not a cmavo. Do they have to begin with a pause
or not?

mu'o mi'e xorxes






posts: 14214

So far as I know, Lojban syllabification has never been clearly
defined.

I assume that both camxes and valfendi make decisions about this.
Is there any way to know if they've made the same decisions?

-Robin


posts: 1912


>
> Arrgh. Pauses are obnoxious, and the more of them, the worse.

Don't some languages have the glottal stop as just another phoneme?
Are they obnoxious just because we don't happen to have them as
phonemes in our native language, or are they obnoxious in a more
fundamental naturalistic sense?

mu'o mi'e xorxes








posts: 1912


> So far as I know, Lojban syllabification has never been clearly
> defined.
>
> I assume that both camxes and valfendi make decisions about this.
> Is there any way to know if they've made the same decisions?

There is no need to fully define syllabification. All we need
is to agree on what counts as a syllable core, and we all
agree about that: Only ai, au, ei, oi, ia, ie, ii, io, iu,
ua, ue, ui, uo, uu, a, e, i, o, and u count as syllable cores.
Nothing else. And vowel strings of more than two vowels break
in pairs from the left, as long as the pair is a valid diphthong.

The only doubt is whether we can use a comma to override
the left to right pairing. We seem to agree that we cannot.
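
(A minimal sketch of the left-to-right pairing in Python. The function name is hypothetical; the input is a bare vowel string with no commas or apostrophes, and no claim is made about which strings are actually valid in which word class.)

```python
DIPHTHONGS = {
    "ai", "au", "ei", "oi",
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
}

def split_syllable_cores(vowels):
    """Split a vowel string into syllable cores, pairing from the left."""
    cores = []
    i = 0
    while i < len(vowels):
        pair = vowels[i:i + 2]
        if pair in DIPHTHONGS:
            cores.append(pair)       # take a valid diphthong as one core
            i += 2
        else:
            cores.append(vowels[i])  # otherwise a single vowel is a core
            i += 1
    return cores

print(split_syllable_cores("uau"))
print(split_syllable_cores("aia"))
```

So "uau" comes out as ua + u, never u + au, which is exactly the reading a comma would be needed to override.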

mu'o mi'e xorxes








Jorge Llambías scripsit:

> Don't some languages have the glottal stop as just another phoneme?
> Are they obnoxious just because we don't happen to have them as
> phonemes in our native language, or are they obnoxious in a more
> fundamental naturalistic sense?

Many languages have glottal stop as a phoneme, and it's not rare in
some varieties of English as an allophone of /t/. But few languages have
consonant clusters involving it, as Lojban does.

--
Not to perambulate John Cowan <jcowan@reutershealth.com>
the corridors http://www.reutershealth.com
during the hours of repose http://www.ccil.org/~cowan
in the boots of ascension. --Sign in Austrian ski-resort hotel


Robin Lee Powell scripsit:
> So far as I know, Lojban syllabification has never been clearly
> defined.

The rule is, AFAIK, that every vowel or diphthong makes a syllable, and
that consonant clusters are split between syllables unless they are a
permissible initial, in which case they both belong to the following
syllable. All such rules are artificial: there is no objective
cross-language definition of "syllable".

--
"And it was said that ever after, if any John Cowan
man looked in that Stone, unless he had a jcowan@reutershealth.com
great strength of will to turn it to other www.ccil.org/~cowan
purpose, he saw only two aged hands withering www.reutershealth.com
in flame." --"The Pyre of Denethor"


On Wednesday 22 December 2004 13:11, Jorge "Llambías" wrote:
> That's how the PEG is set to work now:
>
> a can never be followed by a, e, o or y
> e can never be followed by a, e, o, u or y
> o can never be followed by a, e, o, u or y
> y can never be followed by a, e, i, o, or u
>
> Those restrictions are absolute, no matter if there are
> intervening commas. (An intervening apostrophe allows
> any pair, the vowels are not adjacent then.)
>
> In gismu: Only a, e, i, o, u. No vowel can be followed by another vowel.
> In lujvo: ai, au, ei, oi are the only pairs allowed. y allowed as hyphen.
> In fu'ivla: (i/u)(a/e/i/o/u) are added. Possibly iy as hyphen in lujvo.
> In cmene and lujvo: iy, uy and yy are added.
>
> fu'ivla, cmene and lujvo allow longer strings of vowels as
> long as each adjacent pair is allowed.

Valfendi is set up like this:
In gismu: Only a, e, i, o, u.
In lujvo: ai, au, ei, oi are the only pairs allowed. y allowed as hyphen.
In fu'ivla: All pairs (a/e/i/o/u)(a/e/i/o/u) are allowed. If the two vowels do
not form a diphthong, they are assumed to have a comma between them.
Arbitrarily long vowel strings are allowed.
In fu'ivla lujvo: All pairs (a/e/i/o/u)(a/e/i/o/u) are allowed. y is allowed
between consonants and must occur at least once.
In cmene: All pairs are allowed. iy and uy are diphthongs. If the two vowels
do not form a diphthong, they are assumed to have a comma between them.
In cmavo: ai, au, ei, oi are allowed in arbitrarily long cmavo.
(i/u)(a/e/i/o/u) are allowed in two-letter cmavo. No more than two vowels in
a row are allowed; a cmavo with more than two vowels must contain an
apostrophe.

phma
--
Maintenant, j'ai besoin d'une loupe pour trouver mes lunettes!
-Les Perles de la médecine


On Wednesday 22 December 2004 09:20, wikidiscuss@lojban.org wrote:
> Re: PEG Morphology Algorithm
>
> Humanly readable algorithm for identifying fu'ivla.
>
> A "syllable" is any permissible consonant cluster, or an apostrophe, or
> nothing, followed by a diphthong or by a single vowel.
>
> Given a string of characters:
>
> 1. Check that it does not start with a cmene, a gismu or a lujvo.
>
> 2. Check whether it starts with a fu'ivla-head. A fu'ivla-head is something
> that looks like a cmavo without any y's. If there is no fu'ivla-head, go
> straight to 3.
>
> A. If the fu'ivla-head is not followed by a consonant cluster, there is
> no fu'ivla (the head will fall off as a cmavo).
>
> B. If the fu'ivla-head is followed by a non-initial cluster and one or
> more syllables, we have a fu'ivla. If one of the syllables is stressed, the
> fu'ivla ends with the next syllable, otherwise it ends after the final
> syllable.
>
> C. If the fu'ivla-head is followed by a permissible cluster, it may fall
> off. There is one case where it is saved: if only a single syllable follows
> the cluster, or if the head has a final stress so that it will accept only
> one more syllable. In those cases we have a fu'ivla.

There are two cases. The other is that the string following the head is a
slinku'i, and the fu'ivla-head is not of CV form (which makes a lujvo). E.g.
{suicmardi} is a fu'ivla.

phma
--
GCS/M d- s-: a+ C++ UL++++$ P+ L+++ E- W+++ N+ o? K? w-- O? M- V- Y++
PGP++ t- 5? X? R- !tv b++ DI !D G e++ h+>---- r- y>+++


posts: 14214


Round two:

valfendi: -la mer. samo,as.
pegbased: -la mer. -sa >mo,as<

-Robin


On Wednesday 22 December 2004 13:58, Robin Lee Powell wrote:
> Cage match between camxes and valfendi, round one:
>
> $ echo "muSTEl,aVIson" | valfendi -a -l -s
>
> >muSTE< -l,a VIson.
>
> (which means "muSTE" is a non-Lojban word, "l,a" is a cmavo, and
> "VIson" is a cmene)

On finding {la} before a cmene without a pause between them, it breaks off
what's on both sides of {la}. This is the way BRKWORDS did it.

If camxes and valfendi give different output on an invalid string (in this
case, {mu stela vison} requires a pause before {vison}, and btw when I try
{muSTEla.VIson} I get {mu stela vison}), that is not necessarily a bug.

phma
--
Mes règles mensuelles ont lieu une fois par an.
-Les Perles de la médecine


On Wednesday 22 December 2004 16:15, Jorge "Llambías" wrote:
> --- Robin Lee Powell wrote:
> > So far as I know, Lojban syllabification has never been clearly
> > defined.
> >
> > I assume that both camxes and valfendi make decisions about this.
> > Is there any way to know if they've made the same decisions?
>
> There is no need to fully define syllabification. All we need
> is to agree on what counts as a syllable core, and we all
> agree about that: Only ai, au, ei, oi, ia, ie, ii, io, iu,
> ua, ue, ui, uo, uu, a, e, i, o, and u count as syllable cores.
> Nothing else. And vowel strings of more than two vowels break
> in pairs from the left, as long as the pair is a valid diphthong.
>
> The only doubt is whether we can use a comma to override
> the left to right pairing. We seem to agree that we cannot.

I say it can, but the only time it matters in valfendi is in a brivla that
ends with a diphthong. If you say {draga,u}, it expects the second A to be
stressed.

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


posts: 14214

On Wed, Dec 22, 2004 at 06:51:36PM -0500, Pierre Abbat wrote:
> If camxes and valfendi give different output on an invalid string
snip
> that is not necessarily a bug.

That's a good point, but in many of these cases I'm going to need
you guys to tell me what is and is not a valid string.

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 04:12:08PM -0800, Robin Lee Powell wrote:
> On Wed, Dec 22, 2004 at 06:51:36PM -0500, Pierre Abbat wrote:
> > If camxes and valfendi give different output on an invalid
> > string
> snip
> > that is not necessarily a bug.
>
> That's a good point, but in many of these cases I'm going to need
> you guys to tell me what is and is not a valid string.

For example, in this case one of you thinks it's invalid, the other
does not:

*** Sentence: muSTElaVIson 1
MISMATCH!
valfendi: >muSTE< -la VIson.
pegbased: -mu (STEla) VIson.

Morphologically invalid, I mean. Both cases are grammatically
invalid.

I'm pretty sure camxes is wrong on this one.

-Robin


posts: 14214

I've got a *lot* of these:

*** Sentence: fi'oricyrAtcu airicyrAtcu ruericyrAtcu ioricyrAtcu 1
MISMATCH!
valfendi: -fi'o (ricyrAtcu) -ai (ricyrAtcu) >rue< (ricyrAtcu) -io (ricyrAtcu)
pegbased: -fi'o (ricyrAtcu) -ai (ricyrAtcu) -rue (ricyrAtcu) -io (ricyrAtcu)

It seems that there are a *bunch* of cases where camxes accepts
cmavo that valfendi does not. I could probably find several hundred
if I tried. You guys should hash that out.
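
Finding such cases in bulk amounts to a token-by-token diff of the two output lines. A rough Python sketch follows; the function name is made up for illustration and this is not the script that produced the MISMATCH! reports. The comparison is positional, so it is only informative when both lexers split the text into the same number of tokens; once one of them splits a word differently, every later position is reported as well.

```python
from itertools import zip_longest


def mismatches(valfendi_line, pegbased_line):
    """Return (position, valfendi_token, pegbased_token) where the lines differ."""
    pairs = zip_longest(valfendi_line.split(), pegbased_line.split())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


v = "-fi'o (ricyrAtcu) -ai (ricyrAtcu) >rue< (ricyrAtcu) -io (ricyrAtcu)"
p = "-fi'o (ricyrAtcu) -ai (ricyrAtcu) -rue (ricyrAtcu) -io (ricyrAtcu)"
print(mismatches(v, p))  # [(4, '>rue<', '-rue')]
```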

-Robin


posts: 14214

Ummm...

*** Sentence: .ui mi facki fi le mi mapku 1
MISMATCH!
valfendi: -ui -mi (facki) -fi -le -mi >mapku<
pegbased: -ui -mi (facki) -fi -le -mi (mapku)

That's a *bug*, Pierre.

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 04:27:24PM -0800, Robin Lee Powell wrote:
> Ummm...
>
> *** Sentence: .ui mi facki fi le mi mapku 1
> MISMATCH!
> valfendi: -ui -mi (facki) -fi -le -mi >mapku<
> pegbased: -ui -mi (facki) -fi -le -mi (mapku)
>
> That's a *bug*, Pierre.

*** Sentence: ti poi ke'a nazbi kapkevna ku'o cu barda 1
MISMATCH!
valfendi: -ti -poi -ke'a (nazbi) >kapkevna< -ku'o -cu (barda)
pegbased: -ti -poi -ke'a (nazbi) (kapkevna) -ku'o -cu (barda)

An aversion to pk, apparently.

-Robin


posts: 1912


> > C. If the fu'ivla-head is followed by a permissible cluster, it may fall
> > off. There is one case where it is saved: if only a single syllable follows
> > the cluster, or if the head has a final stress so that it will accept only
> > one more syllable. In those cases we have a fu'ivla.
>
> There are two cases. The other is that the string following the head is a
> slinku'i, and the fu'ivla-head is not of CV form (which makes a lujvo). E.g.
> {suicmardi} is a fu'ivla.

Ouch! You're very right, I had completely missed that case. I have now
modified fuhivla to:

fuhivla <- !cmene !gismu !lujvo (stressed-fuhivla-head cluster fuhivla-tail /
fuhivla-head cluster stressed-fuhivla-tail) / !CVC-rafsi &cmavo-form syllable
(!consonant syllable)* slinkuhi

The final alternative should take care of that very odd case.

(fuhivla-rafsi was also modified accordingly.)

ki'e mi'e xorxes






posts: 1912


> Round two:
>
> valfendi: -la mer. samo,as.
> pegbased: -la mer. -sa >mo,as<

That's a known difference. valfendi is more permissive with
vowel groups in cmene and fuhivla. If we want to match valfendi
in this regard, we have to eliminate the !a !e etc. at the end
of the vowel rules.

mu'o mi'e xorxes







posts: 1912


> On Wednesday 22 December 2004 16:15, Jorge "Llambías" wrote:

> > The only doubt is whether we can use a comma to override
> > the left to right pairing. We seem to agree that we cannot.
>
> I say it can, but the only time it matters in valfendi is in a brivla that
> ends with a diphthong. If you say {draga,u}, it expects the second A to be
> stressed.

But then, is {DRAga,u} allowed? Is it different from {DRAgau}?

camxes (correctly, I think) rejects {dragA,u}.

mu'o mi'e xorxes






On Wednesday 22 December 2004 19:14, Robin Lee Powell wrote:
> On Wed, Dec 22, 2004 at 04:12:08PM -0800, Robin Lee Powell wrote:
> > On Wed, Dec 22, 2004 at 06:51:36PM -0500, Pierre Abbat wrote:
> > > If camxes and valfendi give different output on an invalid
> > > string
> >
> > snip
> >
> > > that is not necessarily a bug.
> >
> > That's a good point, but in many of these cases I'm going to need
> > you guys to tell me what is and is not a valid string.
>
> For example, in this case one of you thinks it's invalid, the other
> does not:
>
> *** Sentence: muSTElaVIson 1
> MISMATCH!
> valfendi: >muSTE< -la VIson.
> pegbased: -mu (STEla) VIson.
>
> Morphologically invalid, I mean. Both cases are grammatically
> invalid.
>
> I'm pretty sure camxes is wrong on this one.

It's invalid as an encoding of {mu stela vison} because the cmene is preceded
by a brivla without a pause between them. It's invalid as an encoding of
{muste la vison} because the accent is on the wrong syllable.

{kybuladjan} is invalid because {ky} needs a pause after it. Both lexers,
however, lex this as {ky bu la djan} (or so xorxes claims for camxes). The
official rules state that the pause must be between the Cy and the next word
that isn't Cy, but I figured out that it can be between the Cy and the next
word that contains CVV, CV'V, or CCV, so I say {kybu.ladjan}.

{kymoi}, {kybumoi}, {kybumlatu}, {lekymoi}, {lekybumoi}, and {lekybumlatu} are
more phrases with the pause after the lervla missing. valfendi thinks they
all contain brivla, but errors out trying to identify it, except for {ky bu
mlatu}.

phma
--
S Fa1>+/- !TM M-- K H T-- t? AT++ SY Te- SC- FO- D P !Tz E++ L


posts: 1912


> On Wed, Dec 22, 2004 at 04:12:08PM -0800, Robin Lee Powell wrote:
> > On Wed, Dec 22, 2004 at 06:51:36PM -0500, Pierre Abbat wrote:
> > > If camxes and valfendi give different output on an invalid
> > > string
> > snip
> > > that is not necessarily a bug.
> >
> > That's a good point, but in many of these cases I'm going to need
> > you guys to tell me what is and is not a valid string.

I don't think that's right. If the string is invalid, both parsers
should say it is invalid. If they say anything else, it is a bug.

> For example, in this case one of you thinks it's invalid, the other
> does not:
>
> *** Sentence: muSTElaVIson 1
> MISMATCH!
> valfendi: >muSTE< -la VIson.
> pegbased: -mu (STEla) VIson.
>
> Morphologically invalid, I mean. Both cases are grammatically
> invalid.

Make it grammatically valid:

lo'u muSTElaVIson le'u lojbo valsi

> I'm pretty sure camxes is wrong on this one.

I'm not so sure. I'm inclined to say it is not wrong, because the
rules for identifying cmene are purely *morphological*. They should
not rely on identifying the preceding "la" as a gadri. Any syllable
"la" will allow the cmene to skip the initial pause.

mu'o mi'e xorxes






posts: 1912



> I've got a *lot* of these:
>
> *** Sentence: fi'oricyrAtcu airicyrAtcu ruericyrAtcu ioricyrAtcu 1
> MISMATCH!
> valfendi: -fi'o (ricyrAtcu) -ai (ricyrAtcu) >rue< (ricyrAtcu) -io (ricyrAtcu)
> pegbased: -fi'o (ricyrAtcu) -ai (ricyrAtcu) -rue (ricyrAtcu) -io (ricyrAtcu)
>
> It seems that there are a *bunch* of cases where camxes accepts
> cmavo that valfendi does not. I could probably find several hundred
> if I tried. You guys should hash that out.

That's a known difference: camxes allows the y-hyphen after any CVC-rafsi,
whether required or not.

This can probably be changed with some work, but I think this is a
feature, not a bug. In fact, I'm inclined to do the same for the
r/n-hyphen after CVV.

mu'o mi'e xorxes






On Wednesday 22 December 2004 19:27, Robin Lee Powell wrote:
> Ummm...
>
> *** Sentence: .ui mi facki fi le mi mapku 1
> MISMATCH!
> valfendi: -ui -mi (facki) -fi -le -mi >mapku<
> pegbased: -ui -mi (facki) -fi -le -mi (mapku)
>
> That's a *bug*, Pierre.

In pairtable, change the 'p' line to:
/*p*/ " + + +=++ =++ + ",

phma
--
Ils pensent que j'ai un cancer du thé russe...
-Les Perles de la médecine


posts: 1912


> Valfendi is set up like this:
> In gismu: Only a, e, i, o, u.
> In lujvo: ai, au, ei, oi are the only pairs allowed. y allowed as hyphen.
> In fu'ivla: All pairs (a/e/i/o/u)(a/e/i/o/u) are allowed. If the two vowels
> do not form a diphthong, they are assumed to have a comma between them.
> Arbitrarily long vowel strings are allowed.
> In fu'ivla lujvo: All pairs (a/e/i/o/u)(a/e/i/o/u) are allowed. y is allowed
> between consonants and must occur at least once.
> In cmene: All pairs are allowed. iy and uy are diphthongs. If the two vowels
> do not form a diphthong, they are assumed to have a comma between them.

All that makes sense to me: maximally permissive.

> In cmavo: ai, au, ei, oi are allowed in arbitrarily long cmavo.
> (i/u)(a/e/i/o/u) are allowed in two-letter cmavo. No more than two vowels in
> a row are allowed; a cmavo with more than two vowels must contain an
> apostrophe.

This, however, which is maximally restrictive, makes no sense to me
in conjunction with the above rules. If you are maximally permissive
with cmene and fu'ivla I see no reason not to be equally permissive
with cmavo.

I don't have a strong opinion on which way we should go as far as
permissiveness of vowel pairs, but I do think we should be consistent
and not have arbitrary restrictions that apply to one class of words
and not to others, for no apparent reason.
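
To make the quoted cmavo restriction concrete, here is a small Python sketch of it as described in this message. It is not valfendi's actual code and the names are hypothetical. It looks only at vowel groups (a, e, i, o, u), ignores y, and does not check anything else about cmavo shape.

```python
import re

FALLING = {"ai", "au", "ei", "oi"}                  # allowed in any cmavo
RISING = {i + v for i in "iu" for v in "aeiou"}     # allowed only as a whole word


def cmavo_vowels_ok(word):
    """Apply the restrictive cmavo vowel rule described above to one word."""
    word = word.strip(".").lower()
    if word in RISING:
        return True                  # two-letter cmavo such as {ui}, {io}
    for group in re.findall(r"[aeiou]+", word):
        if len(group) > 2:
            return False             # three or more vowels in a row
        if len(group) == 2 and group not in FALLING:
            return False             # e.g. the "ue" in {rue}
    return True


for w in ["fi'o", "ai", "io", "rue", "miau", "mu'ei", "uaiu"]:
    print(w, cmavo_vowels_ok(w))
```

Under these assumptions {rue}, {miau} and {uaiu} are rejected while {fi'o}, {ai}, {io} and {mu'ei} pass, which matches the valfendi behaviour reported in this thread.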

mu'o mi'e xorxes






posts: 1912



> > *** Sentence: muSTElaVIson 1
> > MISMATCH!
> > valfendi: >muSTE< -la VIson.
> > pegbased: -mu (STEla) VIson.
>
> It's invalid as an encoding of {mu stela vison} because the cmene is preceded
> by a brivla without a pause between them.

Is there a rule that says that a cmene can't be preceded by a brivla without
a pause between them? That would be odd, because cmene can practically
never appear after a brivla. Isn't the cmene morphology rule about the
syllable {la} rather than the cmavo {la}? And if it isn't, shouldn't it be?
The morphology should not care about what the words mean, only about their
form.

> {kybuladjan} is invalid because {ky} needs a pause after it. Both lexers,
> however, lex this as {ky bu la djan} (or so xorxes claims for camxes). The
> official rules state that the pause must be between the Cy and the next word
> that isn't Cy, but I figured out that it can be between the Cy and the next
> word that contains CVV, CV'V, or CCV, so I say {kybu.ladjan}.

Right. The official rules are more strict than they need to be here.
Both parsers have a bug with respect to the official rules, but this will
not be a bug with respect to the new official rules if they are approved.

> {kymoi}, {kybumoi}, {kybumlatu}, {lekymoi}, {lekybumoi}, and {lekybumlatu}
> are more phrases with the pause after the lervla missing. valfendi thinks they
> all contain brivla, but errors out trying to identify it, except for {ky bu
> mlatu}.

camxes should give:

ky moi
ky bu moi
ky bu mlatu
lekymoi (= lekmoi)
le ky bu moi
le ky bu mlatu

Again, these are bugs with respect to the official rules, which are
more strict than required for unambiguity. We could force these to be
errors, but that seems pointless.

mu'o mi'e xorxes







Jorge Llambías scripsit:

> The question is: What about cmene that follow a syllable
> {la} that is not a cmavo. Do they have to begin with a pause
> or not?

I guess the answer is that it doesn't matter much, since such a situation
is always going to be a syntax error anyhow.

--
Real FORTRAN programmers can program FORTRAN John Cowan
in any language. --Allen Brown jcowan@reutershealth.com


Jorge Llambías scripsit:

> In cmene and lujvo: iy, uy and yy are added.

I assume you mean cmavo rather than lujvo. "yy" should not be allowed;
it has no defined pronunciation.

> It would be relatively easy to forbid vowel triples everywhere. We just
> add "!(vowel-y vowel-y)" at the end of the vowel rules.

I believe we should do so. This is a break with the past, but a modest one.

> I don't see any reason to make cmene different from cmavo as far
> as vowels are concerned.

The point is that iy and uy are reserved for a morphological mechanism.
Can arbitrary fu'ivla lujvo be constructed using iy after an initial
fu'ivla, iy before and after a medial fu'ivla, and iy before a final
fu'ivla? That was the original design.

As I said before, allowing iy and uy in cmene even when they are reserved
otherwise is safe because cmene are defined backwards.

--
Schlingt dreifach einen Kreis vom dies! John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau, http://www.reutershealth.com
Denn er genoss vom Honig-Tau, http://www.ccil.org/~cowan
Und trank die Milch vom Paradies. — Coleridge (tr. Politzer)


Jorge Llambías scripsit:

> It all depends on how it was supposed to work. fu'ivla-rafsi
> can't be combined with all normal rafsi: the immediately preceding
> one has to be CVCy-, CCVCy-, CVCCy- or another fu'ivla-rafsi.

But when iy is the glue?

--
John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
Reversing the apostolic precept to be all things to all men, I usually before
Darwin
defended the tenability of the received doctrines, when I had to do
with the evolutionists; and stood up for the possibility of evolution among
the orthodox — thereby, no doubt, increasing an already current, but quite
undeserved, reputation for needless combativeness. --T. H. Huxley


posts: 14214

On Wed, Dec 22, 2004 at 04:57:48PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
> > Round two:
> >
> > valfendi: -la mer. samo,as.
> > pegbased: -la mer. -sa >mo,as<
>
> That's a known difference. valfendi is more permissive with vowel
> groups in cmene and fuhivla.

camxes is more permissive in cmene, though.

valfendi: -zoi -fy booz. -fy -co -sa -zoi bar. baz. bar.
pegbased: -zoi -fy >booz< -fy -co -sa -zoi bar. baz. bar.

> If we want to match valfendi in this regard, we have to eliminate
> the !a !e etc. at the end of the vowel rules.

I don't much care what you two agree on, so long as you do.

This may not be possible, of course.

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 05:17:11PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:
>
> > I've got a *lot* of these:
> >
> > *** Sentence: fi'oricyrAtcu airicyrAtcu ruericyrAtcu ioricyrAtcu 1
> > MISMATCH!
> > valfendi: -fi'o (ricyrAtcu) -ai (ricyrAtcu) >rue< (ricyrAtcu) -io (ricyrAtcu)
> > pegbased: -fi'o (ricyrAtcu) -ai (ricyrAtcu) -rue (ricyrAtcu) -io (ricyrAtcu)
> >
> > It seems that there are a *bunch* of cases where camxes accepts
> > cmavo that valfendi does not. I could probably find several hundred
> > if I tried. You guys should hash that out.
>
> That's a known difference: camxes allows the y-hyphen after any
> CVC-rafsi, whether required or not.

That has nothing whatever to do with this case that I can see. This
is about whether rue is a valid cmavo.

-Robin


posts: 14214

How about this:

valfendi: (soindebi) (soindebytsi) (betysoindebytsi) (betysoindebi)
pegbased: (soindebi) (soindebytsi) (betysoindebytsi) -be -ty (soindebi)

-Robin


posts: 1912


> On Wed, Dec 22, 2004 at 04:57:48PM -0800, Jorge Llambías wrote:
> > > valfendi: -la mer. samo,as.
> > > pegbased: -la mer. -sa >mo,as<
> >
> > That's a known difference. valfendi is more permissive with vowel
> > groups in cmene and fuhivla.
>
> camxes is more permissive in cmene, though.
>
> valfendi: -zoi -fy booz. -fy -co -sa -zoi bar. baz. bar.
> pegbased: -zoi -fy >booz< -fy -co -sa -zoi bar. baz. bar.

That's valfendi being permissive again.

camxes is more permissive in cmavo: it should accept {miau}
which I believe valfendi rejects.

mu'o mi'e xorxes








posts: 1912



> How about this:
>
> valfendi: (soindebi) (soindebytsi) (betysoindebytsi) (betysoindebi)
> pegbased: (soindebi) (soindebytsi) (betysoindebytsi) -be -ty (soindebi)

That was a bug in pegbased. medial-rafsi needed a !fuhivla in front.
Hopefully that fixes it.

mu'o mi'e xorxes









posts: 1912


> Jorge Llambías scripsit:
> > The question is: What about cmene that follow a syllable
> > {la} that is not a cmavo. Do they have to begin with a pause
> > or not?
>
> I guess the answer is that it doesn't matter much, since such a situation
> is always going to be a syntax error anyhow.

Not always:

lo'u stela vison le'u cu lojbo valsi

That should not be a syntax error.

I agree it doesn't matter much. I just don't want to call the split
of a cmene after {stela} a bug.

mu'o mi'e xorxes






posts: 1912



> Jorge Llambías scripsit:
>
> > In cmene and lujvo: iy, uy and yy are added.
>
> I assume you mean cmavo rather than lujvo.

Yes.

> "yy" should not be allowed;
> it has no defined pronunciation.

That's a concession to usage. People tend to write yyyyyy
for long hesitations.

We can define that y+ is equivalent to y for all purposes.

> > It would be relatively easy to forbid vowel triples everywhere. We just
> > add "!(vowel-y vowel-y)" at the end of the vowel rules.
>
> I believe we should do so. This is a break with the past, but a modest one.

I tend to agree. valfendi does this for cmavo already, but not
for cmene and fuhivla.
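
For anyone unfamiliar with PEG predicates, the proposed "!(vowel-y vowel-y)" suffix is a negative lookahead, and its effect can be imitated with a Python regular expression. This is only an analogy, not the grammar itself. It assumes that vowel-y means any of a, e, i, o, u or y, and the function name is hypothetical.

```python
import re

# One vowel that is NOT followed by two more vowels, mirroring the
# PEG idea  vowel !(vowel-y vowel-y)  under the assumption stated above.
VOWEL_NOT_STARTING_TRIPLE = re.compile(r"[aeiouy](?![aeiouy]{2})")


def no_vowel_triples(word):
    """True if every vowel in the word passes the lookahead check."""
    word = word.lower()
    return all(
        VOWEL_NOT_STARTING_TRIPLE.match(word, i)
        for i, ch in enumerate(word)
        if ch in "aeiouy"
    )


print(no_vowel_triples("miau"))   # False: "iau" is a triple
print(no_vowel_triples("mu'ei"))  # True: the apostrophe separates the vowels
print(no_vowel_triples("booz"))   # True: only two vowels in a row
```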

> > I don't see any reason to make cmene different from cmavo as far
> > as vowels are concerned.
>
> The point is that iy and uy are reserved for a morphological mechanism.
> Can arbitrary fu'ivla lujvo be constructed using iy after an initial
> fu'ivla, iy before and after a medial fu'ivla, and iy before a final
> fu'ivla? That was the original design.

Yes, for all fuhivla that don't start with a vowel.

(For those that do start with a vowel, we would need to allow
iy+vowel which I don't think is acceptable.)

The design I had in mind was for iy to be used only after fuhivla
to form its rafsi, but I see that allowing it after normal rafsi
would have its advantages too. Another possibility would be to not
allow it after normal rafsi but allow it to give a rafsi to every
cmavo!

> As I said before, allowing iy and uy in cmene even when they are reserved
> otherwise is safe because cmene are defined backwards.

In PEG they are defined forwards, but it doesn't matter anyway.

If iy is allowed after normal rafsi, or to give rafsi to
every cmavo, then it should not be allowed in cmavo.
That's a good reason.

mu'o mi'e xorxes







posts: 1912


> Jorge Llambías scripsit:
>
> > It all depends on how it was supposed to work. fu'ivla-rafsi
> > can't be combined with all normal rafsi: the immediately preceding
> > one has to be CVCy-, CCVCy-, CVCCy- or another fu'ivla-rafsi.
>
> But when iy is the glue?

I hadn't thought of allowing {iy} after normal rafsi.
We should not allow it after {CVC}. I'm not sure
if ambiguities would result, but the whole point of using
the "i" is to attach to a vowel. CVC can take y directly.
(Same for CVCC and CCVC, of course.)

Instead of allowing iy after CVV rafsi, we could allow
it after any cmavo: that way every cmavo that starts with
a consonant could be used in lujvo in non-final position.

As a further refinement, if we don't allow consonant
triples we can use {'iy} as the hyphen after a diphthong:

pavyseljirna = paiyseljirna
pai zei seljirna = pai'iyseljirna

mu'o mi'e xorxes






posts: 1912


New idea: Forget about {-iy-} and use {-'y-} as the
general fuhivla hyphen. That won't interfere with
an eventual no-triple-vowels rule, and it won't
introduce diphthongs where there weren't any.
Also allow {-'y-} to give rafsi to cmavo:

pavyseljirna = pa'yseljirna

pai zei seljirna = pai'yseljirna

iglu zei xabju = iglu'yxa'u

{iy} could be used to give rafsi to cmene, but then
we should not allow it otherwise in cmene.

djan zei zdani = djaniyzda
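
Mechanically, the proposal above is plain concatenation with a fixed hyphen. The Python sketch below illustrates the proposal only; nothing here is an adopted rule, the function name is hypothetical, and the caller is assumed to supply the rafsi of the second element (for example {xa'u} for {xabju}).

```python
def glue(head, tail_rafsi, head_is_cmene=False):
    """Join a fu'ivla or cmavo (or a cmene) to a following rafsi.

    Uses {'y} as the general hyphen and {iy} after a cmene, following
    the proposal in this message.
    """
    hyphen = "iy" if head_is_cmene else "'y"
    return head + hyphen + tail_rafsi


print(glue("pa", "seljirna"))                  # pa'yseljirna
print(glue("pai", "seljirna"))                 # pai'yseljirna
print(glue("iglu", "xa'u"))                    # iglu'yxa'u
print(glue("djan", "zda", head_is_cmene=True)) # djaniyzda
```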

mu'o mi'e xorxes






Jorge Llambías scripsit:

> That's a concession to usage. People tend to write yyyyyy
> for long hesitations.

No problem.

> We can define that y+ is equivalent to y for all purposes.

No, no, no. We do not want people writing selylujvo as selyyyyyyyyyyyyyyyyylujvo.
If you want to say that the word y (or rather .y.) can be written with
multiple y's, fine. But not everywhere in the language!
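
The distinction can be stated as a normalization step: collapse a run of y's only when the run is the entire word. A minimal Python sketch, with a hypothetical function name and not taken from any parser:

```python
import re

# The whole word is one or more y, optionally wrapped in pause dots.
HESITATION = re.compile(r"\.?y+\.?")


def normalize_hesitation(word):
    """Treat yyy... as the single word y; leave every other word untouched."""
    if HESITATION.fullmatch(word):
        return "y"
    return word


print(normalize_hesitation("yyyyyy"))       # y
print(normalize_hesitation(".yyy."))        # y
print(normalize_hesitation("selyyylujvo"))  # selyyylujvo (not normalized)
```

A word like {selyyylujvo} is left unchanged on purpose, so a later lujvo check can still reject it.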

--
John Cowan jcowan@reutershealth.com www.ccil.org/~cowan www.reutershealth.com
Linguistics is arguably the most hotly contested property in the academic
realm. It is soaked with the blood of poets, theologians, philosophers,
philologists, psychologists, biologists and neurologists, along with
whatever blood can be got out of grammarians. - Russ Rymer


Jorge Llambías scripsit:

> New idea: Forget about {-iy-} and use {-'y-} as the
> general fuhivla hyphen. That won't interfere with
> an eventual no-triple-vowels rule, and it won't
> introduce diphthongs where there weren't any.
> Also allow {-'y-} to give rafsi to cmavo:

snip

> {iy} could be used to give rafsi to cmene, but then
> we should not allow it otherwise in cmene.

Even though all this can be made to work, I think we are better off forgetting
about it and sticking to zei. For one thing, you can pause around zei
to catch your breath.

--
Deshil Holles eamus. Deshil Holles eamus. Deshil Holles eamus.
Send us, bright one, light one, Horhorn, quickening, and wombfruit. (3x)
Hoopsa, boyaboy, hoopsa! Hoopsa, boyaboy, hoopsa! Hoopsa, boyaboy, hoopsa!
— Joyce, Ulysses, "Oxen of the Sun" jcowan@reutershealth.com


posts: 1912


> If you want to say that the word y (or rather .y.) can be written with
> multiple y's, fine. But not everywhere in the language!

OK. The way it is written now, only a single y is accepted for hyphens
in lujvo, but any number are accepted in cmavo and cmene.

I'll fix that.

mu'o mi'e xorxes







posts: 1912


>
> Even though all this can be made to work, I think we are better off
> forgetting about it and sticking to zei. For one thing, you can pause around zei
> to catch your breath.

I'm not proposing to eliminate zei but just to add more options.
Something like {'y} would be much easier to use than having to
work out on the fly whether a given fuhivla can have a rafsi or
not.

mu'o mi'e xorxes







posts: 14214

More fun:

*** Sentence: yspatrkomposIta bYspatrkomposIta YspatrkomposIta lOspatrkomposIta 1
MISMATCH!
valfendi: -y (spatrkomposIta) -bY (spatrkomposIta) -Y (spatrkomposIta) -lO (spatrkomposIta)
pegbased: -y >spatrkomposIta< -bY >spatrkomposIta< -Y >spatrkomposIta< (lOspa) >trkomposIta<

*** Sentence: jAIspatrkomposIta lO'espatrkomposIta mu'eispatrkomposIta uaiuspatrkomposIta 1
MISMATCH!
valfendi: -jAI (spatrkomposIta) -lO'e (spatrkomposIta) -mu'ei (spatrkomposIta) >uaiu< (spatrkomposIta)
pegbased: (jAIspa) >trkomposIta< -lO'e >spatrkomposIta< (mu'eispatrkomposIta) (uaiuspatrkomposIta)

*** Sentence: spatrliliAce ispatrliliAce lespatrliliAce rauspatrliliAce 1
MISMATCH!
valfendi: (spatrliliAce) -i (spatrliliAce) -le (spatrliliAce) -rau (spatrliliAce)
pegbased: >spatrliliAce< (ispatrliliAce) (lespatrliliAce) (rauspatrliliAce)

-Robin


posts: 14214

On Wed, Dec 22, 2004 at 05:40:20PM -0800, Jorge Llambías wrote:
> > {kymoi}, {kybumoi}, {kybumlatu}, {lekymoi}, {lekybumoi}, and
> > {lekybumlatu} are more phrases with the pause after the lervla
> > missing. valfendi thinks they all contain brivla, but errors out
> > trying to identify it, except for {ky bu mlatu}.
>
> camxes should give:
>
> ky moi
> ky bu moi
> ky bu mlatu
> lekymoi (= lekmoi)
> le ky bu moi
> le ky bu mlatu

$ echo "kymoi kybumoi kybumlatu lekymoi lekybumoi and lekybumlatu" | myparser -m
Simple morphological breakdown requested.
Processing /dev/stdin ...
Morphology pass:
text
|- CMAVO
| BY: ky
|- CMAVO
| MOI: moi
|- spaces:
|- CMAVO
| BY: ky
|- CMAVO
| BU: bu
|- CMAVO
| MOI: moi
|- spaces:
|- CMAVO
| BY: ky
|- CMAVO
| BU: bu
|- BRIVLA
| gismu: mlatu
|- spaces:
|- BRIVLA
| lujvo: lekymoi
|- spaces:
|- CMAVO
| LE: le
|- CMAVO
| BY: ky
|- CMAVO
| BU: bu
|- CMAVO
| MOI: moi
|- spaces:
|- CMENE
| cmene: and
|- spaces:
|- CMAVO
| LE: le
|- CMAVO
| BY: ky
|- CMAVO
| BU: bu
|- BRIVLA
gismu: mlatu

-Robin


On Thursday 23 December 2004 16:49, Robin Lee Powell wrote:
> More fun:
>
> *** Sentence: yspatrkomposIta bYspatrkomposIta YspatrkomposIta lOspatrkomposIta 1
> MISMATCH!
> valfendi: -y (spatrkomposIta) -bY (spatrkomposIta) -Y (spatrkomposIta) -lO (spatrkomposIta)
> pegbased: -y >spatrkomposIta< -bY >spatrkomposIta< -Y >spatrkomposIta< (lOspa) >trkomposIta<

Breaking {lospa} off is not necessarily a bug (valfendi already knows that
there's a brivla, not a cmene, in this text and sees that "trk" cannot begin
anything but a cmene, so it calls "lO" a secondary stress), but
{spatrkomposita} is a valid type-4.

phma
--
Ils pensent que j'ai un cancer du thé russe...
-Les Perles de la médecine


posts: 1912


> *** Sentence: yspatrkomposIta bYspatrkomposIta YspatrkomposIta
> pegbased: -y >spatrkomposIta< -bY >spatrkomposIta< -Y >spatrkomposIta<
> (lOspa) >trkomposIta<

I think it's fixed, but some rules look more ugly than they need to,
so I will probably be tinkering some more with it.

mi'e xorxes







posts: 14214

On Thu, Dec 23, 2004 at 09:49:49AM -0800, Jorge Llambías wrote:
>
> --- John Cowan wrote:
> > If you want to say that the word y (or rather .y.) can be
> > written with multiple y's, fine. But not everywhere in the
> > language!
>
> OK. The way it is written now, only a single y is accepted for
> hyphens in lujvo, but any number are accepted in cmavo and cmene.
>
> I'll fix that.

You broke it! :-)

$ echo "yyy" | myparser -m
text
|- nonLojbanWord: yy
|- spaces: y

-Robin


posts: 10

>> All brivla have the following properties:
>> 1) always end in a vowel;
>> 2) always contain a consonant pair in the first five letters, where "y"
>> and apostrophe are not counted as letters for this purpose;
>> 3) always are stressed on the next-to-last (penultimate) syllable; this
>> implies that they have two or more syllables.
>>
>> I always assumed this to be definitive, rather than descriptive:
>
> That is true (I changed 2 slightly to "its second consonant is always
> part of a cluster", but the point is the same). Those are properties
> that all brivla have, but not everything with those properties is
> a brivla.

I think I see. The above is the definition of "brivla-form" (not "brivla").
A word that matches "brivla-form" may be a gismu, lujvo, fu'ivla, or
invalid. There are some simple validity tests that can be applied to a
brivla-form that do not require characterization, but full validation can't
be done without characterization.

But I still don't see why a word-partitioning parser, which has to deal with
more word-forms and therefore needs additional complexity, couldn't use the
simple brivla-form definition, and thereby reduce point-complexity.
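
As a concrete reading of the three quoted properties, a brivla-form test might look like the Python sketch below. The function name is hypothetical. Syllables are approximated by counting vowel runs, and stress is not checked at all because plain text does not reliably mark it. A True result only means the string has brivla form; it does not establish that the string is a single valid brivla.

```python
import re

VOWELS = "aeiou"


def is_brivla_form(word):
    """Rough check of the quoted brivla properties (stress is not checked)."""
    w = word.lower().replace(",", "")
    # 1) must end in a vowel
    if not w or w[-1] not in VOWELS:
        return False
    # 2) consonant pair within the first five letters, not counting y and '
    letters = [c for c in w if c not in "y'"][:5]
    has_pair = any(
        a not in VOWELS and b not in VOWELS
        for a, b in zip(letters, letters[1:])
    )
    # 3) penultimate stress implies at least two syllables (approximated)
    syllables = len(re.findall(r"[aeiouy]+", w))
    return has_pair and syllables >= 2


for w in ["mlatu", "ricyratcu", "tosmabru", "le", "vison"]:
    print(w, is_brivla_form(w))
```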

Clark



posts: 1912


> I think I see. The above is the definition of "brivla-form" (not "brivla").
> A word that matches "brivla-form" may be a gismu, lujvo, fu'ivla, or
> invalid.

It may also be a cmavo+brivla ("tosmabru").

> There are some simple validity tests that can be applied to a
> brivla-form that do not require characterization, but full validation can't
> be done without characterization.

Right. And you can't be sure that you don't have a cmavo hiding there
either if you don't know what a lujvo is yet.

> But I still don't see why a word-partitioning parser, which has to deal with
> more word-forms and therefore needs additional complexity, couldn't use the
> simple brivla-form definition, and thereby reduce point-complexity.

There is no way that I know to partition a string into cmavo, brivla and
cmene without working out the details of brivla. If your string begins
with CV(V), you don't know if that is a cmavo or part of a brivla unless
you can figure out the brivla.

mu'o mi'e xorxes







posts: 1912


> On Thu, Dec 23, 2004 at 09:49:49AM -0800, Jorge Llambías wrote:
> >
> > --- John Cowan wrote:
> > > If you want to say that the word y (or rather .y.) can be
> > > written with multiple y's, fine. But not everywhere in the
> > > language!
> >
> > OK. The way it is written now, only a single y is accepted for
> > hyphens in lujvo, but any number are accepted in cmavo and cmene.
> >
> > I'll fix that.
>
> You broke it! :-)
>
> $ echo "yyy" | myparser -m
> text
> |- nonLojbanWord: yy
> |- spaces: y

Now I fixed what I broke while fixing the other part,
hopefully without breaking anything else. :-)

mi'e xorxes







posts: 14214

Please check the last few changes I made. It wasn't compiling. I'm
particularly unsure about the change to h.

-Robin



posts: 14214

On Fri, Dec 24, 2004 at 12:03:38PM -0800, Robin Lee Powell wrote:
> Please check the last few changes I made. It wasn't compiling.
> I'm particularly unsure about the change to h.

I'm a bit worried about the fuhivla-rafsi-C change too.

-Robin


posts: 14214

On Fri, Dec 24, 2004 at 12:05:33PM -0800, Robin Lee Powell wrote:
> On Fri, Dec 24, 2004 at 12:03:38PM -0800, Robin Lee Powell wrote:
> > Please check the last few changes I made. It wasn't compiling.
> > I'm particularly unsure about the change to h.
>
> I'm a bit worried about the fuhivla-rafsi-C change too.

Current behaviour:

*** Sentence: la kolombias 1
MISMATCH!
valfendi: -la kolombias.
pegbased: -la -ko -lo >mbias<

-Robin


posts: 14214

On Fri, Dec 24, 2004 at 12:06:41PM -0800, Robin Lee Powell wrote:
> On Fri, Dec 24, 2004 at 12:05:33PM -0800, Robin Lee Powell wrote:
> > On Fri, Dec 24, 2004 at 12:03:38PM -0800, Robin Lee Powell wrote:
> > > Please check the last few changes I made. It wasn't compiling.
> > > I'm particularly unsure about the change to h.
> >
> > I'm a bit worried about the fuhivla-rafsi-C change too.
>
> Current behaviour:
>
> *** Sentence: la kolombias 1
> MISMATCH!
> valfendi: -la kolombias.
> pegbased: -la -ko -lo >mbias<

Oh, this is bad:

*** Sentence: cinkrxomoptErata cinkrxomoptEragau cinkrxomoptEramu'ei cinkrxomoptEracu'i 1
MISMATCH!
valfendi: (cinkrxomoptEra) -ta (cinkrxomoptEra) -gau (cinkrxomoptEra) -mu'ei (cinkrxomoptEra) -cu'i
pegbased: (cinkrxomoptEra) -ta (cinkrxomoptEra) -gau (cinkrxomoptEra) -mu >'ei< (cinkrxomoptEra) -cu'i

-Robin


posts: 1912


> > > > Please check the last few changes I made. It wasn't compiling.

I made a number of changes and missed a few things, sorry.
But I think it's becoming more readable.

> > > > I'm particularly unsure about the change to h.

Hmmm, we still need to allow y after ' for y'y. I forgot to change
that when I eliminated vowel-y.

Probably the & is not needed after h, but it won't hurt for now.

> > > The fuhivla-rafsi-C change I'm a bit worried about too.

That was correct, thanks.

> > Current behaviour:
> >
> > *** Sentence: la kolombias 1
> > MISMATCH!
> > valfendi: -la kolombias.
> > pegbased: -la -ko -lo >mbias<

That was kind of on purpose. I'm disallowing i/u vowel except
initially, as per John's suggestion. I'm not sure that's such
a good idea.

> Oh, this is bad:
>
> *** Sentence: cinkrxomoptErata cinkrxomoptEragau cinkrxomoptEramu'ei
> cinkrxomoptEracu'i 1
> MISMATCH!
> valfendi: (cinkrxomoptEra) -ta (cinkrxomoptEra) -gau (cinkrxomoptEra) -mu'ei
> (cinkrxomoptEra) -cu'i
> pegbased: (cinkrxomoptEra) -ta (cinkrxomoptEra) -gau (cinkrxomoptEra) -mu
> >'ei< (cinkrxomoptEra) -cu'i

Is it still doing it? The cmavo and vowels section was where I was working
and I made several saves.

mu'o mi'e xorxes








posts: 149

Jorge Llambías scripsit:

> That was kind of on purpose. I'm disallowing i/u vowel except
> initially, as per John's suggestion. I'm not sure that's such
> a good idea.

*scratches head*


I think I said that iV and uV (V = a,e,i,o,u) should only appear in
cmavo by themselves (ua but not *kua), whereas they can appear anywhere
in cmene or fu'ivla.

--
Income tax, if I may be pardoned for saying so, John Cowan
is a tax on income. --Lord Macnaghten (1901) cowan@ccil.org


posts: 1912


> Jorge Llambías scripsit:
>
> > That was kind of on purpose. I'm disallowing i/u vowel except
> > initially, as per John's suggestion. I'm not sure that's such
> > a good idea.
>
> *scratches head*
>
> I think I said that iV and uV (V = a,e,i,o,u) should only appear in
> cmavo by themselves (ua but not *kua), whereas they can appear anywhere
> in cmene or fu'ivla.

OK, I probably took something you said about cmavo to be about all
words. I don't understand the rationale for being so strict with
cmavo but not so with fu'ivla and cmene. It can't be about
pronounceability because while you disallow {kua broda}
you allow the fu'ivla {kuabroda} which is pronounced identically.

I'm going back to allowing (i/u) vowel as syllable core, including
in cmavo, unless there is a convincing reason to forbid this. It's
not hard to make the formal rule for cmavo different than for fu'ivla
and cmene, but it complicates matters for the user: you have to
remember that some CVV sequences will break away from a following
lujvo and some won't, for example.

mu'o mi'e xorxes






posts: 14214

On Fri, Dec 24, 2004 at 12:34:04PM -0800, Jorge Llambías wrote:
> > Oh, this is bad:
> >
> > *** Sentence: cinkrxomoptErata cinkrxomoptEragau cinkrxomoptEramu'ei
> > cinkrxomoptEracu'i 1
> > MISMATCH!
> > valfendi: (cinkrxomoptEra) -ta (cinkrxomoptEra) -gau (cinkrxomoptEra) -mu'ei
> > (cinkrxomoptEra) -cu'i
> > pegbased: (cinkrxomoptEra) -ta (cinkrxomoptEra) -gau (cinkrxomoptEra) -mu
> > >'ei< (cinkrxomoptEra) -cu'i
>
> Is it still doing it?

No, although now it is very much unlike valfendi's answer:

text
|- nonLojbanWord: cinkrxomoptErata
|- spaces:
|- nonLojbanWord: cinkrxomoptEragau
|- spaces:
|- nonLojbanWord: cinkrxomoptEramu'ei
|- spaces:
|- nonLojbanWord: cinkrxomoptEracu'i

-Robin


posts: 14214

camxes is really messed up:

*** Sentence: zo'onai li'a 1
MISMATCH!
valfendi: -zo'o -nai -li'a
pegbased: -zo'o -nai >li'a<

*** Sentence: ze'epuku da pinxe lo xalka ze'a le nunjbosla 1
MISMATCH!
valfendi: -ze'e -pu -ku -da (pinxe) -lo (xalka) -ze'a -le (nunjbosla)
pegbased: -ze'e >puku< -da (pinxe) -lo (xalka) -ze'a -le (nunjbosla)

*** Sentence: zi xruti le zdani 1
MISMATCH!
valfendi: -zi (xruti) -le (zdani)
pegbased: >zi< >xruti< -le >zdani<

-Robin


posts: 149

Jorge Llambías scripsit:

> I'm going back to allowing (i/u) vowel as syllable core, including
> in cmavo, unless there is a convincing reason to forbid this. It's
> not hard to make the formal rule for cmavo different than for fu'ivla
> and cmene, but it complicates matters for the user: you have to
> remember that some CVV sequences will break away from a following
> lujvo and some won't, for example.

I think you've given precisely the convincing reason. There is no shortage of
experimental cmavo, and complicating things for fu'ivla-creators is a Bad
Thing, as they are already complicated enough.

--
Long-short-short, long-short-short / Dactyls in dimeter,
Verse form with choriambs / (Masculine rhyme): cowan@ccil.org
One sentence (two stanzas) / Hexasyllabically http://www.reutershealth.com
Challenges poets who / Don't have the time. --robison who's at texas dot net


posts: 1912


> |- nonLojbanWord: cinkrxomoptErata
> |- spaces:
> |- nonLojbanWord: cinkrxomoptEragau
> |- spaces:
> |- nonLojbanWord: cinkrxomoptEramu'ei
> |- spaces:
> |- nonLojbanWord: cinkrxomoptEracu'i

Yes, when I put the "i/u vowel" back I messed up. The problem was
with the "i", cenkrxomoptErata parses correctly. I think I've fixed
that now.

mu'o mi'e xorxes







posts: 1912


> I think you've given precisely the convincing reason. There is no shortage
> of experimental cmavo, and complicating things for fu'ivla-creators is a Bad
> Thing, as they are already complicated enough.

We seem to have different ideas about what is complicated.
Allowing {kuabroda} but not {kaubroda} as a fu'ivla is
more complicated than saying that both cases are disallowed
for the same reason: they both break as cmavo + gismu.

Pierre, does valfendi accept {kuabroda} as a fu'ivla?

mu'o mi'e xorxes








posts: 1912


Elidable terminators are sometimes required for disambiguation, but they are always allowed, even when not required.

Pauses between words are sometimes required for disambiguation, but they are always allowed, even when not required.

Marking stress with caps is sometimes required for disambiguation, but it is always allowed, even when not required.

Using long rafsi in lujvo instead of the short ones is sometimes required due to morphology constraints, but it is always allowed, even when not required.

There seems to be a pattern there. What about hyphens?

y- and r-hyphens after CVC and CVV rafsi are sometimes required due to morphology constraints, but they are always allowed, even when not required... NOT! When not required they are not allowed!

This seems to go against the Lojban way of doing things, and it is also a burden for the user. You get used to a lujvo like {tosymabru} and when you try to form a new lujvo by adding an additional rafsi, say {naltosymabru}, it turns out it is not valid: it has to be {naltosmabru}. You get used to {ro'inre'o} and when you form {braro'inre'o} it turns out that's not a lujvo, either.

In the PEG grammar, I allowed -y- after any CVC, not only when it is required, mainly because that was easier than disallowing it when it wasn't necessary. It doesn't seem to be worth complicating the grammar for such an unnecessary and bothersome restriction.

I am now allowing the r-hyphen after any non-final CVV as well, because that is more user-friendly. In this case there is a small cost: we take some forms that would otherwise be fu'ivla and put them in lujvo-space, but lujvo have always had priority over fu'ivla so that's not a big deal (and it is not a noticeable chunk of fu'ivla-space anyway).

These are also fairly common mistakes people make when creating lujvo, so this move is actually supported by usage.

mu'o mi'e xorxes


On Saturday 25 December 2004 18:36, Jorge "Llambías" wrote:
> We seem to have different ideas about what is complicated.
> Allowing {kuabroda} but not {kaubroda} as a fu'ivla is
> more complicated than saying that both cases are disallowed
> for the same reason: they both break as cmavo + gismu.
>
> Pierre, does valfendi accept {kuabroda} as a fu'ivla?

No. It breaks {kua} off and calls it an error.
>kua< (broda)

phma
--
Ils pensent que j'ai un cancer du thé russe...
-Les Perles de la médecine


posts: 149

wikidiscuss@lojban.org scripsit:

> In the PEG grammar, I allowed -y- after any CVC, not only when it is
> required, mainly because that was easier than disallowing it when it
> wasn't necessary. It doesn't seem to be worth complicating the grammar
> for such an unnecessary and bothersome restriction.
>
> I am now allowing the r-hyphen after any non-final CVV as well, because
> that is more user-friendly. In this case there is a small cost: we take
> some forms that would otherwise be fu'ivla and put them in lujvo-space,
> but lujvo have always had priority over fu'ivla so that's not a big deal
> (and it is not a noticeable chunk of fu'ivla-space anyway).

I actually support this, somewhat reluctantly, because we have already
so many different allo-lexes for lujvo, what's a few more — and it does
make errors less likely.

--
Said Agatha Christie / To E. Philips Oppenheim John Cowan
"Who is this Hemingway? / Who is this Proust? cowan@ccil.org
Who is this Vladimir / Whatchamacallum, http://www.reutershealth.com
This neopostrealist / Rabble?" she groused. http://www.ccil.org/cowan
--author unknown to me; any suggestions?


posts: 14214

Still:

*** Sentence: muSTEl,aVIson 1
MISMATCH!
valfendi: >muSTE< -l,a VIson.
pegbased: -mu (STEl,a) VIson.

-Robin



On Sunday 26 December 2004 03:13, Robin Lee Powell wrote:
> Still:
>
>
> *** Sentence: muSTEl,aVIson 1
> MISMATCH!
> valfendi: >muSTE< -l,a VIson.
> pegbased: -mu (STEl,a) VIson.

What does camxes do with {mustelAvison}?

phma
--
Sans lunettes, je ne distingue même pas les odeurs...
-Les Perles de la médecine


posts: 1912



> What does camxes do with {mustelAvison}?

{mu steLAvi son}

Also {mustelalalatiTAtuvison} gives:

{mu stelalalatiTAtu vison}

camxes works left to right, so if it doesn't see a cmene to its
right it looks for other things. I suppose I could force requiring
a pause before a cmene that does not follow doi/la/lai/la'i, but
there doesn't seem to be any point to doing that.

mu'o mi'e xorxes








posts: 1912


valfendi and camxes will also probably differ in how they
handle things like: {zoi STElamrtteladjan STEla}
and {zoi djan STElamrtteladjan}.

camxes will parse the first one as {zoi STEla >mrtteladjan< STEla}
while the second one needs a closing delimiter {.djan.} for zoi.

I suspect valfendi won't like the first one, but will parse the
second one as {zoi djan STElamrtte la djan}.

These are very weird cases. I'm not sure how we should handle strings
that contain non-words but are connected, to the left or to the right,
without any spaces, with what seem to be lojban words. Are we allowed
to break such things from the left (as camxes does) or from the right
(as valfendi does)? Or should we require non-words to absorb anything
not separated with pauses?

mu'o mi'e xorxes







On Sunday 26 December 2004 12:39, Jorge "Llambías" wrote:
> valfendi and camxes will also probably differ in how they
> handle things like: {zoi STElamrtteladjan STEla}
> and {zoi djan STElamrtteladjan}.
>
> camxes will parse the first one as {zoi STEla >mrtteladjan< STEla}
> while the second one needs a closing delimiter {.djan.} for zoi.
>
> I suspect valfendi won't like the first one, but will parse the
> second one as {zoi djan STElamrtte la djan}.

valfendi currently lexes them as "-zoi >STElamrtte< -la djan. (STEla)" and
"-zoi djan. >STElamrtte< -la djan.". I haven't written any magic word
handling yet, but what I think it will do is reject the first (because
"stelamrtte" isn't a Lojban word) and reject the second (because finding a
{zoi} and delimiter turns off lexing until it sees the delimiter again after
a pause).

phma
--
Mes règles mensuelles ont lieu une fois par an.
-Les Perles de la médecine


On Sunday 26 December 2004 11:46, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > What does camxes do with {mustelAvison}?
>
> {mu steLAvi son}
>
> Also {mustelalalatiTAtuvison} gives:
>
> {mu stelalalatiTAtu vison}

valfendi says {muste LA vison} and {mu stelala la tiTAtuvison}.

> camxes works left to right, so if it doesn't see a cmene to its
> right it looks for other things. I suppose I could force requiring
> a pause before a cmene that does not follow doi/la/lai/la'i, but
> there doesn't seem to be any point to doing that.

I think we should just say that if a string is valid except for a required
pause, the two lexers may give valid output, even different valid output. You
may want to list where you allow omitting pauses that the official rules
require.

phma
--
GCS/M d- s-: a+ C++ UL++++$ P+ L+++ E- W+++ N+ o? K? w-- O? M- V- Y++
PGP++ t- 5? X? R- !tv b++ DI !D G e++ h+>---- r- y>+++


posts: 1912


> > > What does camxes do with {mustelAvison}?
> > {mu steLAvi son}
> > Also {mustelalalatiTAtuvison} gives:
> > {mu stelalalatiTAtu vison}
>
> valfendi says {muste LA vison} and {mu stelala la tiTAtuvison}.

So it will hear the brivla {muste} and {mustelala} even when they
are not penultimately stressed?

> > camxes works left to right, so if it doesn't see a cmene to its
> > right it looks for other things. I suppose I could force requiring
> > a pause before a cmene that does not follow doi/la/lai/la'i, but
> > there doesn't seem to be any point to doing that.
>
> I think we should just say that if a string is valid except for a required
> pause, the two lexers may give valid output, even different valid output.

That's more or less what happens now, yes.

> You
> may want to list where you allow omitting pauses that the official rules
> require.

What I would like is for camxes to agree exactly with the official
rules, either by modifying camxes or by modifying the official rules,
whichever we agree is better.

I have now added !non-lojban-word at the end of post-word.
This means that camxes will require a pause before and after
any non-lojban word. It won't break {mutt} as {mu >tt<}
(which is what it was doing) nor will it break {tteladjan}
as {>tte< la djan} which is what valfendi does.
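
For readers unfamiliar with PEG's `!x` predicate, here is a toy sketch of what a negative lookahead like `!non-lojban-word` buys. It is not the camxes rule: the one-consonant-one-vowel cmavo shape, the "bare consonant run" stand-in for a non-Lojban word, and the names `PERMISSIVE`, `STRICT` and `lex_chunk` are all assumptions made up for the illustration, with Python's `(?!...)` standing in for PEG's `!`.

```python
import re

CONS = "bcdfgjklmnprstvxz"
VOWS = "aeiou"

# Toy cmavo shape: one consonant plus one vowel (real cmavo are richer).
CMAVO = f"[{CONS}][{VOWS}]"

# No lookahead: peel a cmavo off the front whatever follows it.
PERMISSIVE = re.compile(CMAVO)

# Negative lookahead, standing in for PEG's !non-lojban-word: refuse the
# cmavo when what follows is a bare consonant run up to the next pause.
STRICT = re.compile(f"{CMAVO}(?![{CONS}]+(?:\\s|$))")


def lex_chunk(chunk, pattern):
    """Split one pause-delimited chunk into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(chunk):
        m = pattern.match(chunk, pos)
        if m is None:
            # Nothing word-like starts here: the rest is a non-word.
            tokens.append(("non-lojban", chunk[pos:]))
            break
        tokens.append(("cmavo", m.group()))
        pos = m.end()
    return tokens


print(lex_chunk("mutt", PERMISSIVE))
print(lex_chunk("mutt", STRICT))
```

With the permissive pattern {mutt} comes apart as {mu >tt<}; with the lookahead the cmavo is refused and the whole chunk stays a single non-word, which is the behaviour described above.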

Now I have to figure out how to impose the rule that a pause
is required before a cmene unless preceded by the *words*
doi/la/lai/la'i, not just the syllables or a brivla containing
those syllables, which is the rule it follows now.

mu'o mi'e xorxes






posts: 953

On Sun, 26 Dec 2004, Jorge Llambías wrote:

>
> valfendi and camxes will also probably differ in how they
> handle things like: {zoi STElamrtteladjan STEla}
> and {zoi djan STElamrtteladjan}.
>
> camxes will parse the first one as {zoi STEla >mrtteladjan< STEla}
> while the second one needs a closing delimiter {.djan.} for zoi.
>
> I suspect valfendi won't like the first one, but will parse the
> second one as {zoi djan STElamrtte la djan}.

IIRC there MUST be a pause before and after delimiter words, so it
shouldn't accept any of them.

--
Arnt Richard Johansen http://arj.nvg.org/
Let's have some real examples from a real, non-English language.


On Sunday 26 December 2004 14:34, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > valfendi says {muste LA vison} and {mu stelala la tiTAtuvison}.
>
> So it will hear the brivla {muste} and {mustelala} even when they
> are not penultimately stressed?

It breaks before {la}, and finding no stress in the piece assumes that the
brivla ends at the piece's end. If the stress is ultimate, as in {musTE}, it
calls it an error.

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


posts: 14214
*** Sentence: la BALtazar. cu me le ci nolraitru 1
MISMATCH!
valfendi: -la BALtazar. -cu -me -le -ci (nolraitru)
pegbased: -la (BALta) zar. -cu -me -le -ci (nolraitru)

-Robin


posts: 14214

Here's the list of differences between valfendi and camxes:

http://www.teddyb.org/~rlpowell/media/regular/morph-auto-test.out.txt

I'll try to keep it up to date as both programs progress. It's
12425 lines.

In an ideal universe, xorxes and Pierre (and Nora, I suppose) would
get together and decide which of these were errors and which were
acceptable differences due to different styles of processing, out of
which would emerge a unified morphology.

Then those differences that remain would be mailed to me so I could
mark them for regression testing purposes.
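
A minimal sketch of how such a regression check could be wired up, assuming each lexer can be wrapped as a function from a sentence to an output string. The names `format_report` and `unexpected_mismatches` and the `accepted` set are hypothetical; this is not the script that produced morph-auto-test.out.txt, it only mimics the report layout seen in this thread.

```python
def format_report(sentence, outputs):
    """Render one test case in the report layout used in this thread.

    `outputs` maps a lexer name to the string that lexer produced.
    """
    lines = [f"*** Sentence: {sentence}"]
    if len(set(outputs.values())) > 1:
        lines.append("MISMATCH!")
    lines.extend(f"{name}: {out}" for name, out in outputs.items())
    return "\n".join(lines)


def unexpected_mismatches(sentences, lexers, accepted=()):
    """Return reports for sentences on which the lexers disagree.

    `lexers` maps a name to a callable taking a sentence and returning a
    string. Sentences listed in `accepted` are agreed, known differences
    and are skipped, so only new disagreements are reported.
    """
    reports = []
    for sentence in sentences:
        if sentence in accepted:
            continue
        outputs = {name: lex(sentence) for name, lex in lexers.items()}
        if len(set(outputs.values())) > 1:
            reports.append(format_report(sentence, outputs))
    return reports
```

Once a difference has been judged acceptable, adding its sentence to `accepted` keeps it out of later reports, so a rerun only shows regressions.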

-Robin



posts: 14214

On Tue, Dec 21, 2004 at 05:24:50PM -0800, Clark & Janiece Nelson
wrote:
> >>For the sake of modularity and reducing point-complexity, I
> >>think it would be worth considering splitting the job into its
> >>components, and writing separate grammars:
> >
> >The problem with this is that we could argue for hours over where
> >the separations lie. I was vehemently opposed to separating out
> >the morphology from the rest of the grammar in the first place,
> >in fact.
>
> Well, of course if one (very influential) participant is "vehemently
> opposed" to any separation, then any proposal for separation would
> necessarily either be rejected immediately, or result in hours of
> argument. :-)

Indeed.

I feel it's worth stating *why* I'm opposed.

I don't want people to need to understand the divisions we're
creating to try to understand how the language works. I think that
even "morphology" versus "grammar" is artificial and arbitrary, and
I don't think people should have to go to two places to get their
questions answered.

It's not really all that important, though.

-Robin


On Monday 27 December 2004 02:01, Robin Lee Powell wrote:
> Here's the list of differences between valfendi and camxes:
>
> http://www.teddyb.org/~rlpowell/media/regular/morph-auto-test.out.txt

First thing I see is that camxes is trying to lex lines beginning with number
signs, which in my test file are comments. Second is that you should be
running valfendi with the -a option, so that it would recognize cmevla such
as {laus} and {doi'as}.

{mi benji le brablolai la laus} is a sentence I made up to try to confuse the
lexer. It seems to have succeeded. :-)

More later. Time to sleep.

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


posts: 14214

On Mon, Dec 27, 2004 at 02:16:11AM -0500, Pierre Abbat wrote:
> On Monday 27 December 2004 02:01, Robin Lee Powell wrote:
> > Here's the list of differences between valfendi and camxes:
> >
> > http://www.teddyb.org/~rlpowell/media/regular/morph-auto-test.out.txt
>
> First thing I see is that camxes is trying to lex lines beginning
> with number signs, which in my test file are comments.

Yeah, I know. I figured it would be a test case like any other, but
it looks like valfendi actually *drops* lines that start with #.

That's, umm, a rather idiosyncratic piece of behaviour.

> Second is that you should be running valfendi with the -a option,
> so that it would recognize cmevla such as {laus} and {doi'as}.

Fixed. Re-running.

-Robin


posts: 14214
*** Sentence: coi kOnsept 1
MISMATCH!
valfendi: -coi kOnsept.
pegbased: -coi (kOnse) pt.

*** Sentence: doi kOnsept coi pado 1
MISMATCH!
valfendi: -doi kOnsept. -coi -pa -do
pegbased: -doi (kOnse) pt. -coi -pa -do

-Robin



posts: 1912



> *** Sentence: la BALtazar. cu me le ci nolraitru 1
> MISMATCH!
> valfendi: -la BALtazar. -cu -me -le -ci (nolraitru)
> pegbased: -la (BALta) zar. -cu -me -le -ci (nolraitru)

I didn't have !cmene at the beginning of gismu, because
it wasn't needed for the words rule. That should be fixed.

I also added !cmene at the end of gismu, lujvo and fuhivla.
This will force a pause before any cmene not preceded by
the *cmavo* doi/la/lai/la'i, not just the syllables.

mu'o mi'e xorxes






On Monday 27 December 2004 02:25, Robin Lee Powell wrote:
> On Mon, Dec 27, 2004 at 02:16:11AM -0500, Pierre Abbat wrote:
> > First thing I see is that camxes is trying to lex lines beginning
> > with number signs, which in my test file are comments.
>
> Yeah, I know. I figured it would be a test case like any other, but
> it looks like valfendi actually *drops* lines that start with #.
>
> That's, umm, a rather idiosyncratic piece of behaviour.

So how should comments in the test file be indicated?

phma
--
..i toljundi do .ibabo mi'afra tu'a do
..ibabo damba do .ibabo do jinga
..icu'u la ma'atman.


posts: 1912


> {mi benji le brablolai la laus} is a sentence I made up to try to confuse the
> lexer. It seems to have succeeded. :-)

camxes has no problem with {brablolai lalaus} or {braBLOlailalaus}.

But {brablolailalaus} is not a valid word. It is not a cmene
because it contains forbidden syllables. It does not start with
a brivla because there is no stressed syllable followed by an
unstressed syllable. I am not taking "la + cmene" as an indicator
of stress two syllables back. Should we add that as another
official way to represent stress in writing? The only stress
representations I accept currently are caps and a following
syllable followed by a space.

mu'o mi'e xorxes






posts: 2388


<rlpowell@digitalkingdom.org> wrote:

> I don't want people to need to understand the divisions we're
> creating to try to understand how the language works. I think that
> even "morphology" versus "grammar" is artificial and arbitrary, and
> I don't think people should have to go to two places to get their
> questions answered.

I assume you mean that the distinction between
which parts of a parsing process are counted as
dealing with morphology and which with grammar
(syntax?) is arbitrary. The distinction between
morphology and syntax is at least a whole lot
less arbitrary.
>
> It's not really all that important, though.
>
> -Robin
>
>
>



posts: 1912

*** Sentence: jAIckAnkua lO'eckAnkua mu'eickAnkua uaiuckAnkua
MISMATCH!
valfendi: -jAI (ckAnkua) -lO'e (ckAnkua) -mu'ei (ckAnkua) >uaiu< (ckAnkua)
pegbased: (jAIckA) >nkua< -lO'e (ckAnkua) -mu'ei (ckAnkua) >uaiuckAnkua<

valfendi and camxes have different ideas on how to treat
two consecutive stressed syllables not separated by spaces.

camxes strongly favours left-to-right processing, so it
will take the first stress it runs into as the determining one.

I presume valfendi and camxes will agree that {jAIckankua}
breaks as (jAIcka) >nkua<.
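
A rough sketch of the "first stress wins" idea, under loud assumptions: the input is already split into syllables, stress is marked only with capitals, and a brivla ends one syllable after its stressed syllable (penultimate stress). The function name `split_at_first_stress` is made up for the illustration and none of this is taken from the grammar itself.

```python
def split_at_first_stress(syllables):
    """Cut a syllable list one syllable after the first stressed one.

    A syllable counts as stressed if it contains a capital letter.
    Returns (word_part, remainder). With no stress mark at all, the
    whole input is returned as the word part.
    """
    for i, syllable in enumerate(syllables):
        if any(ch.isupper() for ch in syllable):
            end = i + 2  # the stressed syllable plus the one after it
            return syllables[:end], syllables[end:]
    return syllables, []


print(split_at_first_stress(["jAI", "ckA", "nkua"]))
```

On {jAIckAnkua} the first stress is on {jAI}, so the cut falls after {ckA} and {nkua} is left over, matching the (jAIckA) >nkua< output above. The second capitalised syllable never gets a say.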

mu'o mi'e xorxes







posts: 2388

The traditional claim that a Lojban speech stream
can be uniquely partitioned into Lojban words
seems to be in trouble. The difficulties seem to
center on the "foreign" parts of the language,
cmevla and fuhivla — and lujvo insofar as they
impinge on the latter (though these last
questions seem to be getting solutions). Fuhivla
have always been a bit problematic as have cmene
in their relation both to their native languages
and to Lojban and various devices have come along
to deal with these problems, mainly restricted --
and often very complex — phonological patterns
and — for cmene at least — obligatory pauses.
These last have seemed impractical in actual
speech — people forget to make them and, when
they do, others fail to notice them as distinct
from nonsignificant pauses.

One possibility for relieving this latest problem
is to replace significant absences (pauses) by
significant presences, a unique sound or mark.
In the discussion of the morphological problems,
it turns out that the exact role of /iy/ and /uy/
is up for grabs (assuming they are allowed at
all). Thus, /uy/ could replace morphologically
obligatory pauses — a minimal utterance (well,
longer only than /y/, which is already dealt
with) could replace a troublesome pause. (Putting
/uy/ at the end of names is reminiscent of
Japanese postclitic "wa," though with a
different function and a more indistinct vowel.)

In the discussion of cmevla, one constant
complaint is the peculiar restriction against
/doi/, /la/ and /lai/ occurring in the name even
when they are in the native original — poor Lila
Doyle! The solution usually made is to allow the
prohibited strings but to place an obligatory
pause before all names, so that the confusion
with the words {la, lai, doi} is prevented. Of
course, another obligatory pause (other than the
phonologically determined ones between final
vowel of one word and initial of the next) merely
extends the problem of obligatory pauses. So we
might again suggest that this pause become a
positive utterance, different from that for the
word-final version and so /iy/. (The idea here
is that /coi iy djan/ is almost exactly the
pattern of /hiya John/.) /iy/ and /uy/ are
probably elidable in some cases, but that would
need to be investigated.

What can go between a /iy/ and a /uy/? Any
phonologically legal Lojban speech string,
that is, one that contains no illegal vowel or
consonant clusters (nor /iy/ and /uy/ of course).

Not even a final consonant need be required,
though it might be for continuity's sake. But
this opens possibilities beyond dealing with
names; any foreign word or phrase suitably
Lojbanized can go in this space and be nativized,
not merely quoted. What happens between /iy/ and
/uy/ is not subject to further analysis beyond
whether it is phonologically permitted and the
whole is taken as a block. This block can be
used other than as the core of a cmene sumti.
In particular, it can be inserted as a unit into what
is otherwise a lujvo construction (with some
adjustments probably required, but certainly
fewer restrictions than now are involved in
fuhivla — apparently just a glue between vowel
finals and /iy/ and /uy/ and vowel initials). I
think that the only limitation is that the block
can not be compound-final, which would mean that
the pattern of many fuhivla (which went against
the usual Lojban modifier-modified anyhow) would
have to be changed to put the category last.

This is a radical suggestion, but it carries a
load of benefits. It does, on the other hand,
require a reworking of existing fuhivla, the cost
of which is not very clear at the moment.



On Monday 27 December 2004 10:06, Jorge "Llambías" wrote:
> *** Sentence: jAIckAnkua lO'eckAnkua mu'eickAnkua uaiuckAnkua
> MISMATCH!
> valfendi: -jAI (ckAnkua) -lO'e (ckAnkua) -mu'ei (ckAnkua) >uaiu< (ckAnkua)
> pegbased: (jAIckA) >nkua< -lO'e (ckAnkua) -mu'ei (ckAnkua) >uaiuckAnkua<
>
> valfendi and camxes have different ideas on how to treat
> two consecutive stressed syllables not separated by spaces.
>
> camxes strongly favours left-to-right processing, so it
> will take the first stress it runs into as the determining one.
>
> I presume valfendi and camxes will agree that {jAIckankua}
> breaks as (jAIcka) >nkua<.

Actually, since a word other than a cmene can't begin with "nk", it breaks
after {jAI}. Then it calls {ckankua} an error. I'd have to examine the code
to see why.

phma
--
Now I need a magnifier to find my eyeglasses!
-Les Perles de la médecine


posts: 1912


pc:
> The traditional claim that a Lojban speech stream
> can be uniquely partitioned into Lojban words
> seems to be in trouble. The difficulties seem to
> center on the "foreign" parts of the language,
> cmevla and fuhivla — and lujvo insofar as they
> impinge on the latter (though these last
> questions seem to be getting solutions).

They aren't so much difficulties as different positions
on how strict or permissive the morphology should be.
Once we decide that, the uniqueness of the partitioning
of the stream is not under threat.

The main differences in criteria seem to be:

1) How do we represent a stressed syllable?

The official prescription has: capital letters or a following
syllable followed by a space. valfendi also allows a following
syllable followed by doi/la/lai/la'i + cmene, camxes doesn't.

2) Do we allow stress in syllables that shouldn't have it?

valfendi allows some of these "secondary stresses" in brivla.
camxes allows the last syllable of a brivla to be marked as stressed.

3) Which vowel combinations are allowed?

camxes allows ai, au, ei, oi, (i/u) vowel in
cmavo, cmene and fu'ivla, and no other vowel combinations
anywhere.

valfendi allows any combination in cmene and fu'ivla,
but only ai, au, ei, oi in cmavo and (i/u) vowel only
as a single cmavo by itself.

camxes allows {iy} in cmene as the only vowel combination
with y. valfendi allows any combination with y in cmene
and fu'ivla.
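
The vowel-combination positions in point 3 can be restated as a small lookup, which makes the two policies easier to compare side by side. This is only a sketch of the summary above, not code from either program. The function name `allowed` and the `standalone` flag are made up here, and the treatment of y-combinations in valfendi cmavo is an assumption, since the summary does not cover that case.

```python
FALLING = {"ai", "au", "ei", "oi"}
GLIDE = {g + v for g in "iu" for v in "aeiou"}  # ia, ie, ..., uu
CLASSES = {"cmavo", "cmene", "fu'ivla"}


def allowed(lexer, word_class, pair, standalone=False):
    """Is the two-vowel sequence `pair` accepted in `word_class`?

    `standalone` only matters for valfendi cmavo: it says the pair is
    the whole cmavo (as in {ua}) rather than part of one (as in {kua}).
    """
    if word_class not in CLASSES:
        raise ValueError(f"class not covered by the summary: {word_class}")
    if len(pair) != 2 or any(ch not in "aeiouy" for ch in pair):
        raise ValueError(f"not a two-vowel sequence: {pair}")
    has_y = "y" in pair
    if lexer == "camxes":
        if has_y:
            return pair == "iy" and word_class == "cmene"
        return pair in FALLING or pair in GLIDE
    if lexer == "valfendi":
        if word_class != "cmavo":
            return True  # any combination in cmene and fu'ivla
        if has_y:
            return False  # ASSUMPTION: not stated in the summary
        return pair in FALLING or (pair in GLIDE and standalone)
    raise ValueError(f"unknown lexer: {lexer}")


print(allowed("camxes", "cmavo", "ua"))    # inside {kua}: accepted
print(allowed("valfendi", "cmavo", "ua"))  # inside {kua}: rejected
```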


> One possibility for relieving this latest problem
> is to replace significant absences (pauses) by
> significant presences, a unique sound or mark.
....
> What can go between a /iy/ and a /uy/? Any
> phonologically legal Lojban speech string,
> that is, one that contains no illegal vowel or
> consonant clusters (nor /iy/ and /uy/ of course).

{la iy anything uy} is similar to
{la'o any-word anything any-word}, although this
last one requires pauses.

I suppose the initial {.iy} will require a glottal stop
too, otherwise {la iy ...} and {lai iy ...} would be
practically indistinguishable.

> whole is taken as a block. This block can be
> used other than as the core of a cmene sumti.
> In particular, it can be inserted as a unit into what
> is otherwise a lujvo construction (with some
> adjustments probably required, but certainly
> fewer restrictions than now are involved in
> fuhivla — apparently just a glue between vowel
> finals and /iy/ and /uy/ and vowel initials). I
> think that the only limitation is that the block
> can not be compound-final, which would mean that
> the pattern of many fuhivla (which went against
> the usual Lojban modifier-modified anyhow) would
> have to be changed to put the category last.

I am preparing an addition along these lines, by allowing
cmene-rafsi, which are just any cmene followed by -iy.
So for example the rafsi for {djan} would be {djaniy-}. These
rafsi, like fuhivla-rafsi, can only be preceded by y-rafsi
(four-letter rafsi or CVC-y rafsi or fuhivla-rafsi or other
cmene-rafsi). And y has to be disallowed in cmene.
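
As a sketch of the proposed construction (the rule is not in the grammar yet, and the helper name `cmene_rafsi` is made up for illustration), forming the rafsi is just a suffixing step plus the two checks mentioned: a cmene ends in a consonant, and under this proposal it may not contain y.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def cmene_rafsi(cmene):
    """Return the proposed rafsi form of a cmene: the cmene plus -iy."""
    name = cmene.strip(".")  # drop pause dots, if written
    if not name or name[-1].lower() not in CONSONANTS:
        raise ValueError("a cmene must end in a consonant")
    if "y" in name.lower():
        raise ValueError("y is disallowed in cmene under this proposal")
    return name + "iy"


print(cmene_rafsi("djan"))  # djaniy
```

Banning y inside the cmene is what keeps the trailing -iy unambiguous as the end of the rafsi.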

mu'o mi'e xorxes






On Monday 27 December 2004 10:58, John E Clifford wrote:
> The traditional claim that a Lojban speech stream
> can be uniquely partitioned into Lojban words
> seems to be in trouble. The difficulties seem to
> center on the "foreign" parts of the language,
> cmevla and fuhivla — and lujvo insofar as they
> impinge on the latter (though these last
> questions seem to be getting solutions). Fuhivla
> have always been a bit problematic as have cmene
> in their relation both to their native languages
> and to Lojban and various devices have come along
> to deal with these problems, mainly restricted --
> and often very complex — phonological patterns
> and — for cmene at least — obligatory pauses.
> These last have seemed impractical in actual
> speech — people forget to make them and, when
> they do, others fail to notice them as distinct
> from nonsignificant pauses.

The test phrases include stressed and unstressed cmavo preceding brivla
without a pause. Stressing a cmavo before a brivla without a pause is likely
to result in a different word division, even if the brivla is a lujvo:
/lojboJBEna/ is {lo jbojbena}, while /LOjboJBEna/ is {lojbo jbena}. That
camxes lexes /jAIckAnkua/ differently than valfendi isn't a big problem. I'm
more concerned about /LIXtenctain/, which camxes splits as {lixte nctain}.

phma
--
Mes règles mensuelles ont lieu une fois par an.
-Les Perles de la médecine


posts: 1912


> That
> camxes lexes /jAIckAnkua/ differently than valfendi isn't a big problem. I'm
> more concerned about /LIXtenctain/, which camxes splits as {lixte nctain}.

That has been fixed. The problem was that the gismu rule didn't
have !cmene in front, because words didn't need it.
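
The effect of a `!cmene` guard can be illustrated with a regular-expression negative lookahead, which plays the same role as the PEG not-predicate. This is only a sketch: the gismu and cmene shapes below are heavily simplified assumptions, not the camxes rules, and all names are hypothetical.

```python
import re

C = "bcdfgjklmnprstvxz"
V = "aeiou"

# Simplified gismu shape: CVCCV or CCVCV (cluster validity is not checked).
GISMU_SHAPE = rf"(?:[{C}][{V}][{C}][{C}][{V}]|[{C}][{C}][{V}][{C}][{V}])"

# Simplified cmene test: the run of letters up to the next pause ends in a consonant.
CMENE_AHEAD = rf"[a-z',]*[{C}](?![a-z',])"

NAIVE_GISMU = re.compile(GISMU_SHAPE)
# The negative lookahead stands in for the PEG predicate "!cmene".
GUARDED_GISMU = re.compile(rf"(?!{CMENE_AHEAD}){GISMU_SHAPE}")


def leading_gismu(pattern, stream):
    """Return the gismu matched at the start of the stream, or None."""
    m = pattern.match(stream.lower())
    return m.group(0) if m else None
```

Under these assumptions, `leading_gismu(NAIVE_GISMU, "LIXtenctain")` peels off `"lixte"`, which is the bad split reported above, whereas `leading_gismu(GUARDED_GISMU, "LIXtenctain")` returns `None` because the whole run up to the pause is consonant-final.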

mu'o mi'e xorxes







posts: 2388


wrote:

>
> pc:
> > The traditional claim that a Lojban speech
> stream
> > can be uniquely partitioned into Lojban words
> > seems to be in trouble. The difficulties
> seem to
> > center on the "foreign" parts of the
> language,
> > cmevla and fuhivla — and lujvo insofar as
> they
> > impinge on the latter (though these last
> > questions seem to be getting solutions).
>
> They aren't so much difficulties as different
> positions
> on how strict or permissive the morphology
> should be.
> Once we decide that, the uniqueness of the
> partitioning
> of the stream is not under threat.

This is of course exactly what puts the whole
under threat: the fact that these issues have not
been decided — nor do there seem to be
principled ways to decide them (except tidy
algorithms, which may be enough). The claim
appears to have been false for Lojban up to now
and the aim is to figure out how best to make it
true.

>
> The main differences in criteria seem to be:
>
> 1) How do we represent a stressed syllable?
>
> The official prescription has: capital letters
> or a following
> syllable followed by a space. valfendi also
> allows a following
> syllable followed by doi/la/lai/la'i + cmene,
> camxes doesn't.

This is not a real problem with the claim, only
with how to represent the speech stream. Is it
clear that once stress is represented we always
know what it signifies?

> 2) Do we allow stress in syllables that
> shouldn't have it?
>
> valfendi allows some of these "secondary
> stresses" in brivla.
> camxes allows the last syllable of a brivla to
> be marked as stressed.
>
> 3) Which vowel combinations are allowed?
>
> camxes allows ai, au, ei, oi, (i/u) vowel in
> cmavo, cmene and fu'ivla, and no other vowel
> combinations
> anywhere.
>
> valfendi allows any combination in cmene and
> fu'ivla,
> but only ai, au, ei, oi in cmavo and (i/u)
> vowel only
> as a single cmavo by itself.
>
> camxes allows {iy} in cmene as the only vowel
> combination
> with y. valfendi allows any combination with y
> in cmene
> and fu'ivla.
>
>
> > One possibility for relieving this latest
> problem
> > is to replace significant absences (pauses)
> by
> > significant presences, a unique sound or
> mark.
> ...
> > What can go between a /iy/ and a /uy/? Any
> > phonologically legal Lojban string speech
> string,
> > that is, one that contains no illegal vowel
> or
> > consonant clusters (nor /iy/ and /uy/ of
> course).
>
> {la iy anything uy} is similar to
> {la'o any-word anything any-word}, although
> this
> last one requires pauses.

Yes; the difference is only in terms of possible
roles (and the simpler markers).

> I suppose the initial {.iy} will require a
> glottal stop
> too, otherwise {la iy ...} and {lai iy ...}
> would be
> practically indistinguishable.

Of course; the glottal stop between vowels in
different words is phonologically automatic
(well, should be, though even here some speakers
manage to screw up by using variants that their
interlocutors don't recognize as pauses).

> > whole is taken as a block. This block can be
> > used other than as the core of a cmene
> sumti.
> > In particular, it can be inserted as a unit
> what
> > is otherwise lujvo construction (with some
> > adjustments probably required, but certainly
> > fewer restrictions than now are involved in
> > fuhivla — apparently just a glue between
> vowel
> > finals and /iy/ and /uy/ and vowel initials).
> I
> > think that the only limitation is that the
> block
> > can not be compound-final, which would mean
> that
> > the pattern of many fuhivla (which went
> against
> > the usual Lojban modifier-modified anyhow)
> would
> > have to be changed to put the category last.
>
>
> I am preparing an addition along these lines,
> by allowing
> cmene-rafsi, which are just any cmene followed
> by -iy.
> So for example the rafsi for {djan} would be
> {djaniy-}. These
> rafsi, like fuhivla-rasi, can only be preceded
> by y-rafsi
> (four-letter rafsi or CVC-y rafsi or
> fuhivla-rafsi or other
> cmene-rafsi). And y has to be disallowed in
> cmene.

Yes, this would work as well, though it leaves
the (merely practical, perhaps) cmene problems
untouched and is slightly less general otherwise.
It may be more feasible since it is less radical. It
certainly is desirable in some form or other.


posts: 2388



> On Monday 27 December 2004 10:58, John E
> Clifford wrote:
> > The traditional claim that a Lojban speech
> stream
> > can be uniquely partitioned into Lojban words
> > seems to be in trouble. The difficulties
> seem to
> > center on the "foreign" parts of the
> language,
> > cmevla and fuhivla — and lujvo insofar as
> they
> > impinge on the latter (though these last
> > questions seem to be getting solutions).
> Fuhivla
> > have always been a bit problematic as have
> cmene
> > in their relation both to their native
> languages
> > and to Lojban and various devices have come
> along
> > to deal with these problems, mainly
> restricted --
> > and often very complex — phonological
> patterns
> > and — for cmene at least — obligatory
> pauses.
> > These last have seemed impractical in actual
> > speech — people forget to make them and,
> when
> > they do, others fail to notice them as
> distinct
> > from nonsignificant pauses.
>
> The test phrases include stressed and
> unstressed cmavo preceding brivla
> without a pause. Stressing a cmavo before a
> brivla without a pause is likely
> to result in a different word division, even if
> the brivla is a lujvo:
> /lojboJBEna/ is {lo jbojbena}, while
> /LOjboJBEna/ is {lojbo jbena}. That
> camxes lexes /jAIckAnkua/ differently than
> valfendi isn't a big problem. I'm
> more concerned about /LIXtenctain/, which
> camxes splits as {lixte nctain}.

Well, the name problem is one thing I
particularly aimed to deal with. I am not sure
that this proposal helps with the stress problem,
which, like pauses, requires a kind of control over
the speech stream that most of us lack a lot of
the time. xorxes' suggestion to always use
hyphens (well, his is not quite that but...)
goes some way to resolving that, however: it works
in the given case at least.


Robin Lee Powell scripsit:

> I don't want people to need to understand the divisions we're
> creating to try to understand how the language works. I think that
> even "morphology" versus "grammar" is artificial and arbitrary, and
> I don't think people should have to go to two places to get their
> questions answered.

There's pretty good cross-linguistic and psycholinguistic evidence that
people do have separate morphology and syntax modules in their heads
(in particular, different kinds of mental problems can impair one but
not the other).

--
John Cowan http://www.ccil.org/~cowan jcowan@reutershealth.com
To say that Bilbo's breath was taken away is no description at all. There are
no words left to express his staggerment, since Men changed the language that
they learned of elves in the days when all the world was wonderful. --The Hobbit


posts: 1912


I'm trying to figure out what to do with misplaced
stress marks.

What does valfendi do with {lobroDABRODA.} and {loBRODABRODA.}?

I think camxes currently does -lo (broDABRO) -DA
and -lo (BRODA) (BRODA).

mu'o mi'e xorxes









posts: 14214

On Mon, Dec 27, 2004 at 07:58:24AM -0500, Pierre Abbat wrote:
> On Monday 27 December 2004 02:25, Robin Lee Powell wrote:
> > On Mon, Dec 27, 2004 at 02:16:11AM -0500, Pierre Abbat wrote:
> > > First thing I see is that camxes is trying to lex lines
> > > beginning with number signs, which in my test file are
> > > comments.
> >
> > Yeah, I know. I figured it would be a test case like any other,
> > but it looks like valfendi actually *drops* lines that start
> > with #.
> >
> > That's, umm, a rather idiosyncratic piece of behaviour.
>
> So how should comments in the test file be indicated?

You can indicate them however you like, but it's not the job of the
program being tested to ignore comments, it's the job of the test
harness to cause them to not be dealt with as necessary (which I
have now done). Having the program being tested know about the
format of comments in the test suite is, as I said, very
idiosyncratic.
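
As a rough illustration of keeping comment handling in the harness rather than in the program under test, a filter along these lines could sit between the test file and the lexer. The function name is hypothetical, and skipping blank lines is an added assumption, not something stated in this thread.

```python
def iter_test_inputs(lines):
    """Yield only the lines that should be fed to the program under test."""
    for line in lines:
        text = line.rstrip("\n")
        # Harness-level concerns: '#' comments and (by assumption) blank lines
        # never reach the lexer, so the lexer needs no knowledge of them.
        if not text.strip() or text.lstrip().startswith("#"):
            continue
        yield text
```

The lexer then sees every remaining line as an ordinary test case, so a line starting with # that does reach it is treated like any other input.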

-Robin


posts: 14214

On Mon, Dec 27, 2004 at 07:06:42AM -0800, Jorge Llambías wrote:
> I presume valfendi and camxes will agree that {jAIckankua} breaks
> as (jAIcka) >nkua<.

No.

Morphology pass:
text
|- BRIVLA
| fuhivla: jAIcka
|- nonLojbanWord: nkua

Versus:

$ echo "jAIckankua" | valfendi
-jAI >ckankua<

-Robin


posts: 14214

On Mon, Dec 27, 2004 at 10:59:35AM -0800, Jorge Llambías wrote:
>
> I'm trying to figure out what to do with misplaced stress marks.
>
> What does valfendi do with {lobroDABRODA.} and {loBRODABRODA.}?
>
> I think camxes currently does -lo (broDABRO) -DA and -lo (BRODA)
> (BRODA).

camxes:

-lo (broDABRO) -DA
-lo (BRODA) (BRODA)

valfendi:

-lo >broDABRODA<
-lo >BRODABRODA<

-Robin


posts: 1912


I just noticed that CLL says:

"Capital letters are used only to represent non-standard stress,
which can appear only in the representation of Lojbanized names."

So, in the *official* orthography, a brivla must always be
written with a following space. Capital letters should not
make any difference to the parser, because in cmene stress
is irrelevant to the parse, and in brivla they shouldn't be
allowed to change the normal reading of stress. They should
play the same role as commas: they can be used to help with
pronunciation, or to confuse if placed in the wrong place,
but they can't change anything. So {BROdada} would be the
fu'ivla {brodada}, pronounced {broDAda}, just written with
misleading capitalization. That would make the grammar rules
a bit simpler, as we wouldn't have to deal with weird stress
cases.

mu'o mi'e xorxes






posts: 953

On Mon, 27 Dec 2004, Jorge Llambías wrote:

> I just noticed that CLL says:
>
> "Capital letters are used only to represent non-standard stress,
> which can appear only in the representation of Lojbanized names."
>
> So, in the *official* orthography, a brivla must always be
> written with a following space.
> ...
> That would make the grammar rules
> a bit simpler, as we wouldn't have to deal with weird stress
> cases.

It would. But we have always claimed that Lojban can be unambiguously
resolved into words solely based on stress. Something should be able to
prove that, and if not the morphological parser, then what?

--
Arnt Richard Johansen http://arj.nvg.org/
Someone just called to say he loved you?!


posts: 1912


> On Mon, 27 Dec 2004, Jorge Llambías wrote:
> > So, in the *official* orthography, a brivla must always be
> > written with a following space.
> > ...
> > That would make the grammar rules
> > a bit simpler, as we wouldn't have to deal with weird stress
> > cases.
>
> It would. But we have always claimed that Lojban can be unambiguously
> resolved into words solely based on stress. Something should be able to
> prove that, and if not the morphological parser, then what?

I'm not proposing to change any of that, I'm only talking of
the way stress is represented in writing.

This is what a machine taking notes in Lojban should do:
Write each syllable down as it comes, if you hear a pause,
write a space, if you hear a stressed syllable, write it
down, write the next syllable down (ignoring its stress)
and then write a space, even if you don't hear any pauses,
but if there is no pause mark it as a non-cmene-space,
because it won't work as a space for cmene.
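
The note-taking procedure above can be sketched roughly as follows. This is an illustration under stated assumptions, not part of any existing tool: the input is taken to be a list of (syllable, stressed) pairs with a hypothetical `PAUSE` marker for heard pauses, and the underscore used for a non-cmene space is an invented notation.

```python
PAUSE = None  # hypothetical marker for a heard pause in the event list


def take_notes(events):
    """Write a syllable stream in the orthography described above.

    A heard pause becomes ' '. A break inferred from stress (stressed
    syllable plus one more) with no pause becomes '_', the assumed
    notation for a space that does not count as a space for cmene.
    """
    out = []
    after_stress = False  # the previous syllable carried stress
    need_break = False    # a word break is due before the next syllable
    for event in events:
        if event is PAUSE:
            if out and out[-1] != " ":
                out.append(" ")
            after_stress = need_break = False
            continue
        syllable, stressed = event
        if need_break:
            out.append("_")  # no pause was heard, so mark a non-cmene space
            need_break = False
        out.append(syllable.lower())
        if after_stress:
            # This is the syllable after a stress: its own stress is ignored.
            after_stress, need_break = False, True
        elif stressed:
            after_stress = True
    return "".join(out).strip()
```

For the stream /LOjboJBEna/ this sketch writes `lojbo_jbena`, while /lojboJBEna/ comes out as the unbroken `lojbojbena`, leaving the {lo jbojbena} split to the parser.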

That's the official orthography, which does not use capital
letters for anything that the parser needs. It does not use
commas for anything interesting either.

mu'o mi'e xorxes






On Monday 27 December 2004 14:34, Jorge "Llambías" wrote:
> I just noticed that CLL says:
>
> "Capital letters are used only to represent non-standard stress,
> which can appear only in the representation of Lojbanized names."
>
> So, in the *official* orthography, a brivla must always be
> written with a following space. Capital letters should not
> make any difference to the parser, because in cmene stress
> is irrelevant to the parse, and in brivla they shouldn't be
> allowed to change the normal reading of stress. They should
> play the same role as commas: they can be used to help with
> pronounciation, or to confuse if placed in the wrong place,
> but they can't change anything. So {BROdada} would be the
> fu'ivla {brodada}, pronounced {broDAda}, just written with
> misleading capitalization. That would make the grammar rules
> a bit simpler, as we wouldn't have to deal with weird stress
> cases.

In *written* Lojban, yes. But in spoken Lojban, there are no spaces, only the
occasional pause. The word-break algorithm takes a representation of a speech
stream, with capitals standing for stress, and figures out where the spaces
go.

phma
--
Sans lunettes, je ne distingue même pas les odeurs...
-Les Perles de la médecine


posts: 1912


> In *written* Lojban, yes. But in spoken Lojban, there are no spaces, only the
> occasional pause. The word-break algorithm takes a representation of a speech
> stream, with capitals standing for stress, and figures out where the spaces
> go.

Given a speech stream, there is an official Lojban orthography
to write it down, and it does not involve capital letters in any
interesting way. That's all I'm saying.

We are writing a parser for variant orthographies that are more
complex than the official one.

mu'o mi'e xorxes








posts: 2388


wrote:

>
> I just noticed that CLL says:
>
> "Capital letters are used only to represent
> non-standard stress,
> which can appear only in the representation of
> Lojbanized names."
>
> So, in the *official* orthography, a brivla
> must always be
> written with a following space. Capital letters
> should not
> make any difference to the parser, because in
> cmene stress
> is irrelevant to the parse, and in brivla they
> shouldn't be
> allowed to change the normal reading of stress.
> They should
> play the same role as commas: they can be used
> to help with
> pronounciation, or to confuse if placed in the
> wrong place,
> but they can't change anything. So {BROdada}
> would be the
> fu'ivla {brodada}, pronounced {broDAda}, just
> written with
> misleading capitalization. That would make the
> grammar rules
> a bit simpler, as we wouldn't have to deal with
> weird stress
> cases.

JCB often thought he spoke phonologically perfect
Loglan, and that illusion has been passed on. I'm glad
to see that some effort is being made to deal
with reality, even though the claim about the
analysis of a speech string was surely intended
only to be about perfectly articulated ones.
Note that the /iy/-/uy/ suggestion, or some equivalent,
takes care of the problem of cmene stress, since
it will be legitimate wherever it occurs. The
misplaced brivla stress remains, however, if we
doubt that we can always remember the pause after
brivla any better than after cmene. {BROdada}
would presumably often appear to be {BROda da},
not a fuhivla at all. (I admit that I am not
clear how it gets to be a fuhivla anyhow, but the
muddle that the rules for fuhivla generate is a
major reason for suggesting the cmene-in-a-brivla version.)


posts: 2388



> On Monday 27 December 2004 14:34, Jorge
> "Llambías" wrote:
> > I just noticed that CLL says:
> >
> > "Capital letters are used only to represent
> non-standard stress,
> > which can appear only in the representation
> of Lojbanized names."
> >
> > So, in the *official* orthography, a brivla
> must always be
> > written with a following space. Capital
> letters should not
> > make any difference to the parser, because in
> cmene stress
> > is irrelevant to the parse, and in brivla
> they shouldn't be
> > allowed to change the normal reading of
> stress. They should
> > play the same role as commas: they can be
> used to help with
> > pronounciation, or to confuse if placed in
> the wrong place,
> > but they can't change anything. So {BROdada}
> would be the
> > fu'ivla {brodada}, pronounced {broDAda}, just
> written with
> > misleading capitalization. That would make
> the grammar rules
> > a bit simpler, as we wouldn't have to deal
> with weird stress
> > cases.
>
> In *written* Lojban, yes. But in spoken Lojban,
> there are no spaces, only the
> occasional pause. The word-break algorithm
> takes a representation of a speech
> stream, with capitals standing for stress, and
> figures out where the spaces
> go.

This may be violating the surface claim about
unambiguous segmentation. That would seem to
require only working with the pauses that
"actually" occur. Perhaps some part of the
process would involve indicating which ones were
insignificant, mere hesitations and the like, but
adding pauses should not be part of the game
(though inferring a word juncture even without a
pause is OK).


posts: 1912


pc:
> Note the /iy/-/uy/ suggested or some equivalent
> takes care of the problem of cmene stress since
> it will be legitmate wherever it occurs.

cmene stress is not problematic. It can be safely
ignored because it doesn't change anything.

> The
> misplaced brivla stress remains however, if we
> doubt that we can alsways remember pause after
> brivla any better than after cmene.

Are we talking about misplaced stress in speaking
or misplaced capital letters in writing? The first
one is simply wrong: if you misplace the stress
in a brivla you end up saying something different.

Misplaced capital letters (in the official orthography)
will simply not indicate spoken stress, they are just
noise. In alternative orthographies, like the one we're
using, the rules are not always clear.

> {BROdada}
> would presumably often appear to be {BROda da},
> not a fuhivla at all.

In the official orthography, {BROdada} = {broDAda} =
= {brodada} = {BrOdADa} = ... since capital letters
are always irrelevant. They are all pronounced with
stress in the second syllable, and it's a fu'ivla.

In the (unofficial) orthography that we normally use,
{BROdada} is indeed {BROda da}. In this orthography
capital letters are significant, as they indicate
significant brivla stress. Both valfendi and camxes
deal with this more complex orthography.

mu'o mi'e xorxes






Arnt Richard Johansen scripsit:

> It would. But we have always claimed that Lojban can be unambiguously
> resolved into words solely based on stress. Something should be able to
> prove that, and if not the morphological parser, then what?

+1


--
"No, John. I want formats that are actually John Cowan
useful, rather than over-featured megaliths that http://www.ccil.org/~cowan
address all questions by piling on ridiculous http://www.reutershealth.com
internal links in forms which are hideously jcowan@reutershealth.com
over-complex." --Simon St. Laurent on xml-dev


Jorge Llambías scripsit:
>
> I just noticed that CLL says:
>
> "Capital letters are used only to represent non-standard stress,
> which can appear only in the representation of Lojbanized names."
>
> So, in the *official* orthography, a brivla must always be
> written with a following space. Capital letters should not
> make any difference to the parser, because in cmene stress
> is irrelevant to the parse, and in brivla they shouldn't be
> allowed to change the normal reading of stress.

I'd say rather that in the official orthography capital letters are not
permitted except in cmene. In the practical orthography, capitals are
permitted but have no meaning in cmavo, and are permitted only on the
stressed syllable in brivla. In the extended orthography generated by a
hypothetical speech interpreter, capitals are used to mark all stresses
and drive the morphology algorithm.

> play the same role as commas: they can be used to help with
> pronounciation, or to confuse if placed in the wrong place,
> but they can't change anything. So {BROdada} would be the
> fu'ivla {brodada}, pronounced {broDAda}, just written with
> misleading capitalization.

No, I don't think that follows, any more than it follows that because
"#" has no meaning, "#bri#vla" is a valid way of writing "brivla"
in the official orthography.

--
John Cowan www.ccil.org/~cowan www.reutershealth.com jcowan@reutershealth.com
In might the Feanorians / that swore the unforgotten oath
brought war into Arvernien / with burning and with broken troth.
and Elwing from her fastness dim / then cast her in the waters wide,
but like a mew was swiftly borne, / uplifted o'er the roaring tide.
--the Earendillinwe


posts: 2388


wrote:

>
> pc:
> > Note the /iy/-/uy/ suggested or some
> equivalent
> > takes care of the problem of cmene stress
> since
> > it will be legitmate wherever it occurs.
>
> cmene stress is not problematic. It can be
> safely
> ignored because it doesn't change anything.

It appears to create pseudo brivla and
accoutrements in some of the examples cited. It
interferes with the usefully simple rule that
every stress (not marked for emphasis, anyhow) is
the penult of a brivla, which ends after the next
vowel (cluster?), pause or no pause. Taking
cmene out of play gets closer to having that rule
to use.

>
> > The
> > misplaced brivla stress remains however, if
> we
> > doubt that we can alsways remember pause
> after
> > brivla any better than after cmene.
>
> Are we talking about misplaced stress in
> speaking
> or misplaced capital letters in writing? The
> first
> one is simply wrong: if you misplace the stress
>
> in a brivla you end up saying something
> different.

OK, let us take the hard line on this and admire
the fact that people manage to understand (and
segment and parse) in spite of saying the wrong
thing for what they intended (I was talking about
misplaced stress in speaking — especially stress
where no stress is meant to be, but also stress
displaced from where it is meant to be. Writing
is just meant to report what is said.)

> Misplaced capital letters (in the official
> orthography)
> will simply not indicate spoken stress, they
> are just
> noise. In alternative orthographies, like the
> one we're
> using, the rules are not always clear.
>
> > {BROdada}
> > would presumably often appear to be {BROda
> da},
> > not a fuhivla at all.
>
> In the official orthography, {BROdada} =
> {broDAda} =
> = {brodada} = {BrOdADa} = ... since capital
> letters
> are always irrelevant. They are all pronounced
> with
> stress in the second syllable, and it's a
> fu'ivla.

This is orthography that already comes segmented?
That doesn't seem relevant to the question at
hand, which is, I think, to segment the
stream as it arrives. (I still haven't found in
the Byzantine labyrinth how {brodada} is legal,
unless the penultimate stress, which is not
here apparent of course, is overwhelming. It
still, even with the stress, looks like a
misspelling.)





posts: 1912


> I'd say rather that in the official orthography capital letters are not
> permitted except in cmene.

OK.

> In the practical orthography, capitals are
> permitted but have no meaning in cmavo, and are permitted only on the
> stressed syllable in brivla.

That wouldn't interfere with the parser.

> In the extended orthography generated by a
> hypothetical speech interpreter, capitals are used to mark all stresses
> and drive the morphology algorithm.

But the hypothetical speech interpreter could also use the official
orthography instead of using an extended one. This was my point. Any
speech stream can be written in the official orthography, and
anything written in the official orthography can be unambiguously
interpreted.

> > play the same role as commas: they can be used to help with
> > pronounciation, or to confuse if placed in the wrong place,
> > but they can't change anything. So {BROdada} would be the
> > fu'ivla {brodada}, pronounced {broDAda}, just written with
> > misleading capitalization.
>
> No, I don't think that follows, any more than it follows that because
> "#" has no meaning, "#bri#vla" is a valid way of writing "brivla"
> in the official orthography.

OK. That doesn't really affect my point.

mu'o mi'e xorxes







posts: 1912


> OK, let us take the hard line on this and admire
> the fact that people manage to understand (and
> segment and parse) in spite of saying the wrong
> thing for what they intended (I was talking about
> misplaced stress in speaking — especially stress
> where no stress is meant to be, but also stress
> displaced from where it is meant to be. Writing
> is just meant to report what is said.)

That's admirable indeed.

But people probably use lots of semantic and
relevance cues not available to the simple-minded
parser we are discussing.

> > In the official orthography, {BROdada} =
> > {broDAda} =
> > = {brodada} = {BrOdADa} = ... since capital
> > letters
> > are always irrelevant. They are all pronounced
> > with
> > stress in the second syllable, and it's a
> > fu'ivla.
>
> This is orthography that already comes segmented?

No, cmavo need not come segmented. All that comes with this
orthography is penultimate stress. Anything that ends with
a space was either penultimately stressed or the space was
an actual pause. In this orthography there are no
non-penultimate written stresses outside of cmene.

> That doesn't seem relevant to the question at
> hand, which is — I think — to segment the
> stream as it arrives.

There are two steps here:

Step 1: Speech recognizer hears stream and writes it down using
some orthography.

Step 2: Parser takes written record (as it arrives,
but indefinite lookahead is required) and parses it.

Step 1 could use any orthography, of course. If Step 1 uses
the official orthography, then Step 2 is simplified, because
using the official orthography already does some work: it is
not just writing down phonemes, it also requires counting
some syllables.

> I still haven't found in
> the Byzantine labyrinth how {brodada} is legal,
> unless the penultimate stress — which is not
> here apparent of course — is overwhelming. It
> still — even with the stress — looks like a
> misspelling).

It is a fuhivla. Anything with CCVCVCV form (with permissible
initial cluster) is a fuhivla, simply because it is not a lujvo
and cannot fall apart as cmavo and lujvo.
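
A small sketch of the shape test only: it checks that a word is CCVCVCV with a permissible initial pair, and does not attempt the further argument that such a word cannot be a lujvo or break into cmavo plus lujvo. The function name is hypothetical, and the table of 48 initial pairs is reproduced from memory of CLL, so it should be checked against the reference grammar before being relied on.

```python
CONSONANTS = "bcdfgjklmnprstvxz"
VOWELS = "aeiou"

# Permissible initial consonant pairs (as recalled from CLL; verify before use).
PERMISSIBLE_INITIALS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv".split()
)


def has_ccvcvcv_shape(word: str) -> bool:
    """True if the word is CCVCVCV and starts with a permissible initial pair."""
    w = word.lower()  # capitals are ignored, as in the official orthography
    if len(w) != 7:
        return False
    for ch, kind in zip(w, "CCVCVCV"):
        if ch not in (CONSONANTS if kind == "C" else VOWELS):
            return False
    return w[:2] in PERMISSIBLE_INITIALS
```

Here `has_ccvcvcv_shape("brodada")` is true, and so is `has_ccvcvcv_shape("BROdada")` because capitals are ignored, while a word starting with an impermissible pair such as bd is rejected.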

mu'o mi'e xorxes






Jorge Llambías scripsit:

> But the hypothetical speech interpreter could also use the official
> orthography instead of using an extended one.

I'm assuming (contrary to the actual facts about speech-interpretation
programs) that it knows nothing of the morphology and only produces
an orthographic representation of stress, pause, and the 24 phonemes of Lojban.

> Any speech stream can be written in the official orthography, and
> anything written in the official orthography can be unambiguously
> interpreted.

Yes.

--
John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
If a soldier is asked why he kills people who have done him no harm, or a
terrorist why he kills innocent people with his bombs, they can always
reply that war has been declared, and there are no innocent people in an
enemy country in wartime. The answer is psychotic, but it is the answer
that humanity has given to every act of aggression in history. --Northrop Frye


posts: 2388


wrote:

>
> --- John Cowan wrote:
>
> > In the extended orthography generated by a
> > hypothetical speech interpreter, capitals are
> used to mark all stresses
> > and drive the morphology algorithm.
>
> But the hypothetical speech interpreter could
> also use the official
> orthography instead of using an extended one.
> This was my point. Any
> speech stream can be written in the official
> orthography, and
> anything written in the official orthography
> can be unambiguously
> interpreted.
>
This is a little unclear. Do you mean the
official orthography with all word divisions
already in place, or just a string of letters
without stress marked and perhaps with all pauses
marked? That the former works is pretty near
trivial, except for the mass of problems that
surround fuhivla and their relation to
other brivla. And that does not seem to be
completely solved, though the end is perhaps in
sight. It will not, however, answer to the claim
made about unambiguous segmentation, since it
comes segmented. As far as I can tell, the real
question has only been dealt with in small parts
and under questionable assumptions about our
control of speech (saying "if they place a stress
wrong, or a pause, then they have made a
mistake and it does not deserve to be segmented"
is strictly correct but not very helpful). Even
if we do assume that all
pauses shown are where they ought to be, that all
pauses that ought to occur do, and that stress is
always correctly placed, it still is not obvious
that every legitimate Lojban utterance can be
uniquely extracted and every illegitimate one
rejected tout court. That is the challenge; the
rest is just quibble or trivia.


posts: 1912


> I'm assuming (contrary to the actual facts about speech-interpretation
> programs) that it knows nothing of the morphology and only produces
> an orthographic representation of stress, pause, and the 24 phonemes of
> Lojban.

Well, that's already knowing a little bit. If it can tell which
phonemes are vowels and diphthongs and it can count, then it can
use the official orthography to write down what it hears.

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > OK, let us take the hard line on this and
> admire
> > the fact that people manage to understand
> (and
> > segment and parse) in spite of saying the
> wrong
> > thing for what they intended (I was talking
> about
> > misplaced stress in speaking — especially
> stress
> > where no stress is meant to be, but also
> stress
> > displaced from where it is meant to be.
> Writing
> > is just meant to report what is said.)
>
> That's admirable indeed.
>
> But people probably use lots of semantic and
> relevance cues not available to the
> simple-minded
> parser we are discussing.

True. But the question is can the simple-minded
parser do the job even if everything is done
right?

> > > In the official orthography, {BROdada} =
> > > {broDAda} =
> > > = {brodada} = {BrOdADa} = ... since capital
> > > letters
> > > are always irrelevant. They are all
> pronounced
> > > with
> > > stress in the second syllable, and it's a
> > > fu'ivla.
> >
> > This is orthography that already comes
> segmented?
>
> No, cmavo need not come segmented. All that
> comes with this
> orthography is penultimate stress. Anything
> that ends with
> a space was either penultimately stressed or
> the space was
> an actual pause. In this orthography there are
> no
> non-penultimate written stresses outside of
> cmene.

So it is assumed that all obligatory pauses are
marked already and all obligatory stresses. Are
nonobligatory pauses that occur also marked? What
about emphatic stress? (which always comes, I
suppose after some cmavo, I forget which)? Is
stress in cmene marked? Given the right answers
to these questions (there are probably others) it
would seem that finishing the segmentation should
be pretty straightforward. But it is not obvious
that the present rules do even this job.

> > That doesn't seem relevant to the question
> at
> > hand, which is — I think — to segment the
> > stream as it arrives.
>
> There are two steps here:
>
> Step 1: Speech recognizer hears stream and
> writes it down using
> some orthography.
>
> Step 2: Parser takes written record (as it
> arrives,
> but indefinite lookahead is required) and
> parses it.

Parsing isn't needed for the present case,
segmentation of the string — or rejection of the
whole — is all that is required. Incidentally,
I suppose we ought to throw in that the unique
segmentation is the right one, matching the input
to the speech generator.

> Step 1 could use any orthography, of course.

Not quite; it has to show stress, which the
official orthography apparently does not. And,
if the official orthography includes period for
obligatory pauses then I doubt that a
transcription prior to analysis can do this part.
> If
> Step 1 uses
> the official orthography, then Step 2 is
> simplified, because
> using the official orthography already does
> some work: it is
> not just writing down phonemes, it also
> requires counting
> some syllables.

As far as segmentation goes, the official
orthography does almost all the work, leaving
only possibly dividing up cmavo strings. The
question then is how to get the transcriber to
come out in official orthography.

> > I still haven't found in
> > the Byzantine labyrinth how {brodada} is
> legal,
> > unless the penultimate stress — which is not
> > here apparent of course — is overwhelming.
> It
> > still — even with the stress — looks like a
> > misspelling).
>
> It is a fuhivla. Anything with CCVCVCV form
> (with permissible
> initial cluster) is a fuhivla, simply because
> it is not a lujvo
> and cannot fall apart as cmavo and lujvo.

And stress on the penult, of course. This
doesn't help much for the present problem, since
we only know that the stress is there because we
know where the end is and that it is a brivla --
two of the things the algorithm is supposed to
find out.
I think that the requirement that the
transcription be in official orthography gives too
much away, things which the hearer does not have,
and which are thus not givens in the claim.


posts: 2388



> Jorge Llambías scripsit:
>
> > But the hypothetical speech interpreter could
> also use the official
> > orthography instead of using an extended one.
>
>
> I'm assuming (contrary to the actual facts
> about speech-interpretation
> programs) that it knows nothing of the
> morphology and only produces
> an orthographic representation of stress,
> pause, and the 24 phonemes of Lojban.
>
> > Any speech stream can be written in the
> official orthography, and
> > anything written in the official orthography
> can be unambiguously
> > interpreted.
>
> Yes.

Well, we hope so but it is not clear that
anything we have so far can do it.


Jorge Llambías scripsit:

> > I'm assuming (contrary to the actual facts about speech-interpretation
> > programs) that it knows nothing of the morphology and only produces
> > an orthographic representation of stress, pause, and the 24 phonemes of
> > Lojban.
>
> Well, that's already knowing a little bit. If it can tell which
> phonemes are vowels and diphthongs and it can count, then it can
> use the official orthography to write down what it hears.

It also has to know the rules about cmene. I was, again, thinking of
an entirely stupid system that generates nothing but a-z, AEIOU, ', and .

(Such stupid systems don't actually exist in the Real World; speech transcribers
have to know quite a lot about the morphology, and even about the syntax,
in order to achieve even tolerably low error rates.)

--
Income tax, if I may be pardoned for saying so, John Cowan
is a tax on income. --Lord Macnaghten (1901) jcowan@reutershealth.com


posts: 2388


wrote:

>
> --- John Cowan wrote:
> > I'm assuming (contrary to the actual facts
> about speech-interpretation
> > programs) that it knows nothing of the
> morphology and only produces
> > an orthographic representation of stress,
> pause, and the 24 phonemes of
> > Lojban.
>
> Well, that's already knowing a little bit. If
> it can tell which
> phonemes are vowels and diphthongs and it can
> count, then it can
> use the official orthography to write down what
> it hears.
>
This claim seems to be at the crux here. Can
anything we have now actually do this: go from
speech stream to official orthography always
correctly (including rejecting what is not right)?


posts: 2388



> Jorge Llambías scripsit:
>
> > > I'm assuming (contrary to the actual facts
> about speech-interpretation
> > > programs) that it knows nothing of the
> morphology and only produces
> > > an orthographic representation of stress,
> pause, and the 24 phonemes of
> > > Lojban.
> >
> > Well, that's already knowing a little bit. If
> it can tell which
> > phonemes are vowels and diphthongs and it can
> count, then it can
> > use the official orthography to write down
> what it hears.
>
> It also has to know the rules about cmene. I
> was, again, thinking of
> an entirely stupid system that generates
> nothing but a-z, AEIOU, ', and .
>
> (Such stupid systems don't actually exist in
> the Real World; speech transcribers
> have to know quite a lot about the morphology,
> and even about the syntax,
> in order to achieve even tolerably low error
> rates.)

Since the input here is to be to the morphology
and syntax processors, this seems to mean that an
existing speech interpreter could not generally
do the job here being asked of it. But let us
assume that we start a step along from that,
with a certified correct transcription in speech
stream form, noting only Lojban phonemes, pauses
and stresses. Given that input, can we always
get either a unique (and correct) segmentation or
a rejection (also correct)?


posts: 1912


The current morphology rules say this is a valid fu'ivla:

tstststikptkptsrzgbdbgu

Fortunately, nobody has thought of putting it in jbovlaste yet.

I think the current rules are too permissive for consonant
clusters. I would like to propose the following restriction:

"Normal" fu'ivla will not have any cluster worse than C/CC
(those are the worst that can appear in lujvo), i.e. no
four-consonant cluster allowed, and no three-consonant
cluster unless the second pair is a permissible initial.
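
As a rough illustration of the restriction just described, here is a minimal Python sketch. It assumes the 48 permissible initial pairs listed in CLL and applies the same test to every run of consonants in a word. The names cluster_ok and passes_proposed_rule are made up for this example, and the sketch ignores y, apostrophes and the type-three exception; it is not part of the PEG.

```python
import re

# The 48 permissible initial consonant pairs (as listed in CLL).
INITIAL_PAIRS = set("""
bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd jg jm jv
kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts vl vr xl xr
zb zd zg zm zv
""".split())

# A maximal run of Lojban consonants.
CONSONANT_RUN = re.compile(r"[bcdfgjklmnprstvxz]+")

def cluster_ok(cluster):
    """True if a consonant run is no worse than C/CC."""
    if len(cluster) >= 4:
        return False          # no four-consonant clusters
    if len(cluster) == 3:
        # the second pair must be a permissible initial
        return cluster[1:] in INITIAL_PAIRS
    return True               # one or two consonants: not checked here

def passes_proposed_rule(word):
    """Apply cluster_ok to every consonant run in the word."""
    return all(cluster_ok(c) for c in CONSONANT_RUN.findall(word.lower()))

print(passes_proposed_rule("tstststikptkptsrzgbdbgu"))  # False
print(passes_proposed_rule("tarksako"))                 # False: ks is not an initial pair
print(passes_proposed_rule("broda"))                    # True
```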

Type-three fu'ivla will be of the form:

CVCC-r-normal
CCVC-r-normal
CVC-r-normal

i.e. the crunchy part of type three fu'ivla will be the only
exception to the nothing worse than C/CC rule.

There is only one fu'ivla in jbovlaste that doesn't follow
this rule, namely "tarksako". If we wanted to allow such
things, we could include a few more pairs as "permissible
syllable initial", such as perhaps ks, ps, jn, zn and a few
others that were left out as permissible word initials.

Or do we just allow {tstststikptkptsrzgbdbgu} as a valid
fuhivla?

Opinions?

mi'e xorxes

posts: 14214

On Tue, Jan 04, 2005 at 09:49:35AM -0800, wikidiscuss@lojban.org wrote:
> Or do we just allow {tstststikptkptsrzgbdbgu} as a valid fuhivla?

YE GODS NO!

I'm a long-time Lojbanist. It's been Quite A While since any word
has stumped me, and I refuse to use buffer vowels. I cannot
pronounce that word. It's *BAD*.

-Robin


posts: 953

On Tue, 4 Jan 2005 wikidiscuss@lojban.org wrote:

> Or do we just allow {tstststikptkptsrzgbdbgu} as a valid
> fuhivla?

I think we should allow it.

Any language will have at least some tongue-twisters; Lojban is no
exception. The discretion taken by the coiners of fu'ivla should be enough
to save us from such unpronounceable words ever getting widespread.

In any case, I think it is unadvisable to make the already complex
morphology algorithm even more complex to protect us against this sort of
thing.

--
Arnt Richard Johansen http://arj.nvg.org/
Tusener på tusener av nydelige løpenoter - og hvilenoter!


posts: 14214

On Tue, Jan 04, 2005 at 10:30:15PM +0100, Arnt Richard Johansen wrote:
> On Tue, 4 Jan 2005, Robin Lee Powell wrote:
> X-ecartis-version: Ecartis v1.0.0
> Sender: wikidiscuss-bounce@lojban.org
> Errors-to: wikidiscuss-bounce@lojban.org
> X-original-sender: arj@nvg.org
> Precedence: bulk
> Reply-to: wikidiscuss-list@lojban.org
> X-list: wikidiscuss

I have no idea what the problem is there, but I've put some
debugging stuff in so hopefully I'll catch it next time.

> > On Tue, Jan 04, 2005 at 09:49:35AM -0800, wikidiscuss@lojban.org
> > wrote:
> >> Or do we just allow {tstststikptkptsrzgbdbgu} as a valid
> >> fuhivla?
> >
> > YE GODS NO!
> >
> > I'm a long-time Lojbanist. It's been Quite A While since any
> > word has stumped me, and I refuse to use buffer vowels. I
> > cannot pronounce that word. It's *BAD*.
>
> It would be interesting to see some restrictions that can rule out
> such words.

Umm, how about "You can't have four consonants in a row, dumb ass"?
Seems pretty fucking simple to me.

> And bear this in mind: you may think imposing additional
> restrictions is a good idea now. But a few years down the line,
> you may be trying to construct a new fu'ivla that contains some
> consonant cluster that has been made illegal. Then you'll hate it
> as much as you hate the "no la/lai/doi in cmene" rule now.

I don't want to outlaw particular clusters, I want to outlaw
clusters of particular *lengths*.
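
A purely length-based test of this kind is easy to state. The following Python sketch is only an illustration (the function name is invented here, and it simply looks for four Lojban consonants in a row, ignoring y and syllabic consonants):

```python
import re

# Four or more Lojban consonants in a row.
FOUR_IN_A_ROW = re.compile(r"[bcdfgjklmnprstvxz]{4}")

def has_long_cluster(word):
    """True if the word contains a run of four or more consonants."""
    return FOUR_IN_A_ROW.search(word.lower()) is not None

print(has_long_cluster("tstststikptkptsrzgbdbgu"))  # True
print(has_long_cluster("tarksako"))                 # False
print(has_long_cluster("damskrima"))                # True (mskr)
```

A test on length alone says nothing about which consonants make up the cluster, so it rejects {damskrima} and accepts {tarksako}.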

-Robin


On Tuesday 04 January 2005 12:49, wikidiscuss@lojban.org wrote:
> Re: PEG Morphology Algorithm
>
> The current morphology rules say this is a valid fu'ivla:
>
> tstststikptkptsrzgbdbgu
>
> Fortunately, nobody has thought of putting it in jbovlaste yet.

jbovlaste, AFAIK, is still using vlatai, which doesn't allow that word.

> I think the current rules are too permissive for consonant
> clusters. I would like to propose the following restriction:
>
> "Normal" fu'ivla will not have any cluster worse than C/CC
> (those are the worst that can appear in lujvo), i.e. no
> four-consonant cluster allowed, and no three-consonant
> cluster unless the second pair is a permissible initial.
>
> Type-three fu'ivla will be of the form:
>
> CVCC-r-normal
> CCVC-r-normal
> CVC-r-normal
>
> i.e. the crunchy part of type three fu'ivla will be the only
> exception to the nothing worse than C/CC rule.
>
> There is only one fu'ivla in jbovlaste that doesn't follow
> this rule, namely "tarksako". If we wanted to allow such
> things, we could include a few more pairs as "permissible
> syllable initial", such as perhaps ks, ps, jn, zn and a few
> others that were left out as permissible word initials.
>
> Or do we just allow {tstststikptkptsrzgbdbgu} as a valid
> fuhivla?

We should allow it. We had an argument about {damskrima}, a perfectly
pronounceable and valid (though ill-conceived; meant {dambrskrima}) fu'ivla
which vlatai rejects. The rule vlatai goes by is that no string of four
consonants can lack a vocalic consonant; it didn't consider 'r' to be a
vocalic consonant because there is a vowel next to it. It considers
{damskrtima} to be valid.

Words in other languages have all sorts of consonant clusters. We should not
disallow them in fu'ivla except those that are forbidden in all words (such
as "ntcb") and initially those that are forbidden initially in all brivla. We
might want the word {zajbrmtsfrtneli}, for instance. (I leave it to our
Kartuli expert to define that.)

phma
--
li fi'u vu'u fi'u fi'u du li pa


On Wednesday 05 January 2005 08:43, Jorge "Llambías" wrote:
> --- Arnt Richard Johansen wrote:
> > It would be interesting to see some restrictions that can rule out such
> > words.
>
> One restriction I would want on initial clusters, for example:
>
> Instead of allowing "d-t sibilant" I would allow just
> "d-t sibilant !consonant"
>
> That would mean {dz, dj, ts, tc} cannot be followed by a consonant
> in initial position. That rules out things like {tcna}, {tspe},
> {djmo}, {dzgo} which are all currently permissible initial clusters.

Any change in allowed initial clusters will change the lexing of words. For
instance, /ledZGOmfre/ is currently {le dzgomfre}, and {ledzgomfre} is not a
word. If "dzg" were banned from initial position, /ledZGOmfre/ would be
{ledzgomfre}.

phma
--
le xruki le ginxre xrixruba xu xrula cu xrani?


Pierre Abbat scripsit:

> Any change in allowed initial clusters will change the lexing of words. For
> instance, /ledZGOmfre/ is currently {le dzgomfre}, and {ledzgomfre} is not a
> word. If "dzg" were banned from initial position, /ledZGOmfre/ would be
> {ledzgomfre}.

For this and other reasons, I've reluctantly decided to oppose changes
to the fu'ivla rules. In the domain of syntax, I think it's fairly
useful to prohibit silliness if it's easy to do so, because syntax is
generative: ordinary people speaking the language make up their own
sentences all the time. Word-creation is not so common, and will
become increasingly less so as time goes on; therefore, I think we
can leave it to common sense and common usage to drop any difficult
fu'ivla that someone has the bad taste to introduce.

--
John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
Big as a house, much bigger than a house, it looked to Sam, a grey-clad
moving hill. Fear and wonder, maybe, enlarged him in the hobbit's eyes,
but the Mumak of Harad was indeed a beast of vast bulk, and the like of him
does not walk now in Middle-earth; his kin that live still in latter days are
but memories of his girth and his majesty. --"Of Herbs and Stewed Rabbit"


Re: PEG Morphology Algorithm

I have the rule:

CVV-rafsi



Re: PEG Morphology Algorithm

I'm allowing a "y" after any CVC-rafsi no matter what consonant
follows. So I allow {selyma'o} as well as {selma'o}.

I'm not sure if there was a rule against this, but the restriction
is not required for unambiguity, and implementing it would
complicate the rules enormously, so I'm not doing it.

mu'o mi'e xorxes




Re: PEG Morphology Algorithm

I'm implementing stress marking as follows:

1- Commas are ignored always. So for example BRAli,e is identical to BRAlie and is a valid fuhivla.

2- Case of all consonants is ignored. BRoDa = broda

3- Case is ignored in both cmene and cmavo, because stress is irrelevant for them. {PApiPEtis} is a valid cmene and {la'E'Au} is a valid cmavo form.

4- iV, uV, ai, au, ei, oi are the only vowel pairs allowed. Other sequences give "non-lojban-word". Strings like aiaueiaii are allowed as long as every adjacent pair in them is allowed.

5- Vowel strings are broken in pairs from the left for purposes of counting syllables: ai-au-ei-ai-i has five syllables.

6- Stress on a diphthong is shown by capitalizing the first vowel in ai, au, ei, oi, and the second vowel in iV, uV. The other member of the diphthong is treated as a consonant, i.e. its case is ignored. {Ia} is considered an unstressed syllable, just like {Ba}. {iA} is stressed, like {bA}.

7- Words with wrong stress patterns such as {broDA} or {brIvlA} produce "non-lojban-word".
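
Rules 4 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: V is taken to range over a, e, i, o, u (y is left out), the input is a pure vowel string with no apostrophes or commas, and case (stress) is ignored. The function names are invented for the example.

```python
VOWELS = "aeiou"
FALLING = {"ai", "au", "ei", "oi"}

def pair_allowed(pair):
    """Rule 4: iV, uV, ai, au, ei, oi are the only pairs allowed."""
    return (pair[0] in "iu" and pair[1] in VOWELS) or pair in FALLING

def vowel_string_ok(s):
    """True if every adjacent pair in the vowel string is allowed."""
    s = s.lower()
    if any(ch not in VOWELS for ch in s):
        return False
    return all(pair_allowed(s[i:i + 2]) for i in range(len(s) - 1))

def split_syllables(s):
    """Rule 5: break the vowel string in pairs from the left."""
    return [s[i:i + 2] for i in range(0, len(s), 2)]

print(vowel_string_ok("aiaueiaii"))   # True
print(vowel_string_ok("ae"))          # False
print(split_syllables("aiaueiaii"))   # ['ai', 'au', 'ei', 'ai', 'i']
```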

Comments?

mu'o mi'e xorxes




Re: PEG Morphology Algorithm

I have now added handling of rafsi fuhivla, so except for some minor adjustments I will probably have to make, the morphology PEG is basically ready. Does anyone want to test it?

mu'o mi'e xorxes




Re: PEG Morphology Algorithm
Bug number 2:

Morphology pass: text=( CMAVO=( SU=( s=( s ) u=( u ) ) ) nonMorphLojbanMorphWord=( 'i ) )

-Robin



Re: PEG Morphology Algorithm
This:

cmavo



Re: PEG Morphology Algorithm

I changed cmene-syllaboid to:

cmene-syllaboid



Re: PEG Morphology Algorithm

Every fu'ivla that starts with a consonant can be used
as the final rafsi of a lujvo.

Given that {i} is permissible after any vowel, and that {iy}
is a valid vowel pair, we could give every fuhivla that starts
with a consonant a medial rafsi if we use -iy- as the hyphen.
(This could be in addition to the privileged fuhivla that
have shorter rafsi, so {tci'ile} for example would have both
tci'ily- and tci'ileiy- as rafsi, just as {valsi} has valsy- and
val-.) This would be easy to implement.

fuhivla that start with a vowel still need to start with a pause,
so they can form lujvo non-initially only with zei.

mu'o mi'e xorxes




Re: PEG Morphology Algorithm — design
Jorge, this is just marvelous work — I'm in awe. (I'm also envious of the amount of free time you appear to have. :-) However, I have a concern about the overall approach you're taking — the high-level design, as it were.

The grammar in its current state does four separable things:
1. It partitions the input stream into words.
2. It validates the words, rejecting invalid vowel and consonant patterns.
3. It determines the selma'o of a cmavo.
4. It categorizes brivla into gismu, lujvo and fu'ivla.

As a result, the grammar is fearsomely complex in spots. (OK, the part that recognizes selma'o isn't complex; it's just huge.) And it could be argued that categorizing brivla really belongs to semantic analysis, not parsing.

For the sake of modularity and reducing point-complexity, I think it would be worth considering splitting the job into its components, and writing separate grammars:
1. A partitioning grammar that considers an input string, and accepts a word (cmene, brivla, cmavo or non-Lojban) from its head.
2. A validating grammar that considers a Lojban word, and rejects it (re-categorizing it as non-Lojban?) if it has invalid vowel or consonant patterns.
3. Selma'o determination might be more easily described as a symbol table lookup than as a parsing problem.
4. A grammar that considers a valid Lojban brivla, and categorizes it.
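
To make item 3 concrete, here is a minimal Python sketch of selma'o determination as a table lookup. The table holds only the few entries spelled out earlier in this thread (A and ZOhU); a real table would cover every selma'o. The function name, the "CMAVO" fallback and the normalization (ignoring case and commas, accepting h for the apostrophe) are assumptions made for the illustration.

```python
# Tiny illustrative table; a complete one would list every cmavo.
SELMAHO = {
    "a": "A", "e": "A", "o": "A", "u": "A", "ji": "A",
    "zo'u": "ZOhU",
}

def selmaho_of(cmavo):
    """Look up the selma'o of a cmavo, falling back to a catch-all class."""
    # Case and commas are ignored; 'h' is accepted as a spelling of the apostrophe.
    key = cmavo.lower().replace(",", "").replace("h", "'")
    return SELMAHO.get(key, "CMAVO")

print(selmaho_of("ji"))    # A
print(selmaho_of("zohu"))  # ZOhU
```

The lookup only makes sense once the partitioning step has already decided that the string is a cmavo.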

Of course this scheme depends on being able to combine multiple PEG-generated parsers into a single program. But if the parser generator takes parameters which can be used to name the input and parser functions, that shouldn't be hard.

Or is there already a consensus that the requirement is for a single grand grammar covering every relevant aspect of the language?

Clark Nelson



posts: 953

On Tue, 21 Dec 2004 wikidiscuss@lojban.org wrote:

> Re: PEG Morphology Algorithm — design
> The grammar in its current state does four separable things:

> 1. It partitions the input stream into words.
...
> 4. It categorizes brivla into gismu, lujvo and fu'ivla.

I believe that it is possible that these two tasks are not
separable. In any case, the current approach of the morphology part
of the grammar does it in a way consistent with the traditional (not fully
operationalized) method of determining which words are of what kind.

Basically, a fu'ivla is any word that fits the definition of a
brivla (consonant cluster in first five letters, not counting y or
'), but is not either a gismu or a lujvo. So a fu'ivla is a very
open-ended set of words. When cmavo are preceding a fu'ivla, there
are some potential ambiguities that we have to handle. This is done
via the so-called "slinku'i test", which is explained at:

http://www.lojban.org/tiki/tiki-index.php?page=slinku%27i

In order to do the slinku'i test, we have to know what a lujvo is
like. To know what a lujvo is like, we have to know what a rafsi is
like. Final rafsi can be gismu, so we have to match against that,
too. So, only to separate words consistently in the face of fu'ivla,
we have to implement all of these concepts. So I believe further
modularization is not possible.
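
The brivla criterion quoted above (a consonant cluster within the first five letters, not counting y or ') is the simplest link in that chain. A minimal Python sketch of just that one condition, with an invented function name, might look like this; it does not check any of the other requirements on brivla, and it does not attempt the slinku'i test.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")

def has_early_cluster(word):
    """True if two adjacent consonants occur within the first five
    letters, not counting y, apostrophes or commas."""
    letters = [ch for ch in word.lower() if ch not in "y',"]
    head = letters[:5]
    return any(a in CONSONANTS and b in CONSONANTS
               for a, b in zip(head, head[1:]))

print(has_early_cluster("broda"))      # True
print(has_early_cluster("tosymabru"))  # True (s and m are adjacent once y is dropped)
print(has_early_cluster("ma'o"))       # False
```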

--
Arnt Richard Johansen http://arj.nvg.org/
«Når jeg kommer til kloakken, er det for å rense opp - når Zola besøker det
samme sted, er det for å bade!» --Henrik Ibsen


Re: PEG Morphology Algorithm

Humanly readable algorithm for identifying fu'ivla.

A "syllable" is any permissible consonant cluster, or an apostrophe, or nothing, followed by a diphthong or by a single vowel.

Given a string of characters:

1. Check that it does not start with a cmene, a gismu or a lujvo.

2. Check whether it starts with a fu'ivla-head. A fu'ivla-head is something that looks like a cmavo without any y's. If there is no fu'ivla-head, go straight to 3.

A. If the fu'ivla-head is not followed by a consonant cluster, there is no fu'ivla (the head will fall off as a cmavo).

B. If the fu'ivla-head is followed by a non-initial cluster and one or more syllables, we have a fuhivla. If one of the syllables is stressed, the fu'ivla ends with the next syllable, otherwise it ends after the final syllable.

C. If the fu'ivla-head is followed by a permissible cluster, it may fall off. There is one case where it is saved: if only a single syllable follows the cluster, or if the head has a final stress so that it will accept only one more syllable. In those cases we have a fu'ivla.

3. If there is no fu'ivla head, that means we have a cluster. If it is not an initial-cluster, we don't have a valid word. If it is an initial cluster, it has to be followed by at least two syllables, and you need to check that adding {le} in front (or any other CV cmavo) does not convert it into a lujvo. If that doesn't happen, we have a fu'ivla.

In summary, we have just three types of possible fu'ivla:

1- Head fu'ivla with non-initial cluster plus tail.
2- Head fu'ivla with initial-cluster plus a single syllable ("short-tail")
3- Headless fu'ivla that pass the slinku'i test

mu'o mi'e xorxes




Re: PEG Morphology Algorithm

Elidable terminators are sometimes required for disambiguation, but they are always allowed, even when not required.

Pauses between words are sometimes required for disambiguation, but they are always allowed, even when not required.

Marking stress with caps is sometimes required for disambiguation, but it is always allowed, even when not required.

Using long rafsi in lujvo instead of the short ones is sometimes required due to morphology constraints, but it is always allowed, even when not required.

There seems to be a pattern there. What about hyphens?

y- and r-hyphens after CVC and CVV rafsi are sometimes required due to morphology constraints, but they are always allowed, even when not required... NOT! When not required they are not allowed!

This seems to go against the Lojban way of doing things, and it is also a burden for the user. You get used to a lujvo like {tosymabru}, and when you try to form a new lujvo by adding another rafsi, say {naltosymabru}, it turns out it is not valid: it has to be {naltosmabru}. You get used to {ro'inre'o}, and when you form {braro'inre'o} it turns out that's not a lujvo either.

In the PEG grammar, I allowed -y- after any CVC, not only when it is required, mainly because that was easier than disallowing it when it wasn't necessary. It doesn't seem to be worth complicating the grammar for such an unnecessary and bothersome restriction.

I am now allowing the r-hyphen after any non-final CVV as well, because that is more user-friendly. In this case there is a small cost: we take some forms that would otherwise be fu'ivla and put them in lujvo-space, but lujvo have always had priority over fu'ivla so that's not a big deal (and it is not a noticeable chunk of fu'ivla-space anyway).

These are also fairly common mistakes people make when creating lujvo, so this move is actually supported by usage.

mu'o mi'e xorxes










posts: 953

On Tue, 4 Jan 2005, Robin Lee Powell wrote:

> On Tue, Jan 04, 2005 at 09:49:35AM -0800, wikidiscuss@lojban.org wrote:
>> Or do we just allow {tstststikptkptsrzgbdbgu} as a valid fuhivla?
>
> YE GODS NO!
>
> I'm a long-time Lojbanist. It's been Quite A While since any word
> has stumped me, and I refuse to use buffer vowels. I cannot
> pronounce that word. It's *BAD*.

It would be interesting to see some restrictions that can rule out such
words.

And bear this in mind: you may think imposing additional restrictions is a
good idea now. But a few years down the line, you may be trying to
construct a new fu'ivla that contains some consonant cluster that has been
made illegal. Then you'll hate it as much as you hate the "no la/lai/doi
in cmene" rule now.

--
Arnt Richard Johansen http://arj.nvg.org/
Inuktitut iis eesseentiiaallyy Fiinniish aas spooqqeen iin Greenlaand.
--Clint Jackson Baker, via Essentialist Explanations


posts: 1912

> I don't want to outlaw particular clusters, I want to outlaw
> clusters of particular *lengths*.

I think the composition of the clusters can be more
important than the length in judging how much they
differ from "normal" lojban words.

If CCC is a valid initial triple, and C-C is an impermissible
initial pair, then in general I find C/CCC blends much better
with ordinary words than C/C-C. For example "kspl" or "mskr"
are much better than "tkp" or "vgj".

mu'o mi'e xorxes






posts: 1912

> The discretion taken by the coiners of fu'ivla should be enough
> to save us from such unpronounceable words ever getting widespread.

Everything that the official morphology accepts is by definition
pronounceable in Lojban. {tststsipkpkpkpku} is humanly pronounceable,
it just doesn't sit well with ordinary Lojban words. I think we
should define more clearly what kind of words we don't want
getting widespread.

> In any case, I think it is unadvisable to make the already complex
> morphology algorithm even more complex to protect us against this sort of
> thing.

It need not be a complex rule. Compare the current rule with
the one I proposed:

cluster_current <- consonant consonant+

cluster_proposed <- consonant initial-consonant? consonant

A more permissive possibility might be:

cluster_proposed2 <- consonant initial-consonant* consonant

mu'o mi'e xorxes








posts: 1912

> We had an argument about {damskrima}, a perfectly
> pronounceable and valid (though ill-conceived; meant {dambrskrima}) fu'ivla
> which vlatai rejects.

Every cluster is pronounceable, it's just that some clusters
are more normal-lojban-like than others. {mskr} is not particularly
bad. In general C/CCC clusters are quite nice, as long as the
CCC part is a nice initial cluster, like skr in this case.

> Words in other languages have all sorts of consonant clusters. We should not
> disallow them in fu'ivla except those that are forbidden in all words (such
> as "ntcb") and initially those that are forbidden initially in all brivla.

Of course. The rules I'm thinking of would apply to all brivla too.

> We
> might want the word {zajbrmtsfrtneli}, for instance. (I leave it to our
> Kartuli expert to define that.)

I wouldn't want it. It doesn't look at all like a Lojban word,
even though the current rules allow it.

mu'o mi'e xorxes






posts: 1912

> But let us
> assume that we start a step along from that,
> with a certified correct transcription in speech
> stream form, noting only Lojban phonemes, pauses
> and stresses. Given that input, can we always
> get either a unique (and correct) segmentation or
> a rejection (also correct)?

Yes.

The differences of opinion between camxes and
valfendi are only about salvaging rejected input.
We could just mark it as invalid, but we try to make
something valid out of it and each uses different
rules for that.

mu'o mi'e xorxes






posts: 2388

wrote:

> --- John E Clifford wrote:
> > But let us
> > assume that we start a step along from that,
> > with a certified correct transcription in
> speech
> > stream form, noting only Lojban phonemes,
> pauses
> > and stresses. Given that input, can we
> always
> > get either a unique (and correct)
> segmentation or
> > a rejection (also correct)?
>
> Yes.
>
> The differences of opinion between camxes and
> valfendi are only about salvaging rejected
> input.
> We could just mark it as invalid, but we try to
> make
> something valid out of it and each uses
> different
> rules for that.

If true, it is unfortunate that this point has
not been made clear and the discussion has
focused on the currently useless stuff that
effectually comes after the "invalid" result. To
be sure, this might ultimately be useful for
dealing with the real world in which mistakes are
made both in production and reception, but I
gather that that project is not really in the
offing.

Has the claim been demonstrated to hold yet? I
don't know about PEG (it — at least under that
name — is after my time of keeping up with
details) so I cannot quite see that it will
always work (I suspect I also have lost some
track of what is permissible nowadays in various
categories; I screwed up on fuhivla recently I
see.)

Laying out the proof would be a useful thing to
do — and not merely for local consumption but
for the linguistic community generally, a first
step toward convincing that community that
structural unambiguity is a real property of
Lojban (as opposed to the frequent opinion that
it is just a product of a peculiar kind of
grammar unrelated to the language as used and
that it requires hand massaging the data to
work). I'm not sure how a proof would go unless
it is just running through all the possible
sequences (in cvV'. format or so) and showing
that each resolves in only one way by the rules.
It is still not obvious that this will work.
(Looking toward the error correction goal it
would be good if this could be done allowing at
least unrequired pauses, the first level of
deviation from perfect articulation — though
classifying pauses as necessary or accidental
does seem to be fairly direct and rule-governed).


posts: 1912

> Laying out the proof would be a useful thing to
> do —

PEG grammars are by construction unambiguous, so
if the PEG grammar is declared the official one,
no additional proof of unambiguity is needed.
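
The reason is PEG's prioritized choice: alternatives are tried in order and the first one that succeeds is taken, so a given rule applied at a given position yields at most one result. The toy Python combinators below are only a sketch of that idea (the names lit, seq and choice are invented here and have nothing to do with the actual morphology PEG):

```python
def lit(text):
    """Match a literal string; return the new position or None."""
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse

def seq(*parsers):
    """Match each parser in turn; fail if any one fails."""
    def parse(s, pos):
        for p in parsers:
            pos = p(s, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """Prioritized choice: the first alternative that succeeds wins."""
    def parse(s, pos):
        for p in parsers:
            end = p(s, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse

print(choice(lit("le"), lit("lei"))("lei", 0))  # 2: "le" is tried first and wins
print(choice(lit("lei"), lit("le"))("lei", 0))  # 3: the order of alternatives matters
```

The flip side is that the order of alternatives is part of the language definition, which is why the grammar's behaviour still has to be checked against the informal descriptions.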

What we need to do is convince ourselves that the
PEG grammar does what CLL and other non-formal
descriptions of the grammar say the grammar should do.

The PEG as it stands still has a few bugs, so
it's not ready yet (also, there are some decisions
to be made about which vowel combinations should be
allowed where).

> (Looking toward the error correction goal it
> would be good if this could be done allowing at
> least unrequired pauses, the first level of
> deviation from perfect articulation — though
> classifying pauses as necessary or accidental
> does seem to be fairly direct and rule-governed).

Unrequired but allowed pauses are no problem at all.

You can never pause in the middle of any word; if you
do, you end up with two different words.

The issue with pauses is: should we allow NOT to
pause between certain words under some circumstances, even
though the grammar requires that you do? (For example,
after a stressed cmavo followed by a brivla, or after BY)
Allowing this in some contexts would not cause problems,
but you can certainly not allow it in all contexts.

mu'o mi'e xorxes






posts: 2388

wrote:

> --- John E Clifford wrote:
> > Laying out the proof would be a useful thing
> to
> > do —
>
> PEG grammars are by construction unambiguous,
> so
> if the PEG grammar is declared the official
> one,
> no additional proof of unambiguity is needed.
>
> What we need to do is convince ourselves that
> the
> PEG grammar does what CLL and other non-formal
> descriptions of the grammar say the grammar
> should do.

Right; uniqueness is trivial, assuming only that
the parser has exactly one possible output for
each situation and I gather that (unless two
characterizations of the same situation get put
into the rule and I suppose there is a test to
prevent that) being a PEG grammar guarantees that.
It is correctness that is the issue. So the test
still remains of interest even if there is a PEG
parser.

> The PEG as it stands still has a few bugs, so
> it's not ready yet (also, there are some
> decisions
> to be made about which vowel combinations
> should be
> allowed where).

So, as noted, the claim is not yet proven nor
provable with the devices at hand — even if all
decisions had been made?

> > (Looking toward the error correction goal it
> > would be good if this could be done allowing
> at
> > least unrequired pauses, the first level of
> > deviation from perfect articulation — though
> > classifying pauses as necessary or accidental
> > does seem to be fairly direct and
> rule-governed).
>
> Unrequired but allowed pauses are no problem at
> all.
>
> You can never pause in the middle of any word,
> if you
> do, you end up with two different words.

But middle of word pauses are just the second
most likely to occur. To be sure, they would
most often occur between morphological
components (rafsi borders, mainly) but they are
inside words and the intermorpheme breaks do
occur. This is technical error correction and
need not give a unique result, only a "might have
meant this or might have meant that." I think most of
these breaks can be dealt with pretty
convincingly (without offering choices even
maybe), assuming the results are in for the pure
cases.


> The issue with pauses is: should we allow NOT
> to
> pause between certain words under some
> circumstances, even
> though the grammar requires that you do? (For
> example,
> after a stressed cmavo followed by a brivla, or
> after BY)
> Allowing this in some contexts would not cause
> problems,
> but you can certainly not allow it in all
> contexts.

Yes, since at least some of the segmentation
rules rely on a pause, that can be a problem --
one of the reasons for wanting to do away with
required pauses somehow (the more I look at /uy/
the messier it looks except for this
convenience). There is less of a problem if you
get a rejection flat out, for then your error
correction only needs to present a (hopefully
small) list of alternatives. The greater problem
is when dropping a pause gives a justified parse
which is just wrong from the point of view of the
original intention. The need for error
correction is not even triggered in that case or
is so only long after the fact.


posts: 1912

> Jorge Llambías scripsit:
> > If it can tell which
> > phonemes are vowels and diphthongs and it can count, then it can
> > use the official orthography to write down what it hears.
>
> It also has to know the rules about cmene.

Not really. It can use "." for an actual pause, and use
space only for calculated end of words. Then the cmene
rule would just absorb those dotless spaces. (That would
deviate a bit from the official orthography.)

> I was, again, thinking of
> an entirely stupid system that generates nothing but a-z, AEIOU, ', and .

If it can do the much more difficult job of detecting phonemes,
doing the official orthography would be trivial.

> (Such stupid systems don't actually exist in the Real World; speech
> transcribers
> have to know quite a lot about the morphology, and even about the syntax,
> in order to achieve even tolerably low error rates.)

Right. And handling capital letters is of course not all that
complicated anyway, both valfendi and camxes do it, so this
discussion is a bit silly, but the morphology rules would
look a bit simpler if they didn't have to deal with that.

mu'o mi'e xorxes








posts: 1912

> It would be interesting to see some restrictions that can rule out such
> words.

One restriction I would want on initial clusters, for example:

Instead of allowing "d-t sibilant" I would allow just
"d-t sibilant !consonant"

That would mean {dz, dj, ts, tc} cannot be followed by a consonant
in initial position. That rules out things like {tcna}, {tspe},
{djmo}, {dzgo} which are all currently permissible initial clusters.

> And bear this in mind: you may think imposing additional restrictions is a
> good idea now. But a few years down the line, you may be trying to
> construct a new fu'ivla that contain some consonant cluster that has been
> made illegal. Then you'll hate it as much as you hate the "no la/lai/doi
> in cmene" rule now.

The problem with the no la/lai/la'i/doi rule is mainly "la", which
is a very frequent syllable in many languages, so this rule is
truly a pain. If the cmene gadri had been say {la'a} instead of {la},
probably nobody would care much about the rule.

mu'o mi'e xorxes






posts: 1912

> Any change in allowed initial clusters will change the lexing of words. For
> instance, /ledZGOmfre/ is currently {le dzgomfre}, and {ledzgomfre} is not a
> word. If "dzg" were banned from initial position, /ledZGOmfre/ would be
> {ledzgomfre}.

Yes, of course.

And many strings that are now lexed as words would be lexed as
non-Lojban-words.

mu'o mi'e xorxes







posts: 1912

The rules for allowed initial pairs are:

1) !n !r !l consonant (r / l) (except tl, dl)
2) sibilant consonant (except sx, zn, jn)
3) (d / t) sibilant

When we consider indefinitely long initial clusters,
rule (1) is unproblematic, because once we use it
the cluster must stop, since no initial cluster
begins with r or l.

But rules (2) and (3) can enter into resonance with
catastrophic results. Was it really the intention
to have things like {tststststsa}, {djdjdjdjdja},
{djdzdjdzdjdzu} as valid initial clusters?

It would be very simple to cut this short by changing
rule 3 to:

3') (d / t) sibilant !consonant

i.e. dj, dz, tc and ts could not be further extended
initially. {stsi'i} would still be allowed, but that
would be the limit: {ststu'u} would not be allowed.
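
A rough Python sketch of the effect of rule 3'. It assumes that a longer initial cluster is currently accepted whenever every adjacent pair in it is a permissible initial pair, and it uses the list of 48 pairs from CLL chapter 3 rather than rules 1 to 3 as written above. The function name and the restrict_affricates flag are invented for the example.

```python
# The 48 permissible initial pairs, as listed in CLL chapter 3.
INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

AFFRICATES = {"dj", "dz", "tc", "ts"}   # the "(d / t) sibilant" pairs

def is_initial_cluster(cluster, restrict_affricates=False):
    """True if every adjacent pair in the cluster is a permissible initial.

    With restrict_affricates=True, rule 3' applies: dj, dz, tc and ts
    may only appear as the last pair of the cluster.
    """
    if len(cluster) < 2:
        return False
    pairs = [cluster[i:i + 2] for i in range(len(cluster) - 1)]
    if not all(p in INITIAL_PAIRS for p in pairs):
        return False
    if restrict_affricates:
        return not any(p in AFFRICATES for p in pairs[:-1])
    return True

print(is_initial_cluster("tststststs"))         # True under the pairwise reading
print(is_initial_cluster("tststststs", True))   # False with rule 3'
print(is_initial_cluster("sts", True))          # True: {stsi'i} survives
print(is_initial_cluster("stst", True))         # False: {ststu'u} does not
```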

mu'o mi'e xorxes






posts: 14214

On Tue, Jan 04, 2005 at 09:49:35AM -0800, wikidiscuss@lojban.org
wrote:
> Re: PEG Morphology Algorithm
>
> The current morphology rules say this is a valid fu'ivla:
>
> tstststikptkptsrzgbdbgu

As it turns out, this simply is Not True.

From the CLL:

In Lojban, doubled consonants are excluded altogether, and
clusters are limited to two or three members, except in
Lojbanized names.

This is in
http://lojban.org/publications/reference_grammar/chapter3.html

Further:

Lojbanized names can begin or end with any permissible
consonant pair, not just the 48 initial consonant pairs listed
above, and can have consonant triples in any location, as long
as the pairs making up those triples are permissible. In
addition, names can contain consonant clusters with more than
three consonants, again requiring that each pair within the
cluster is valid.

There seems to be no confusion on this point.

<mostly joking>
doi la pier .e la xorxes Y'all did *read* Chapter 3 before writing
morphology algorithms, right?
</mostly joking>

-Robin


posts: 1912


> > The current morphology rules say this is a valid fu'ivla:
> >
> > tstststikptkptsrzgbdbgu
>
> As it turns out, this simply is Not True.
>
> From the CLL:
>
> In Lojban, doubled consonants are excluded altogether, and
> clusters are limited to two or three members, except in
> Lojbanized names.

That should read "except in Lojbanized names and fu'ivla",
unless you want to exclude things like {cidjrspageti} too.

> This is in
> http://lojban.org/publications/reference_grammar/chapter3.html
>
> Further:
>
> Lojbanized names can begin or end with any permissible
> consonant pair, not just the 48 initial consonant pairs listed
> above, and can have consonant triples in any location, as long
> as the pairs making up those triples are permissible. In
> addition, names can contain consonant clusters with more than
> three consonants, again requiring that each pair within the
> cluster is valid.
>
> There seems to be no confusion on this point.

{ntc}, {nts}, {ndj}, {ndz} are also excluded from names, although
that paragraph is not quite clear about it.

> <mostly joking>
> doi la pier .e la xorxes Y'all did *read* Chapter 3 before writing
> morphology algorithms, right?
> </mostly joking>

Yes, but reading it is not enough. You also have to interpret it :-)

mu'o mi'e xorxes







posts: 14214

On Thu, Feb 03, 2005 at 01:50:07PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
> > > The current morphology rules say this is a valid fu'ivla:
> > >
> > > tstststikptkptsrzgbdbgu
> >
> > As it turns out, this simply is Not True.
> >
> > From the CLL:
> >
> > In Lojban, doubled consonants are excluded altogether, and
> > clusters are limited to two or three members, except in
> > Lojbanized names.
>
> That should read "except in Lojbanized names and fu'ivla", unless
> you want to exclude things like {cidjrspageti} too.

Well, *yes*. That would be the point. The language with
{cidjrspageti} is not Lojban.

-Robin


posts: 1912


> > That should read "except in Lojbanized names and fu'ivla", unless
> > you want to exclude things like {cidjrspageti} too.
>
> Well, *yes*. That would be the point. The language with
> {cidjrspageti} is not Lojban.

Well, maybe not the Lojban of Chapter 3, but in Chapter 4
there are plenty of fu'ivla with four-consonant clusters.
The whole concept of type 3 fu'ivla breaks down if you
disallow them. (Not that I would miss it.)

mu'o mi'e xorxes







posts: 14214

On Thu, Feb 03, 2005 at 02:14:10PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
> > > That should read "except in Lojbanized names and fu'ivla",
> > > unless you want to exclude things like {cidjrspageti} too.
> >
> > Well, *yes*. That would be the point. The language with
> > {cidjrspageti} is not Lojban.
>
> Well, maybe not the Lojban of Chapter 3, but in Chapter 4 there
> are plenty of fu'ivla with four-consonant clusters. The whole
> concept of type 3 fu'ivla breaks down if you disallow them. (Not
> that I would miss it.)

Nor would I, but I admit that it seems an unlikely change to pass.

-Robin


posts: 14214

On Thu, Feb 03, 2005 at 02:14:10PM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
> > > That should read "except in Lojbanized names and fu'ivla",
> > > unless you want to exclude things like {cidjrspageti} too.
> >
> > Well, *yes*. That would be the point. The language with
> > {cidjrspageti} is not Lojban.
>
> Well, maybe not the Lojban of Chapter 3, but in Chapter 4 there
> are plenty of fu'ivla with four-consonant clusters. The whole
> concept of type 3 fu'ivla breaks down if you disallow them. (Not
> that I would miss it.)

I have a proposal to fix this.

I doubt you'll like it.

1. Consonant clusters of >3 are only allowed in names.

2. The r/n/l buffer in stage 3 fu'ivla is not considered a
consonant for purposes of consonant cluster validity.
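
To see what the two points do together, here is a small Python sketch that measures the longest consonant run in a word, with and without counting r/n/l. It assumes the "does not count" rule applies to r/n/l in any position, which the proposal itself leaves open; longest_cluster and its not_counted parameter are names invented for the example.

```python
import re

VOWELS = "aeiouy"

def longest_cluster(word, not_counted=""):
    """Length of the longest consonant run in word.

    Letters in not_counted are treated as cluster breaks, like vowels.
    """
    breaks = "[" + VOWELS + "'," + not_counted + "]+"
    return max(len(run) for run in re.split(breaks, word))

print(longest_cluster("cidjrspageti"))          # 5 (djrsp)
print(longest_cluster("cidjrspageti", "rnl"))   # 2 (dj, sp)
```

Under that reading {cidjrspageti} stays within a three-consonant limit even though the written cluster has five letters.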

-Robin


posts: 1912


>
> I have a proposal to fix this.
>
> I doubt you'll like it.
>
> 1. Consonant clusters of >3 are only allowed in names.

I would like the basic morphological restrictions for cmene
to be exactly the same as for other words. Their only special
characteristic would be that they end in a consonant. (I know
that doesn't have much chance, but that's what I would like.)

Consonant clusters like C/CCC where CCC is a permissible
initial are much better than clusters like C/C-C, where
C-C is not a permissible initial (C/C may or may not
be a permissible initial). This is because a syllable can
begin with a permissible initial, and the rest of the cluster
will have to be supported by the preceding syllable. One
consonant can be supported at the end of a syllable, two
consonants takes more effort.

> 2. The r/n/l buffer in stage 3 fu'ivla is not considered a
> consonant for purposes of consonant cluster validity.

Any consonant cluster restriction scheme will have to do
something like that. The question is, would that apply to
type three forms only, or to r/n/l in any position?

mu'o mi'e xorxes








posts: 14214

On Fri, Feb 11, 2005 at 12:15:50PM -0800, Jorge Llambías wrote:
> > 2. The r/n/l buffer in stage 3 fu'ivla is not considered a
> > consonant for purposes of consonant cluster validity.
>
> Any consonant cluster restriction scheme will have to do something
> like that. The question is, would that apply to type three forms
> only, or to r/n/l in any position?

I care not at all either way, assuming that r/n (l is only a hyphen
in type-3 fu'ivla, according to CLL) can *always* be syllabic in
Lojban, which is my recollection. In fact, I'd be fine with
"consonant clusters work like this, but syllabic consonants don't
count".

-Robin


posts: 953

On Fri, 11 Feb 2005, Robin Lee Powell wrote:

> I have a proposal to fix this.
>
> I doubt you'll like it.
>
> 1. Consonant clusters of >3 are only allowed in names.
>
> 2. The r/n/l buffer in stage 3 fu'ivla is not considered a
> consonant for purposes of consonant cluster validity.

Actually, that doesn't sound too bad.

--
Arnt Richard Johansen http://arj.nvg.org/
Let's have some real examples from a real, non-English language.


posts: 14214

On Fri, Feb 11, 2005 at 09:34:45PM +0100, Arnt Richard Johansen wrote:
> On Fri, 11 Feb 2005, Robin Lee Powell wrote:
>
> >I have a proposal to fix this.
> >
> >I doubt you'll like it.
> >
> >1. Consonant clusters of >3 are only allowed in names.
> >
> >2. The r/n/l buffer in stage 3 fu'ivla is not considered a
> >consonant for purposes of consonant cluster validity.
>
> Actually, that doesn't sound too bad.

The "you" was directed at xorxes. I was pleasantly surprised.

-Robin


On Friday 11 February 2005 15:15, Jorge "Llambías" wrote:
> I would like the basic morphological restrictions for cmene
> to be exactly the same as for other words. Their only special
> characteristic would be that they end in a consonant. (I know
> that doesn't have much chance, but that's what I would like.)

So {mkyveix} would be invalid because "mk" isn't a valid initial?

> Consonant clusters like C/CCC where CCC is a permissible
> initial are much better than clusters like C/C-C, where
> C-C is not a permissible initial (C/C may or may not
> be a permissible initial). This is because a syllable can
> begin with a permissible initial, and the rest of the cluster
> will have to be supported by the preceding syllable. One
> consonant can be supported at the end of a syllable, two
> consonants takes more effort.
>
> > 2. The r/n/l buffer in stage 3 fu'ivla is not considered a
> > consonant for purposes of consonant cluster validity.
>
> Any consonant cluster restriction scheme will have to do
> something like that. The question is, would that apply to
> type three forms only, or to r/n/l in any position?

If we're going to do this, it should apply to r/n/l in any position. Looking
at fu'ivla in jbovlaste, I found one which would be invalid if it didn't:
{zermbeto} (a plant in the ginger family). A word I've proposed, but not used
or defined, is {zajbrmtsfrtneli} (the Georgian word "mtsvrtneli" is glossed
as "trainer", but I'm not sure what meaning is intended, so I'm not sure of
the right gismu). vlatai considers 'm' to be syllabic as well; if this were
done, it would still be invalid because of "tsf". I picked this as a
worst-case actual word in a foreign language that we might like to make a
fu'ivla of.

phma
--
We light this candle to the tin between new and voice.


posts: 1912


> On Friday 11 February 2005 15:15, Jorge "Llambías" wrote:
> > I would like the basic morphological restrictions for cmene
> > to be exactly the same as for other words. Their only special
> > characteristic would be that they end in a consonant. (I know
> > that doesn't have much chance, but that's what I would like.)
>
> So {mkyveix} would be invalid because "mk" isn't a valid initial?

That's how I would have it, yes. If {mkyveix} is pronounceable in
normal Lojban speech, then it is hard to understand why we don't
allow {mkivexi} as a fu'ivla, for example, or why we don't allow
{mkive} as a gismu. Cmene are lojbanized words and therefore
should have the same general morphological restrictions as other
words.

....
> > Any consonant cluster restriction scheme will have to do
> > something like that. The question is, would that apply to
> > type three forms only, or to r/n/l in any position?
>
> If we're going to do this, it should apply to r/n/l in any position. Looking
> at fu'ivla in jbovlaste, I found one which would be invalid if it didn't:
> {zermbeto} (a plant in the ginger family).

I tend to agree that having a special rule for type-3 is not
very nice.

> A word I've proposed, but not used
> or defined, is {zajbrmtsfrtneli} (the Georgian word "mtsvrtneli" is glossed
> as "trainer", but I'm not sure what meaning is intended, so I'm not sure of
> the right gismu). vlatai considers 'm' to be syllabic as well; if this were
> done, it would still be invalid because of "tsf". I picked this as a
> worst-case actual word in a foreign language that we might like to make a
> fu'ivla of.

I don't think that's the kind of meaning we want borrowed words for, but
the point is the same if it was say a distinctive Georgian food or
music or something else that would make sense as a borrowing. In that
case, we should adapt it to Lojban morphology, for example as
{mitsfirtineli} or similar. Just as we change the voicedness of some
consonants we can introduce epenthetic vowels in the lojbanization to
conform to the morphology.

mu'o mi'e xorxes







On Saturday 12 February 2005 10:40, Jorge "Llambías" wrote:
> That's how I would have it, yes. If {mkyveix} is pronounceable in
> normal Lojban speech, then it is hard to understand why we don't
> allow {mkivexi} as a fu'ivla, for example, or why we don't allow
> {mkive} as a gismu. Cmene are lojbanized words and therefore
> should have the same general morphological restrictions as other
> words.

The rules for lexing cmene are different from the rules for lexing brivla.
/lamkyveix/ is two words, {la mkyveix}, while /lamkiVExi/ is one word,
{lamkivexi}. Both are pronounceable, they just break differently.

phma
--
Maintenant, j'ai besoin d'une loupe pour trouver mes lunettes!
-Les Perles de la médecine


posts: 1912


> On Saturday 12 February 2005 10:40, Jorge "Llambías" wrote:
> > That's how I would have it, yes. If {mkyveix} is pronounceable in
> > normal Lojban speech, then it is hard to understand why we don't
> > allow {mkivexi} as a fu'ivla, for example, or why we don't allow
> > {mkive} as a gismu. Cmene are lojbanized words and therefore
> > should have the same general morphological restrictions as other
> > words.
>
> The rules for lexing cmene are different from the rules for lexing brivla.

Indeed they are. Sometimes unjustifiably different.

> /lamkyveix/ is two words, {la mkyveix}, while /lamkiVExi/ is one word,
> {lamkivexi}. Both are pronounceable, they just break differently.

Right. I find that unsatisfying. I don't want my brain to have to
develop two completely separate modules for breaking up words, as
it were.

mu'o mi'e xorxes






posts: 1912


One other thing I forgot about cmene-rafsi: they can only
be used word-initially, because they will absorb anything
else in front of them as part of the cmene.

mu'o mi'e xorxes







I'd like the morphology to be expressible with an algorithm like valfendi, as
well as with a PEG. So here is my attempt to express the proposed limits on
consonant clusters:
1. Replace all vowels, apostrophes, and syllabic consonants ('r', 'n', 'l',
'm') with spaces.
2. Check each remaining cluster:
A. If it has one consonant, it's OK.
B. Remove the first consonant. If what remains is a valid initial cluster,
it's OK, else it isn't.
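
The two steps above could be sketched in Python roughly as follows. Two things are assumptions of the sketch rather than part of the description: a single leftover consonant in step 2B counts as a valid initial, and a longer initial cluster is valid when every adjacent pair is one of the 48 permissible initial pairs of CLL chapter 3. The names clusters_ok and is_initial are invented here, and {patkfa} is a made-up string used only to show a failure.

```python
import re

# The 48 permissible initial pairs, as listed in CLL chapter 3.
INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

def is_initial(cluster):
    """Assumed test: one consonant, or all adjacent pairs permissible."""
    return all(cluster[i:i + 2] in INITIAL_PAIRS
               for i in range(len(cluster) - 1))

def clusters_ok(word):
    # Step 1: blank out vowels, apostrophes, commas and syllabic consonants.
    stripped = re.sub(r"[aeiouy',rnlm]", " ", word)
    # Step 2: check each remaining cluster.
    for cluster in stripped.split():
        if len(cluster) == 1:
            continue                      # 2A: a single consonant is fine
        if not is_initial(cluster[1:]):   # 2B: drop the first consonant
            return False
    return True

print(clusters_ok("zermbeto"))     # True
print(clusters_ok("apmpmpmpma"))   # True: every p stands alone
print(clusters_ok("patkfa"))       # False: "kf" is not an initial pair
```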

phma
--
1 m = 3*3*5*7*47*44351/73/293339 * Cs133


posts: 1912


> So here is my attempt to express the proposed limits on
> consonant clusters:
> 1. Replace all vowels, apostrophes, and syllabic consonants ('r', 'n', 'l',
> 'm') with spaces.
> 2. Check each remaining cluster:
> A. If it has one consonant, it's OK.
> B. Remove the first consonant. If what remains is a valid initial cluster,
> it's OK, else it isn't.

I don't much like allowing indefinitely long clusters, including
things like {apmpmpmpma}. What about allowing at most one syllabic
consonant per cluster? That's all that is needed for type-3 fu'ivla.

mu'o mi'e xorxes






On Monday 14 February 2005 07:51, Jorge "Llambías" wrote:
> I don't much like allowing indefinitely long clusters, including
> things like {apmpmpmpma}. What about allowing at most one syllabic
> consonant per cluster? That's all that is needed for type-3 fu'ivla.

That would forbid {cipnrxuazine}, {finprvandeli}, {gurnrtefi}, {jinmrberilo},
{jinmrniobi}, {koblrsinapi}, {nargrpistaco}, and several others.

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


posts: 1912



> > What about allowing at most one syllabic
> > consonant per cluster? That's all that is needed for type-3 fu'ivla.
>
> That would forbid {cipnrxuazine}, {finprvandeli}, {gurnrtefi}, {jinmrberilo},
> {jinmrniobi}, {koblrsinapi}, {nargrpistaco}, and several others.

No, no, I mean at most one syllabic consonant in syllabic position:

cip,nr,xua,zi,ne
fin,pr,van,de,li
jin,mr,be,ri,lo
jin,mr,nio,bi
kob,lr,si,na,pi
nar,gr,pi,sta,co

Those would all be accepted. Indeed all type-3 are accepted
as long as the part after the classifier begins with a permissible
initial, because the r/n/l would be the only syllabic consonant
of the crunchy cluster.

mu'o mi'e xorxes







On Monday 14 February 2005 09:00, Jorge "Llambías" wrote:
> No, no, I mean at most one syllabic consonant in syllabic position:
>
> cip,nr,xua,zi,ne
> fin,pr,van,de,li
> jin,mr,be,ri,lo
> jin,mr,nio,bi
> kob,lr,si,na,pi
> nar,gr,pi,sta,co

If I removed initial "ji", would that be "nm,rn,io,bi" or "nm,rnio,bi"? How
exactly do you figure out what's in syllabic position? Is {damskrima}
allowed?

phma
--
li ze te'a ci vu'u ci bi'e te'a mu du
li ci su'i ze te'a mu bi'e vu'u ci


posts: 1912


> > cip,nr,xua,zi,ne
> > fin,pr,van,de,li
> > jin,mr,be,ri,lo
> > jin,mr,nio,bi
> > kob,lr,si,na,pi
> > nar,gr,pi,sta,co
>
> If I removed initial "ji", would that be "nm,rn,io,bi" or "nm,rnio,bi"?

{nm} is not a permissible initial, so it can't start a word.
{rnio} could not be a syllable, because {rn} is not a permissible initial.

> How
> exactly do you figure out what's in syllabic position?

All syllables are of the form {permissible-initial-cluster vowel(s)},
or {consonant r/n/l}.

(I would not include m as syllabic, but it could be added as well.)

> Is {damskrima} allowed?

Yes, but it doesn't have any consonant syllables:
dam,skri,ma

mu'o mi'e xorxes






posts: 1912


This is how I would define syllables:

A "consonantal syllable" is a syllable of the form CR,

where C is any consonant and R is one of r, n, l

(obviously excepting the forbidden pairs, rr, nn and ll)

A "vocalic syllable" is a syllable of the form IVF,
where I is a permissible initial cluster, a single consonant,
an apostrophe or nothing, V is a vowel or a diphthong, and
F is any consonant or nothing.

A consonantal syllable can only occur between two vocalic
syllables.
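
As a sanity check on this definition, here is a small backtracking sketch in Python that tries to split a word into vocalic and consonantal syllables. It rests on several assumptions: R is r, n or l; "permissible initial cluster" means every adjacent pair is one of the 48 initial pairs of CLL chapter 3; and the diphthongs are the sixteen listed in CLL. All function names are invented for the sketch.

```python
CONS = set("bcdfgjklmnprstvxz")
VOWELS = set("aeiouy")
DIPHTHONGS = set("ai ei oi au ia ie ii io iu ua ue ui uo uu iy uy".split())
INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

def is_onset(s):
    """I: nothing, an apostrophe, one consonant, or an initial cluster."""
    if s == "'":
        return True
    return (all(c in CONS for c in s)
            and all(s[i:i + 2] in INITIAL_PAIRS for i in range(len(s) - 1)))

def vocalic_ends(word, i):
    """Yield each j such that word[i:j] is a vocalic syllable IVF."""
    for k in range(i, len(word)):          # k = start of the nucleus V
        if not is_onset(word[i:k]):
            continue
        for size in (2, 1):                # try a diphthong, then one vowel
            nucleus = word[k:k + size]
            if ((size == 2 and nucleus in DIPHTHONGS)
                    or (size == 1 and nucleus in VOWELS)):
                j = k + size
                yield j                    # F is nothing
                if j < len(word) and word[j] in CONS:
                    yield j + 1            # F is one consonant

def is_consonantal(s):
    """CR with R in r/n/l, excluding rr, nn and ll."""
    return len(s) == 2 and s[0] in CONS and s[1] in "rnl" and s[0] != s[1]

def syllabify(word):
    """Return one valid split into syllables, or None if there is none."""
    def go(i, prev_vocalic):
        if i == len(word):
            # A consonantal syllable may not be last (and "" has no split).
            return [] if prev_vocalic else None
        for j in vocalic_ends(word, i):
            rest = go(j, True)
            if rest is not None:
                return [word[i:j]] + rest
        # A consonantal syllable must follow a vocalic one.
        if prev_vocalic and is_consonantal(word[i:i + 2]):
            rest = go(i + 2, False)
            if rest is not None:
                return [word[i:i + 2]] + rest
        return None
    return go(0, False)

print(syllabify("cipnrxuazine"))   # ['cip', 'nr', 'xua', 'zi', 'ne']
print(syllabify("damskrima"))      # ['dam', 'skri', 'ma']
print(syllabify("arktik"))         # None
```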

mu'o mi'e xorxes






On Monday 14 February 2005 09:53, Jorge "Llambías" wrote:
> This is how I would define syllables:
>
> A "consonantal syllable" is a syllable of the form CR,

> where C is any consonant and R is one of r, n, l

> (obviously excepting the forbidden pairs, rr, nn and ll)
>
> A "vocalic syllable" is a syllable of the form IVF,
> where I is a permissible initial cluster, a single consonant,
> an apostrophe or nothing, V is a vowel or a diphthong, and
> F is any consonant or nothing.
>
> A consonantal syllable can only occur between two vocalic
> syllables.

That would make the following cmene invalid:
arktik (which I call tolzip)
island
nederland
paludizm

phma
--
AS d- s-: a+ c+++ p+ t f S+ e++ h r->++ n-(++)* i P- m++ M+


posts: 1912


> On Monday 14 February 2005 09:53, Jorge "Llambías" wrote:
> > A consonantal syllable can only occur between two vocalic
> > syllables.
>
> That would make the following cmene invalid:
> arktik (which I call tolzip)
> island
> nederland
> paludizm

The final cluster in cmene is a special case, which probably
needs to be dealt with separately. I wouldn't mind requiring
cmene to end in a single consonant, but otherwise the rule
for cmene could be to allow a bigger cluster in the final
syllable (we need to decide what kind of cluster, but maybe
something like "reverse permissible initial cluster" would
work).

{arktik} on the other hand would be invalid, yes, and so would
{fasxolarkto}. To allow these we would need to allow syllables
that end in more than a single consonant.

mu'o mi'e xorxes






posts: 14214

On Mon, Feb 14, 2005 at 04:51:51AM -0800, Jorge Llambías wrote:
>
> --- Pierre Abbat wrote:
> > So here is my attempt to express the proposed limits on
> > consonant clusters:
> > 1. Replace all vowels, apostrophes, and syllabic consonants ('r', 'n', 'l',
> > 'm') with spaces.
> > 2. Check each remaining cluster:
> > A. If it has one consonant, it's OK.
> > B. Remove the first consonant. If what remains is a valid initial cluster,
> > it's OK, else it isn't.
>
> I don't much like allowing indefinitely long clusters, including
> things like {apmpmpmpma}. What about allowing at most one syllabic
> consonant per cluster? That's all that is needed for type-3
> fu'ivla.

Fine by me.

-Robin

--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/


posts: 14214

On Mon, Feb 14, 2005 at 06:33:49AM -0800, Jorge Llambías wrote:
> --- Pierre Abbat wrote:
snip
> > How exactly do you figure out what's in syllabic position?
>
> All syllables are of the form {permissible-initial-cluster
> vowel(s)}, or {consonant r/n/l}.
>
snip
> > Is {damskrima} allowed?
>
> Yes, but it doesn't have any consonant syllables: dam,skri,ma

{dam} sure doesn't look like {permissible-initial-cluster vowel(s)}.

-Robin

--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/


posts: 1912


> On Mon, Feb 14, 2005 at 06:33:49AM -0800, Jorge Llambías wrote:
> > --- Pierre Abbat wrote:
> > > Is {damskrima} allowed?
> > Yes, but it doesn't have any consonant syllables: dam,skri,ma
>
> {dam} sure doesn't look like {permissible-initial-cluster vowel(s)}.

Right, the full form of a syllable would be:

(initial-cluster / consonant / h)? (vowel / diphthong) consonant?

although it is not yet clear which diphthongs would be allowed,
and whether a syllable that begins with a vowel can follow one
that ends with a vowel.
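
For a single syllable, the first form above can be checked with a regular expression plus a pairwise test on the onset. This Python sketch assumes the diphthongs are ai, au, ei and oi (the question is left open above), writes h as an apostrophe, and treats an initial cluster as any consonant run whose adjacent pairs are all among the 48 initial pairs of CLL chapter 3. The name is_syllable is invented for the example.

```python
import re

C = "bcdfgjklmnprstvxz"
INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

# (initial-cluster / consonant / h)? (vowel / diphthong) consonant?
SHAPE = re.compile(rf"([{C}]*|')(ai|au|ei|oi|[aeiou])([{C}]?)")

def is_syllable(s):
    m = SHAPE.fullmatch(s)
    if m is None:
        return False
    onset = m.group(1)
    if onset == "'":
        return True
    # Empty or single-consonant onsets pass; longer ones need valid pairs.
    return all(onset[i:i + 2] in INITIAL_PAIRS
               for i in range(len(onset) - 1))

print(is_syllable("dam"))    # True
print(is_syllable("skri"))   # True
print(is_syllable("rna"))    # False: "rn" is not an initial pair
```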

Another possibility would be something like:

(initial-cluster / consonant / h / semivowel / .) (vowel / diphthong)
consonant?

where diphthong is restricted to ai, au, ei, oi.

Yet another possibility:

(initial-cluster / consonant / h / semiconsonant / .) vowel (consonant /
semiconsonant)?

so that ai/au/ei/oi can't absorb a following consonant in the same
syllable. I think I like this one.

mu'o mi'e xorxes







On Monday 14 February 2005 09:33, Jorge "Llambías" wrote:
> All syllables are of the form {permissible-initial-cluster vowel(s)},
> or {consonant r/n/l}.
>
> (I would not include m as syllabic, but it could be added as well.)

According to Chapter 3 it is syllabic. There are also example names {brlgan}
and {rl}, and we also know who the god of hesitation is.

phma
--
A man found gold and left a rope; but he who found
No gold he left did tie the rope around.


posts: 1912



> > (I would not include m as syllabic, but it could be added as well.)
>
> According to Chapter 3 it is syllabic.

Yes, but has it been used? The examples I can think of with syllabic
m sound awful to me. {arpmbla}?

> There are also example names {brlgan}
> and {rl}, and we also know who the god of hesitation is.

Also {clsn}.

How about this: A consonantal syllable must occur between two
vocalic syllables in fu'ivla, but it may appear in any position
in cmene. But I would prefer to exclude m as syllabic in that
case.

mu'o mi'e xorxes







On Monday 14 February 2005 15:34, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > > (I would not include m as syllabic, but it could be added as well.)
> >
> > According to Chapter 3 it is syllabic.
>
> Yes, but has it been used? The examples I can think of with syllabic
> m sound awful to me. {arpmbla}?

How about {kolmba}?

phma
--
li fi'u vu'u fi'u fi'u du li pa


On Monday 14 February 2005 10:38, Jorge "Llambías" wrote:
> The final cluster in cmene is a special case, which probably
> needs to be dealt with separately. I wouldn't mind requiring
> cmene to end in a single consonant, but otherwise the rule
> for cmene could be to allow a bigger cluster in the final
> syllable (we need to decide what kind of cluster, but maybe
> something like "reverse permissible initial cluster" would
> work).

I wouldn't do that, since "zm" is an initial cluster but "mz" is forbidden.
How about any allowed medial cluster? If it ends with a syllabic consonant
preceded by another consonant, it would be syllabic, as in {ctelr}.

> {arktik} on the other hand would be invalid, yes, and so would
> {fasxolarkto}. To allow these we would need to allow syllables
> that end in more than a single consonant.

I originally said {fasxolarto}, but Nick complained that he would have
recognized the Greek word for bear if I had said {fasxolarkto}. Since "artos"
means bread, and "arkos" is a variant of "arktos", how about {fasxolarko}?

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


posts: 1912


> On Monday 14 February 2005 15:34, Jorge "Llambías" wrote:
> > --- Pierre Abbat wrote:
> > > > (I would not include m as syllabic, but it could be added as well.)
> > >
> > > According to Chapter 3 it is syllabic.
> >
> > Yes, but has it been used? The examples I can think with syllabic
> > m sound awful to me. {arpmbla}?
>
> How about {kolmba}?

That one's a bit better, but the others will still be there. Anyway,
n is almost as bad as m in many cases, and we can't get rid of it
because of the type-3, so I guess we might as well allow m too.

mu'o mi'e xorxes






posts: 1912



> On Monday 14 February 2005 10:38, Jorge "Llambías" wrote:
> > The final cluster in cmene is a special case, which probably
> > needs to be dealt with separately. I wouldn't mind requiring
> > cmene to end in a single consonant, but otherwise the rule
> > for cmene could be to allow a bigger cluster in the final
> > syllable (we need to decide what kind of cluster, but maybe
> > something like "reverse permissible initial cluster" would
> > work).
>
> I wouldn't do that, since "zm" is an initial cluster but "mz" is forbidden.

The "mz" restriction is weird. (The restriction on "zn" and "jn" as valid
initials is also pretty weird.)

> How about any allowed medial cluster? If it ends with a syllabic consonant
> preceded by another consonant, it would be syllabic, as in {ctelr}.

That seems right. I'll do that then.

If we are allowing any number of consonantal syllables in a row in
cmene that will still allow some ugly stuff though.

>
> > {arktik} on the other hand would be invalid, yes, and so would
> > {fasxolarkto}. To allow these we would need to allow syllables
> > that end in more than a single consonant.
>
> I originally said {fasxolarto}, but Nick complained that he would have
> recognized the Greek word for bear if I had said {fasxolarkto}. Since "artos"
> means bread, and "arkos" is a variant of "arktos", how about {fasxolarko}?

Sounds good to me.

mu'o mi'e xorxes







On Monday 14 February 2005 19:02, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > On Monday 14 February 2005 15:34, Jorge "Llambías" wrote:
> > > Yes, but has it been used? The examples I can think with syllabic
> > > m sound awful to me. {arpmbla}?
> >
> > How about {kolmba}?
>
> That one's a bit better, but the others will still be there. Anyway,
> n is almost as bad as m in many cases, and we can't get rid of it
> because of the type-3, so I guess we might as well allow m too.

There are some lujvo with 'm' in consonant clusters that I find sound weird,
such as {refmri} and {bifmlo}. What does {arpmbla} mean, btw?

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


posts: 1912


> There are some lujvo with 'm' in consonant clusters that I find sound weird,
> such as {refmri} and {bifmlo}.

Once I got used to ml and mr as initials, those are ok for me.

> What does {arpmbla} mean, btw?

AFAIK nothing, I just made it up, a possible cluster with syllabic m.

mu'o mi'e xorxes







posts: 162

Jorge Llambías wrote:

> The "mz" restriction is weird. (The "zn" and "jn" restriction as valid

The question that we weighed is whether speakers would clearly
distinguish the voiced from the unvoiced equivalent.

After several trials we found that minimal pairs such as the following
cause problems.

ramsau/ramzau
vecnau/vejnau
misnau/miznau

Whether they do for all speakers, I can't say. But that is what we
decided at the time.

The bottom line for me on issues like this, is that there is a status
quo, and the byfy should not be considering a change merely because it
can. Unless the existing language is *broken*, I would like to know why
a change to the permissible initials or medials is even under consideration.

That is my general reaction to the entire morphology proposal. I see
lots of unjustified changes in the status quo, and I don't see why most
of them are under consideration, given the byfy charter. I'm not happy
about most changes in the cmavo, but at least there the argument can be
raised that the existing language has semantics problems. I've heard
few if any morphology *problems* to warrant a change other than a couple
that are in the "controversial" area - la/lai/doi in names, and the
possibility of including optional hyphens where they now are
forbidden. Almost everything else (e.g. all the new ways of making
words into strange rafsi) seems new and unmotivated other than perhaps
by the argument "we can, so let's allow it".

So as usual, I expect to vote no and have Robin throw my vote out (or
simply not vote, to spare him the decision).

lojbab




posts: 14214

On Tue, Feb 15, 2005 at 07:29:41PM -0500, Bob LeChevalier wrote:
> Jorge Llambías wrote:
> The bottom line for me on issues like this, is that there is a
> status quo, and the byfy should not be considering a change merely
> because it can. Unless the existing language is *broken*, I would
> like to know why a change to the permissible initials or medials
> is even under consideration.

Because there has never been a formal morphology before, and we're
trying to create one.

> That is my general reaction to the entire morphology proposal. I see
> lots of unjustified changes in the status quo,

We don't know what the status quo *is*.

Furthermore, the thread you are responding to was clearly stated to
be speculative, as far as I could tell. xorxes said he wanted an
explanation for the mz thing.

I'm not a big fan of randomly changing a bunch of stuff either,
FWIW, but I've deliberately bowed out of the morphology discussion
for the most part, because I don't understand it.

-Robin


posts: 1912



> Jorge Llambías wrote:
>
> > The "mz" restriction is weird. (The "zn" and "jn" restriction as valid
>
> The question that we weighed is whether speakers would clearly
> distinguish the voiced from the unvoiced equivalent.
>
> After several trials we found that minimal pairs such as the following
> cause problems.
>
> ramsau/ramzau
> vecnau/vejnau
> misnau/miznau
>
> Whether they do for all speakers, I can't say. But that is what we
> decided at the time.

jn and zn are allowed medially. They are only forbidden initially.

> The bottom line for me on issues like this, is that there is a status
> quo, and the byfy should not be considering a change merely because it
> can. Unless the existing language is *broken*, I would like to know why
> a change to the permissible initials or medials is even under consideration.

No change to permissible initial or medial pairs has been proposed.
(I find many of the restrictions arbitrary, but have not proposed
changes to those.)

The proposed change is to some initial and medial clusters of more
than two consonants. For me at least {tctcikptkpu} qualifies as
broken, as does allowing initial {tctcla} but not {tla}.

> Almost everything else (e.g. all the new ways of making
> words into strange rafsi) seems new and unmotivated other than perhaps
> by the argument "we can, so let's allow it".

That was motivated by the embryonic but incomplete proposal in CLL.

> So as usual, I expect to vote no and have Robin throw my vote out (or
> simply not vote. to spare him the decision).

Blanket no, or you may consider specific points?

I suppose we will have to vote on each issue separately, since many
are fairly independent of each other.

mu'o mi'e xorxes







posts: 14214

On Tue, Feb 15, 2005 at 07:29:41PM -0500, Bob LeChevalier wrote:
> So as usual, I expect to vote no and have Robin throw my vote out
> (or simply not vote. to spare him the decision).

Two things:

1. "Your proposal states X, Y and Z. Those are unacceptable to me,
but I would accept what you have if those said Q, R and S instead;
my reasons are A, B and C" counts as a counter-proposal. It seems
to me that this is not unreasonable. OTOH, "Your proposal states X,
Y and Z, and I don't like that" does not count as a
counter-proposal.

2. I'm getting really sick of you complaining about this. Either
put it to a vote of the BPFK membership or stop bringing it up,
please. It seems to me that your BPFK involvement has consisted of
wandering in every six months, reading a few mails, flaming people,
sulking about this issue, and leaving. I'd like this pattern to
stop.

Please don't respond to this mail with hundreds of lines about why
you are acting the way you are. I'm busy enough as it is.

Thanks.

-Robin


On Tuesday 15 February 2005 19:44, Jorge "Llambías" wrote:
> > Almost everything else (e.g. all the new ways of making
> > words into strange rafsi) seems new and unmotivated other than perhaps
> > by the argument "we can, so let's allow it".
>
> That was motivated by the embryonic but incomplete proposal in CLL.

The embryonic proposal was to allow some, but not all, fu'ivla to have rafsi.
I came up with a method to allow arbitrarily long, but not all, fu'ivla to
have rafsi. CLL clearly states that general methods of making lujvo with
fu'ivla rafsi failed, and doesn't even hint at making rafsi from cmene.

phma
--
They think I have cancer of the Russian tea...
-Les Perles de la médecine


posts: 14214

On Tue, Feb 15, 2005 at 05:08:55PM -0800, Robin Lee Powell wrote:
> On Tue, Feb 15, 2005 at 07:29:41PM -0500, Bob LeChevalier wrote:
> > So as usual, I expect to vote no and have Robin throw my vote
> > out (or simply not vote. to spare him the decision).
>
> Two things:

Holy *crap* was that not intended to go here. :-/

My apologies, Bob, for inadvertently bringing a private rant out in
the open.

-Robin


posts: 1912



> The embryonic proposal was to allow some, but not all, fu'ivla to have rafsi.

Yes.

> I came up with a method to allow arbitrarily long, but not all, fu'ivla to
> have rafsi.

Yes, and that has been adopted wholesale in the PEG grammar.

> CLL clearly states that general methods of making lujvo with
> fu'ivla rafsi failed, and doesn't even hint at making rafsi from cmene.

Right, but my general method for fu'ivla does not fail, so CLL is
wrong in suggesting that it can't be done. Surely if we allow a
complex method that allows rafsi to _some_ select fu'ivla we
can allow as well a simpler and more general method that gives rafsi
to _every_ fu'ivla.

mu'o mi'e xorxes






posts: 162

Robin Lee Powell wrote:

>>That is my general reaction to the entire morphology proposal. I see
>>lots of unjustified changes in the status quo,
>>
>>
>We don't know what the status quo *is*.
>
>
We know a lot of things that it isn't. Most of the things I can
identify as "changes" are to those things.

>Furthermore, the thread you are responding to was clearly stated to
>be speculative, as far as I could tell.
>
I'm sorry, but you said in response to the discussion of Nora's post
that all discussions of the morphology proposal, which you gave us two
weeks to review before a vote, were to be in this thread. Therefore, I made
the possibly faulty assumption that discussions in this thread were on
the finalized proposal that we have two weeks to review before voting
on. I would otherwise have avoided this thread which has been going on
for several months. Since I read this in email, I can't see any
difference between discussions of stuff that the byfy has to consider
and stuff that people are discussing speculatively - they all look the
same to me.

> xorxes said he wanted an
>explanation for the mz thing.
>
>I'm not a big fan of randomly changing a bunch of stuff either,
>FWIW, but I've deliberately bowed out of the morphology discussion
>for the most part, because I don't understand it.
>
>
And that is why I pushed Nora to get involved, because she does
understand the issues, even if she is entirely opposed to the philosophy
of how the problem has been tackled (which I can summarize best as
designing without agreeing on the requirements, and which she mentioned
in her criticism of the algorithm as a computer program without clear
expectations on the result.)

lojbab



posts: 1912


> And that is why I pushed Nora to get involved, because she does
> understand the issues, even if she is entirely opposed to the philosophy
> of how the problem has been tackled (which I can summarize best as
> designing without agreeing on the requirements, and which she mentioned
> in her criticism of the algorithm as a computer program without clear
> expectations on the result.)

It's hard to satisfy everybody about everything.
Some people want a formal specification of the morphology
that leaves no doubt as to what is allowed and what is
not, others prefer an informal one that is easier to read.
Ideally we will have both at the end of this process.

This is the design requirement I'm working with: "Given
any string of sounds/characters, the morphology should
return a unique correct break of the string into words,
including non-lojban words."

These are the constraints I'm taking as fixed
(I may be forgetting some):

1- Any string with no spaces/pauses that contains at least one
non-lojban character/sound is taken to be a non-lojban word.

2- Any string with no spaces/pauses that contains at least one
impermissible consonant pair (per list in CLL) is taken to be
a non-lojban word.

3- The apostrophe can only occur between two vowels in lojban
words.

4- A space/pause is required in front of a word that begins with
a vowel (beginning and end of text count as space/pause).

5- cmene must end with a consonant followed by a space/pause.

6- cmavo must be of the form zero or one consonant followed
by one or more vowels, with or without intervening apostrophes.

7- gismu must be either CVC/CV or CCVCV, where CC is a
permissible initial pair as listed in CLL.

8- gismu rafsi are of the form CVC/Cy, CCVCy, CVC(y), CCV, CV(')V(r/n).

9- lujvo consist of any string of rafsi that doesn't fail the
tosmabru test, i.e. such that a cmavo can't be taken from the front
leaving something that breaks as lojban words.

10- A fu'ivla is a string that starts with a permissible initial
cluster, a consonant or a vowel, ends with a vowel, is not a
cmavo, gismu or lujvo, does not contain y, and passes the tosmabru
and slinku'i tests.
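
As a rough illustration of constraints 3, 6 and 7 only, here is a small sketch that checks whether a string has cmavo shape or gismu shape. It is not the PEG and not a word breaker: the name `rough_shape` is made up, y is counted as a vowel for cmavo as an assumption (so that Cy cmavo pass), leading pauses and stress are ignored, and the medial pair of a CVC/CV gismu is not checked against the impermissible-pair list.

```python
import re

# Permissible initial pairs as listed in CLL chapter 3.
INITIAL_PAIRS = (
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv"
).split()

C = "[bcdfgjklmnprstvxz]"
V = "[aeiou]"

# Constraint 6: zero or one consonant, then one or more vowels, with an
# optional apostrophe only between two vowels (constraint 3).
# Assumption: y is accepted as a vowel here.
CMAVO = re.compile(C + "?[aeiouy](?:'?[aeiouy])*")

# Constraint 7: CVC/CV, or CCVCV where CC is a permissible initial pair.
GISMU = re.compile(
    "(?:" + C + V + C + C + V
    + "|(?:" + "|".join(INITIAL_PAIRS) + ")" + V + C + V + ")"
)


def rough_shape(word):
    """Return 'cmavo', 'gismu' or None for a single pause-free string."""
    if CMAVO.fullmatch(word):
        return "cmavo"
    if GISMU.fullmatch(word):
        return "gismu"
    return None
```

For example, `rough_shape("zo'u")` gives `'cmavo'` and `rough_shape("klama")` gives `'gismu'`, while `rough_shape("tlama")` gives `None` because tl is not a permissible initial pair.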

In addition to those constraints which I take as fixed, there
are some issues to sort out:

Issue1: Vowel clusters. What is allowed and what isn't?
Are there any universal constraints for all lojban words?
Are there specific constraints for specific types of words?

Issue2: Consonant clusters. Do we want a somewhat more restrictive
set of initial clusters for brivla than those allowed by CLL?
(The proposed restriction is that affricates (tc, ts, dj, dz)
should not be combinable with anything else in initial clusters.)
Should there be a restriction on medial clusters? (The proposed
restriction is that vocalic syllables should be at most of the
form initial-cluster vowel/diphthong single-consonant, and
consonantal syllables of the form single-consonant syllabic-consonant.)

Issue3: fu'ivla rafsi, general brivla rafsi, general cmavo rafsi,
cmene rafsi. Should we allow them?

Issue4: CVC(y) and CVV(r/n) hyphens. Should they be allowed in any
position?

Issue5: doi/la/lai/la'i restriction. Should we remove it?

Issue6: Should the pause after Cy cmavo be obligatory, or required
only when needed for unambiguity? Should the pause after cmavo
with final-stress be obligatory, or required only when needed
for unambiguity?

Issue7: (Relatively minor) Should we allow stress in syllables
that shouldn't be stressed but such that no ambiguity results?

Issue8: Maybe something else I'm forgetting.

mu'o mi'e xorxes






posts: 162

Jorge Llambías wrote:

>--- Bob LeChevalier wrote:
>
>>Jorge Llambías wrote:
>>
>jn and zn are allowed medially. They are only forbidden initially.
>
>
Then I misunderstood your question.

I believe that we based the set of permissible initials on JCB's list,
adding some extensions based on linguistic symmetry. We in no way
attempted to push the limits of what could be spoken and thus erred on
the side of not adding things. We initially allowed j and z only with
the voiced stops, in symmetry with allowing c and s with unvoiced
stops. JCB allowed c and s with the four voiced fluids (lmnr).

JCB had allowed jm (indeed it is the only initial cluster he allowed
with j), so we kept it and added the symmetric zm. I don't think we
considered any of the other fluids with j and z.

JCB had also allowed zv which wasn't symmetrical to anything, but Gary
and Tommy argued that linguistically the v and f should be classed with
the voiced/unvoiced stop pairs, to match what JCB had done with them in
initial position (allowing fl and fr, vl and vr). This dictated the
choice in symmetry rules between zv/sv/cv/jv and zv/jv/sf/cf in favor
of the latter. The former would have suggested treating f and v like
the sz and cj pairs, and we would have then considered things like
fp/fk/ft/fm/fn/fs and vb/vd/vg/vm/vz. I suspect that Gary as a Russian
aficionado supported such things, but Nora and I thought those went too far.


>No change to permissible initial or medial pairs has been proposed.
>(I find many of the restrictions arbitrary, but have not proposed
>changes to those.)
>
>
I think I have misunderstood some of the "controversial items" then,
which I thought were part of the proposal.

>The proposed change is to some initial and medial clusters of more
>than two consonants. For me at least {tctcikptkpu} qualifies as
>broken, or allowing initial {tctcla} but not {tla}.
>
This is one reason why Nora favored starting by collecting all the
pronouncements in CLL in one place, and attempting to resolve any
ambiguities or conflicts in the specification before trying to code an
algorithm. We should be resolving such questions first, then writing an
algorithm.

Chapter 3 section 6 lists permissible initials. Section 7 says that
medially there can be triples. It also says that in NAMES, any length
of consonant string goes so long as each pair is valid for the position,
but it never gives rules for what is valid in longer-than-3 letter
strings. The list is probably not open-ended, since we put some
restrictions on 3-letter strings. It does not say that fu'ivla can use
more-than-three consonants. This is in part because fu'ivla were so
much an afterthought that they were poorly considered throughout
chapters 3 and 4. We considered the bounds on fu'ivla morphology to be
an experiment, with limits to be determined by usage.

Chapter 4 section 7 seems to open things up for fu'ivla, but is
ambiguous whether an initial cluster can be more than three consonants -
there are no examples of any. It seems that it permits any length of
cluster in medial position. But before making chapter 4 work, we need
to make chapter 3 correct and clear, and define rules for clusters of
more than three consonants in names (and allow them in fu'ivla).

If we allow the open-ended clusters that permit something like
"tctcikptkpu", remember that we will be doing so because that is the
closest word-form to something in the source language. Presumably, if
the speakers have such clusters in their source language, they can also
pronounce it in Lojban. Whether we want Lojban to support loose
borrowing of anything that can go in any other language is quite
arguable. My own leaning is for tighter restrictions on names and
fu'ivla than the current anything-goes policy.

>Almost everything else (e.g. all the new ways of making
>words into strange rafsi) seems new and unmotivated other than perhaps
>by the argument "we can, so let's allow it".
>
>
>That was motivated by the embryonic but incomplete proposal in CLL.
>
>
The proposal in CLL was for one *particular* kind of fu'ivla to be given
rafsi, which were chosen because it was pretty clear that those would
work with no fiddling in the resolvability algorithm. I was actually
kind of thinking of those fu'ivla to be an extension of gismu space as
much as a means of giving rafsi to fu'ivla, because it arose in response
to the arguments about cultural words that had been included vs. left
out of the gismu space. Giving a big chunk of additional space for
culture words that could be rafsi'd almost as well as the ones in gismu
space was a possible compromise to make that argument resolvable. Lack
of usage of the thing in several years seems to me to have been usage
deciding against it. But if people still want to keep it in the
language in some well-defined form, I won't oppose it.

It is unclear what a generalized rafsi scheme using "'y" or "iy" offers
over the current scheme using "zei" which requires no fiddling with the
morphology algorithm. Either way adds a buffering syllable for the
"joining". Only those wordforms that can be rafsi'd by replacing the
final vowel with y keep from adding the extra syllable, which is why I
kept the rafsi fu'ivla proposal minimal - it offers too little benefit
except in well-constrained cases.

>>So as usual, I expect to vote no and have Robin throw my vote out (or
>>simply not vote. to spare him the decision).
>>
>>
>
>Blanket no, or you may consider specific points?
>I suppose we will have to vote on each issue separately, since many
>are fairly independent of each other.
>
>
>
Again we seem to have many misunderstandings of what is going on. Robin
said there is a proposal and there will be a vote in 2 weeks, when Nora
and I did not even know a proposal was being prepared - especially since
Nora thought that she was the "shepherd" albeit without any sheep.

(I thought the PEG algorithm work that Robin was doing was a separate
project not related to the byfy task, just as he had been working on a
parser separate from the YACC-based grammar. Nora was given a
quasi-autonomous subcommittee by Nick to work on morphology, which no
one but Pierre volunteered for, and Pierre was working continuously on
valfendi which was orthogonal to anything Nora wanted to do, so she
never got started.)

I've seen nothing that suggested that this is any different from any of
the other topics which have been entire proposals which we consider
all-or-nothing, (and which we have no *real* alternatives between voting
yes and submitting a complete alternative proposal of at least
equivalent quality, which of course no one except you would ever manage
to get done in a short time on your own).

But enough backbiting against the policies of our noble jatna who is
still making the valiant cat-herding effort amidst a zillion other jobs %^)

If I had my druthers, now that we know that morphology is on the table
and someone other than Nora is interested in it from the byfy
specification standpoint, I would rather that Robin pull the topic off
the time-limited-voting floor, and you and Nora work together to do the
job right (in a way that satisfies her). I tend to be Nora's sounding
board at home before she ever posts something, so I will likely be
satisfied by anything that she is satisfied by.

lojbab




posts: 2388

This is all very pretty and pretty well nigh
pointless. To be sure, Lojban (and Loglan
before it) claimed that every speech string could
be broken automatically and unambiguously and
correctly into lexical units. That claim was
made originally about a much simpler language
(without fu'ivla and, indeed, without lujvo in
the current sense). In that context it was
fairly obviously true and never put to the test.
It has been carried over through numerous changes
in the language, still untested but now much less
obviously true. The present exercise seems to be
aimed at making it true of the current language
by fiat, without regard to what the language is
about nor how it is constructed. As a result,
decisions about language structure are being made
apparently on the basis of whether they fit into
some intended algorithm and algorithms are being
rejected on the basis of conceived possible
counterexamples without regard to their reality.
The main problems seem to arise (here as in
several other cases, e.g., "magic words") from
areas so peripheral to Lojban as to be
effectively ignorable. Here the problems seem to
be mainly borrowings from other languages, names
and fu'ivla. The one is a clearly demarcated
area which can be dealt with in a couple of words
("anything goes here") and the other — which
should probably be treated in the same way — is
meant to be a very minor matter, to be used when
we can't come up with a Lojban expression in real
time or when we want to give a discourse some
local color. Not, in short, about permanent
additions to Lojban that need rafsi and
combinatorial devices of that sort at all.
(Fu'ivla were probably a major error in judgment
to begin with, since their role is more
efficiently performed by existing devices like
non-lojban quotations. But, given that we have
them, they, hairs from the tip of the tail,
should not be used to wag the whole
morphophonemic dog.)
The additional issue of finding rafsi etc. for
cmavo seems particularly ludicrous given that a
large chunk of another bpfk thread has just been
spent in reducing almost all cmavo to predicates
in some haphazard (though officially rigorous)
way. With those two items removed, the
algorithms for analysis seem fairly clear. If
you then want to add some of these other items
in, do so realistically. Fu'ivla are, after all,
borrowings from other languages; consider what
the possibilities are for that and don't worry
about things that can't arise from such sources
(the joke about the word for "icebox" being 20
consecutive glottal stops is just a joke, not a
situation to be dealt with). But, if you must do
this stuff, do it both ways: don't cut things off
because this messes with some other plan, deal with
what one might actually want to borrow, and don't
go looking for trouble which cannot arise from
real language cases. (On the first point, it
should be noted that cmene as now constituted
leave out at least one major device for creating
names in other languages, which probably should
find room somewhere if we are going to allow
borrowings in any systematic way: descriptive
sentences like "They are afraid of even his
horses.")





posts: 1912


> Jorge Llambías wrote:
> >jn and zn are allowed medially. They are only forbidden initially.
> >
> Then I misunderstood your question.

It was just a comment about the most glaring arbitrary restrictions.
{mz} seems to me to be the most glaring, its only justification
being that JCB found it too similar to {nz}. (Presumably he found
nj/mj, ns/ms, nt/mt, zm/zn, etc much more distinct?)

After that, the next most glaring seem to me to be the forbidding
of initial jn, jl, jr, zn, zl, zr, the voiced counterparts of
the permitted cn, cl, cr, sn, sl, sr.

Permissible initials can be grouped quite regularly:

The four affricates tc, ts, dj, dz, and any pair
in order from:
{c, s, j, z} {p, k, t, f, x, b, g, d, v, m, n} {l, r}

exceptions (besides pairs forbidden everywhere) are:
sx, jn, jl, jr, zn, zl, zr, tl, dl, nl, nr

(cx is not listed as an exception here because it is not
permitted medially. The different treatment given to sx and cx
is also weird.)
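
As a rough check of the grouping above, here is a minimal Python sketch that builds the set of initial pairs from the three groups and the exception list. The `forbidden_everywhere` helper is an assumption: it restates CLL's general consonant-pair restrictions (voicing mismatch, two letters from c/j/s/z, and cx, kx, xc, xk, mz), which the post only mentions in passing. The function and variable names are made up for illustration.

```python
from itertools import product

# Groups as listed in the post: a pair takes its first letter from an
# earlier group and its second letter from a later group.
GROUPS = ["csjz", "pktfxbgdvmn", "lr"]
AFFRICATES = {"tc", "ts", "dj", "dz"}
EXCEPTIONS = {"sx", "jn", "jl", "jr", "zn", "zl", "zr", "tl", "dl", "nl", "nr"}

UNVOICED = set("ptkfcsx")
VOICED = set("bdgvjz")


def forbidden_everywhere(pair):
    """Assumed restatement of CLL's restrictions on any consonant pair."""
    a, b = pair
    if a == b:
        return True
    if (a in UNVOICED and b in VOICED) or (a in VOICED and b in UNVOICED):
        return True
    if a in "cjsz" and b in "cjsz":
        return True
    return pair in {"cx", "kx", "xc", "xk", "mz"}


def permissible_initials():
    """Affricates plus every in-order pair that is not excluded."""
    pairs = set(AFFRICATES)
    for i, first in enumerate(GROUPS):
        for second in GROUPS[i + 1:]:
            for a, b in product(first, second):
                pair = a + b
                if pair not in EXCEPTIONS and not forbidden_everywhere(pair):
                    pairs.add(pair)
    return pairs


# The 48 initial pairs listed in CLL, for comparison.
CLL_INITIALS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd jg jm jv "
    "kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts vl vr xl xr "
    "zb zd zg zm zv".split()
)

print(permissible_initials() == CLL_INITIALS)
```

If the grouping is right, the generated set should coincide with CLL's list, which is the point of the regularity argument.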

> >The proposed change is to some initial and medial clusters of more
> >than two consonants. For me at least {tctcikptkpu} qualifies as
> >broken, or allowing initial {tctcla} but not {tla}.
> >
> This is one reason why Nora favored starting by collecting all the
> pronouncements in CLL in one place, and attempting to resolve any
> ambiguities or conflicts in the specification before trying to code an
> algorithm. We should be resolving such questions first, then writing an
> algorithm.

The order makes little difference. The modifications required
to take care of such issues are mostly trivial.

> Whether we want Lojban to support loose
> borrowing of anything that can go in any other language is quite
> arguable. My own leaning is for tighter restrictions on names and
> fu'ivla than the current anything-goes policy.

We are basically in agreement about that, then.

> >Almost everything else (e.g. all the new ways of making
> >words into strange rafsi) seems new and unmotivated other than perhaps
> >by the argument "we can, so let's allow it".
> >
> >That was motivated by the embryonic but incomplete proposal in CLL.
> >
> The proposal in CLL was for one *particular* kind of fu'ivla to be given
> rafsi, which were chosen because it was pretty clear that those would
> work with no fiddling in the resolvability algorithm.

My proposal also works with no fiddling.

> It is unclear what a generalized rafsi scheme using "'y" or "iy" offers
> over the current scheme using "zei" which requires no fiddling with the
> morphology algorithm.

The main advantage is the unification of semantic words with
morphological words. {zei} constructs work semantically as
single words but morphologically as separate words, sometimes
requiring pauses and/or multiple stress.

> Either way adds a buffering syllable for the
> "joining".

Sometimes it saves syllables though:

{barda zei .iglu zei prenu}
8 syllables, three stresses.

{bardy'iglu'ypre}
6 syllables, one stress.
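
The syllable counts can be reproduced with a small Python sketch. It assumes that every syllable has exactly one vowel nucleus, that ai/au/ei/oi count as a single nucleus, and that y is a nucleus; syllabic consonants are ignored. The name `count_syllables` is only illustrative.

```python
import re

# One match per syllable nucleus: a falling diphthong or a single vowel.
NUCLEUS = re.compile(r"ai|au|ei|oi|[aeiouy]")


def count_syllables(text):
    """Count vowel nuclei; pauses, apostrophes and spaces are skipped."""
    return len(NUCLEUS.findall(text.lower()))


print(count_syllables("barda zei .iglu zei prenu"))
print(count_syllables("bardy'iglu'ypre"))
```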

> Again we seem to have many misunderstandings of what is going on. Robin
> said there is a proposal and there will be a vote in 2 weeks, when Nora
> and I did not even know a proposal was being prepared - especially since
> Nora thought that she was the "shepherd" albeit without any sheep.

I have no problem in letting her do the shepherding if she wants
the job. I didn't really ask to shepherd this, I ended up there
sort of by default because of the PEG work I did.

> (I thought the PEG algorithm work that Robin was doing was a separate
> project not related to the byfy task, just as he had been working on a
> parser separate from the YACC-based grammar. Nora was given a
> quasi-autonomous subcommittee by Nick to work on morphology, which no
> one but Pierre volunteered for, and Pierre was working continuously on
> valfendi which was orthogonal to anything Nora wanted to do, so she
> never got started.)

It's hard to keep things in isolated compartments, because
many choices in one place influence what happens with other things.
I think Robin is doing an admirable job of keeping things
ordered. (His most difficult task is getting people to participate,
though.)

> I've seen nothing that suggested that this is any different from any of
> the other topics which have been entire proposals which we consider
> all-or-nothing, (and which we have no *real* alternatives between voting
> yes and submitting a complete alternative proposal of at least
> equivalent quality, which of course no one except you would ever manage
> to get done in a short time on your own).

You always have the choice of asking to split one part from the whole
if you want to concentrate on one bit, as has happened for example
with lo'e/le'e from gadri. I'm sure nobody is against compromises in
such things.

> If I had my druthers, now that we know that morphology is on the table
> and someone other than Nora is interested in it from the byfy
> specification standpoint, I would rather that Robin pull the topic off
> the time-limited-voting floor, and you and Nora work together to do the
> job right (in a way that satisfies her).

Let's start with it, and when the deadline comes, if we didn't
get anywhere we can ask for an extension. Deadline extensions
have not been a problem so far, as we have gotten them every
time we needed them.

> I tend to be Nora's sounding
> board at home before she ever posts something, so I will likely be
> satisfied by anything that she is satisfied by.

OK. How do we start?

mu'o mi'e xorxes







posts: 1912


This is a fairly inconsequential issue, but one that
needs resolution one way or the other.

What should the official morphology do with a
string such as {MIFRAti}?

(1) Parse it as {mifra ti}.
(2) Parse it as {mi frati}.
(3) Parse it as a non-lojban word.
(4) Something else.

PEG currently does (2) on the grounds that for any cmavo followed
by a stressed syllable, if it can fall off, it will fall off, even
if it is a stressed cmavo without a following pause.

CLL in principle would not allow (2): "Lojban structural
words (called cmavo) may be stressed on any syllable or
none at all. However, primary stress may not be used in a
syllable just preceding a brivla, unless a pause divides
them; otherwise, the two words may run together."

I think CLL does not allow (1) either: "Primary stress
is required on the penultimate syllable of Lojban content
words (called brivla)."

And also: "Secondary stress can be emphasized
at any level up to primary stress, although the speaker
must not allow a false primary stress in brivla,
since errors in word resolution could result."

Since secondary stress can be emphasized up to the level
of primary stress, and a false primary stress is disallowed
in brivla (which rules out parse (1)), does this mean that
a false primary stress may be allowed in cmavo preceding a
brivla, which would open a door for parse (2)?
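
To make the conflict concrete, here is a minimal Python sketch that checks parses (1) and (2) against the two CLL passages quoted above. It is not the PEG grammar. It assumes capitals mark primary stress, that each vowel letter is its own syllable (true for this example, not in general), and that no pause separates the words. The function names are invented for illustration.

```python
import re


def stress_pattern(word):
    """One boolean per vowel: True when the vowel is written as a capital."""
    return [v.isupper() for v in re.findall(r"[aeiouAEIOU]", word)]


def cll_objections(parse):
    """List CLL stress objections for a parse given as (word, kind) pairs.

    kind is 'cmavo' or 'brivla'; the words are assumed to run together.
    """
    objections = []
    for i, (word, kind) in enumerate(parse):
        stress = stress_pattern(word)
        if kind == "brivla":
            if not stress[-2]:
                objections.append(f"{word}: no penultimate stress")
            if stress[-1] or any(stress[:-2]):
                objections.append(f"{word}: false primary stress in brivla")
        elif stress[-1] and i + 1 < len(parse) and parse[i + 1][1] == "brivla":
            objections.append(f"{word}: stressed syllable before brivla, no pause")
    return objections


parse_1 = [("MIFRA", "brivla"), ("ti", "cmavo")]
parse_2 = [("MI", "cmavo"), ("FRAti", "brivla")]
print(cll_objections(parse_1))
print(cll_objections(parse_2))
```

Under these assumptions each parse draws exactly one objection, which is why neither is clearly sanctioned by CLL as written.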

mu'o mi'e xorxes






posts: 14214

On Wed, Feb 16, 2005 at 09:33:22AM -0500, Bob LeChevalier wrote:
> Again we seem to have many misunderstandings of what is going on.
> Robin said there is a proposal and there will be a vote in 2
> weeks, when Nora and I did not even know a proposal was being
> prepared - especially since Nora thought that she was the
> "shepherd" albeit without any sheep.

For the record: Nora has ignored several mails from me requesting
her opinions on Pierre's algorithm in the past. Furthermore,
noras@cox.net is on both wikidiscuss and wikichanges, where the
morphology algorithm has been extensively discussed.

As such, I had no reason to believe that she was at all interested
in participating in the morphology discussions anymore.

> (I thought the PEG algorithm work that Robin was doing was a
> separate project not related to the byfy task, just as he had been
> working on a parser separate from the YACC-based grammar.

I (and Cowan and xorxes) have been pretty clear that the PEG parser
is intended to replace the YACC grammar entirely.

FWIW, I didn't write the PEG Morphology at all; xorxes did all that.

> Nora was given a quasi-autonomous subcommittee by Nick to work on
> morphology, which no one but Pierre volunteered for, and Pierre
> was working continuously on valfendi which was orthogonal to
> anything Nora wanted to do, so she never got started.)

Ah, see, since Nora never (IIRC) answered my mails about the
morphology (none of them within the last six months, I don't think)
I thought that valfendi *was* what the morphology committee was
producing. I had no idea that Nora wasn't interested in it.

> But enough backbiting against the policies of our noble jatna who
> is still making the valiant cat-herding effort amidst a zillion
> other jobs %^)

Thanks.

> If I had my druthers, now that we know that morphology is on the
> table and someone other than Nora is interested in it from the
> byfy specification standpoint,

I gather, then, that I should assume that you two don't read
wikidiscuss or wikichanges at all?

> I would rather that Robin pull the topic off the
> time-limited-voting floor, and you and Nora work together to do
> the job right (in a way that satisfies her).

Again, past behaviour gives me no reason to believe Nora will do the
work. If she mails me privately and requests this, with some sort
of firm commitment about the time she can put in, I'd be *happy*
to do this. I'm sure xorxes would as well. (And if he's not, I'll
*make* him happy about it. :-)


-Robin


On Wednesday 16 February 2005 15:25, Jorge "Llambías" wrote:
> This is a fairly inconsequential issue, but one that
> needs resolution one way or the other.
>
> What should the official morphology do with a
> string such as {MIFRAti}?
>
> (1) Parse it as {mifra ti}.
> (2) Parse it as {mi frati}.
> (3) Parse it as a non-lojban word.
> (4) Something else.

(1) or (2). If I had to pick one, I'd say (2); but this is a corner case,
resulting from someone inadvertently stressing a syllable, and it may be
impossible to tell which is meant without analyzing semantics.

> PEG currently does (2) on the grounds that for any cmavo followed
> by a stressed syllable, if it can fall off, it will fall off, even
> if it is a stressed cmavo without a following pause.

Valfendi also does (2).

> CLL in principle would not allow (2): "Lojban structural
> words (called cmavo) may be stressed on any syllable or
> none at all. However, primary stress may not be used in a
> syllable just preceding a brivla, unless a pause divides
> them; otherwise, the two words may run together."

The words will be misdivided if the first syllable of the brivla is unstressed
and the second syllable could be the beginning of a brivla or cmavo (e.g.
/LEkraTAIgo/ -> {lekra tai go}). If the first syllable is stressed, they can
be misdivided; this is your example. If the second syllable could not begin a
brivla or cmavo, valfendi breaks the cmavo off (e.g. /LEskalDUna/ -> {le
skalduna}).

If the stressed word before the brivla is a Cy cmavo, valfendi breaks it off.

phma
--
li ze te'a ci vu'u ci bi'e te'a mu du
li ci su'i ze te'a mu bi'e vu'u ci


posts: 1912


> On Wednesday 16 February 2005 15:25, Jorge "Llambías" wrote:
> > This is a fairly inconsequential issue, but one that
> > needs resolution one way or the other.
> >
> > What should the official morphology do with a
> > string such as {MIFRAti}?
> >
> > (1) Parse it as {mifra ti}.
> > (2) Parse it as {mi frati}.
> > (3) Parse it as a non-lojban word.
> > (4) Something else.
>
> (1) or (2). If I had to pick one, I'd say (2); but this is a corner case,
> resulting from someone inadvertently stressing a syllable, and it may be
> impossible to tell which is meant without analyzing semantics.

Yes, but we're not trying to guess what someone may have meant, just
decide what the official parse should give. An intelligent parser will
allow more leeway in interpretation, but the official morphology won't.

> > PEG currently does (2) on the grounds that for any cmavo followed
> > by a stressed syllable, if it can fall off, it will fall off, even
> > if it is a stressed cmavo without a following pause.
>
> Valfendi also does (2).
>
> > CLL in principle would not allow (2): "Lojban structural
> > words (called cmavo) may be stressed on any syllable or
> > none at all. However, primary stress may not be used in a
> > syllable just preceding a brivla, unless a pause divides
> > them; otherwise, the two words may run together."
>
> The words will be misdivided if the first syllable of the brivla is
> unstressed
> and the second syllable could be the beginning of a brivla or cmavo (e.g.
> /LEkraTAIgo/ -> {lekra tai go}). If the first syllable is stressed, they can
> be misdivided; this is your example. If the second syllable could not begin a
> brivla or cmavo, valfendi breaks the cmavo off (e.g. /LEskalDUna/ -> {le
> skalduna}).

PEG will break a finally stressed cmavo off in the following cases:

1- When followed by a brivla with initial stress e.g. {MIFRAti}.
2- When followed by a cmavo or a brivla that starts with CV, e.g. {LElujvo}
3- When followed by a word such that if you remove the first syllable
the rest is not a word, e.g. {LEskalDUna}, but also {CIskapre.}
4- When followed by a cmene, (of course the only cmavo that can be followed
without a pause by cmene are doi/la/lai/la'i) e.g. LAknedjan.

I believe that's the maximally permissive set of rules, i.e. in no
other case can a finally stressed cmavo be followed directly by a word
without causing trouble.

The rules in CLL would not allow 1, 2 or 3. Not sure about 4.

> If the stressed word before the brivla is a Cy cmavo, valfendi breaks it off.

PEG takes y to be always unstressed.

mu'o mi'e xorxes






On Wednesday 16 February 2005 22:12, Jorge "Llambías" wrote:
> PEG will break a finally stressed cmavo off in the following cases:
>
> 1- When followed by a brivla with initial stress e.g. {MIFRAti}.
> 2- When followed by a cmavo or a brivla that starts with CV, e.g. {LElujvo}
> 3- When followed by a word such that if you remove the first syllable
> the rest is not a word, e.g. {LEskalDUna}, but also {CIskapre.}
> 4- When followed by a cmene, (of course the only cmavo that can be followed
> without a pause by cmene are doi/la/lai/la'i) e.g. LAknedjan.
>
> I believe that's the maximally permissive set of rules, i.e. in no
> other case can a finally stressed cmavo be followed directly by a word
> without causing trouble.

Sounds good to me. If I heard {CIskapre} I'd more likely take it as {ciskypre}
mispronounced.

> The rules in CLL would not allow 1, 2 or 3. Not sure about 4.
>
> > If the stressed word before the brivla is a Cy cmavo, valfendi breaks it
> > off.
>
> PEG takes y to be always unstressed.

Even in cmene?

phma
--
Sans lunettes, je ne distingue même pas les odeurs...
-Les Perles de la médecine


On Monday 14 February 2005 13:57, Jorge "Llambías" wrote:
> Right, the full form of a syllable would be:
>
> (initial-cluster / consonant / h)? (vowel / diphthong) consonant?
>
> although it is not yet clear which diphthongs would be allowed,
> and whether a syllable that begins with a vowel can follow one
> that ends with a vowel.
>
> Another possibility would be something like:
>
> (initial-cluster / consonant / h / semivowel / .) (vowel / diphthong)
> consonant?
>
> where diphthong is restricted to ai, au, ei, oi.
>
> Yet another possibility:
>
> (initial-cluster / consonant / h / semiconsonant / .) vowel (consonant /
> semiconsonant)?
>
> so that ai/au/ei/oi can't absorb a following consonant in the same
> syllable. I think I like this one.
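
The difference between the second and third possibilities can be seen in a minimal Python sketch using regular expressions. Several things here are assumptions made for illustration: initial clusters are limited to the 48 CLL pairs, h is written as an apostrophe, "semivowel" and "semiconsonant" are both read as i or u, and the third form is rendered literally, so it accepts any vowel followed by a glide (which combinations should be valid is left open in the post).

```python
import re

# Assumption: only two-consonant initial clusters, taken from CLL's list.
INITIAL_PAIRS = "|".join(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd jg jm jv "
    "kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts vl vr xl xr "
    "zb zd zg zm zv".split()
)
CONSONANT = "[bcdfgjklmnprstvxz]"
VOWEL = "[aeiou]"
GLIDE = "[iu]"  # assumed reading of "semivowel" / "semiconsonant"

# Mandatory onset: cluster, single consonant, h ('), glide, or pause (.).
ONSET = rf"(?:{INITIAL_PAIRS}|{CONSONANT}|'|{GLIDE}|\.)"

# Second possibility: ai/au/ei/oi sit in the nucleus and may take a coda.
FORM_2 = re.compile(rf"{ONSET}(?:ai|au|ei|oi|{VOWEL}){CONSONANT}?")
# Third possibility: single-vowel nucleus; an off-glide uses up the coda.
FORM_3 = re.compile(rf"{ONSET}{VOWEL}(?:{CONSONANT}|{GLIDE})?")


def is_syllable(pattern, text):
    """True when the whole string is one syllable under the given form."""
    return pattern.fullmatch(text) is not None


for candidate in ["zei", "zeik", "glu", ".a", "tla"]:
    print(candidate, is_syllable(FORM_2, candidate), is_syllable(FORM_3, candidate))
```

Only `zeik` separates the two forms: the second accepts it, the third rejects it because the glide already fills the coda slot.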

I think a syllable should be allowed to end with "syllabic-consonant
consonant". That would allow {tarksako} (though I'm willing to change that to
{traksako} if {tarksako} is invalid) and {tirkce}.

As to cmene, how about allowing an initial consonant cluster after the final
syllable?

phma
--
Mes règles mensuelles ont lieu une fois par an.
-Les Perles de la médecine


posts: 1912


> On Wednesday 16 February 2005 22:12, Jorge "Llambías" wrote:
>
> > PEG takes y to be always unstressed.
>
> Even in cmene?

In cmene, PEG does not pay any attention to stress, since stress
plays no role in them. In the whole grammar, the character "y" and
the character "Y" are treated identically. The only capital letters
that play a role in the grammar (though not in what concerns
cmene) are A, E, I, O and U.
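
The case convention just described can be sketched in a few lines of Python. This is an illustration of the stated rule only, not code taken from the PEG grammar, and the function name is hypothetical.

```python
def normalize(text):
    """Fold every letter to lower case except A, E, I, O and U.

    Capital vowels are kept because they mark stress; all other
    capitals, including Y, are treated like their lower-case forms.
    """
    return "".join(ch if ch in "AEIOU" else ch.lower() for ch in text)


print(normalize("MIFRAti"))
print(normalize("bardY'iglu"))
```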

mu'o mi'e xorxes







posts: 1912


> I think a syllable should be allowed to end with "syllabic-consonant
> consonant". That would allow {tarksako} (though I'm willing to change that to
> {traksako} if {tarksako} is invalid) and {tirkce}.

That would also allow {fasxolarkto}, and the sometimes-occurring {sumpti}.
Those are still reasonably close to "normal" words. I will add that then.

> As to cmene, how about allowing an initial consonant cluster after the final
> syllable?

That's what I have now, but if arkprstr. is a valid cmene, shouldn't
arkprstn. also be allowed?

mu'o mi'e xorxes







On Friday 18 February 2005 08:35, Jorge "Llambías" wrote:
> That would also allow {fasxolarkto}, and the sometimes-occurring {sumpti}.
> Those are still reasonably close to "normal" words. I will add that then.

That is an error for {sumti}, not a real word, right? Google turned up mostly
a Latin word.

> That's what I have now, but if arkprstr. is a valid cmene, shouldn't
> arkprstn. also be allowed?

Hmm, that's a tricky one. {ark,pr,str.}, but if I try that with {arkprstn},
{stn} is not a valid initial. But I pronounce both of them with the final
letter vocalic. Does that mean anything, or is it a made-up example?

phma
--
li fi'u vu'u fi'u fi'u du li pa


posts: 1912


> On Friday 18 February 2005 08:35, Jorge "Llambías" wrote:
> > That would also allow {fasxolarkto}, and the sometimes-occurring {sumpti}.
> > Those are still reasonably close to "normal" words. I will add that then.
>
> That is an error for {sumti}, not a real word, right? Google turned up mostly
> a Latin word.

Yes, I know. Anyway, the -mpt- would be allowed, and so has to be clearly
distinguished from -mt-.

> > That's what I have now, but if arkprstr. is a valid cmene, shouldn't
> > arkprstn. also be allowed?
>
> Hmm, that's a tricky one. {ark,pr,str.}, but if I try that with {arkprstn},
> {stn} is not a valid initial. But I pronounce both of them with the final
> letter vocalic. Does that mean anything, or is it a made-up example?

Made up.

I'm still not quite sure what to do with the final cluster in cmene.

mu'o mi'e xorxes








posts: 1912


Nick's lojban bit in <http://www.tlg.uci.edu/~opoudjis/>
is evidence that people don't pay attention to the "no la
in cmene" rule or to the required pause in front of cmene
not preceded by doi/la/lai/la'i.

I hear {.icoimi'enitcionIkolas.}

mu'o mi'e xorxes









posts: 14214

On Sat, Feb 19, 2005 at 10:21:22AM -0800, Jorge Llambías wrote:
>
> Nick's lojban bit in <http://www.tlg.uci.edu/~opoudjis/> is
> evidence that people don't pay attention to the "no la in cmene"
> rule or to the required pause in front of cmene not preceded by
> doi/la/lai/la'i.
>
> I hear {.icoimi'enitcionIkolas.}

.ie

-Robin


posts: 162

John E Clifford wrote:

>This is all very pretty and pretty well nigh
>pointless. To be sure, Lojban (and Loglan
>before it) claimed that every speech string could
>be broken automatically and unambiguously and
>correctly into lexical units.
>
I never understood it to be merely a claim, but rather as a key design
requirement. As with some others of this sort, speech-stream
resolvability was a highlight of the Jun 1960 Sci Am article, and thus
is a key component of what the public image of Loglan/Lojban always will be,

>That claim was
>made originally about a much simpler language
>(without fu'ivla and, indeed, without lujvo in
>the current sense). In that context it was
>fairly obviously true and never put to the test.
>It has been carried over through numerous changes
>in the language, still untested but now much less
>obviously true.
>
We have to *make* it true, even to the extent of restricting the
language design in those added areas like fu'ivla and experimental cmavo.

>The present exercise seems to be
>aimed at making it true of the current language
>by fiat,
>
Yes.

> without regard to what the language is
>about nor how it is constructed.
>
The language should have been constructed so as to make it true. It was
at the forefront of JCB's mind during GMR and in mine and Nora's and
Gary's and Tommy's in defining the new version.

>As a result,
>decisions about language structure are being made
>apparently on the basis of whether they fit into
>some intended algorithm and algorithms are being
>rejected on the basis of conceived possible
>counterexamples without regard to their reality.
>
>
This is one reason why I would prefer to keep fu'ivla more restricted in
their usage and adaptability, so as to not lead to playing games with
the morphology merely to make them more useful.

>The main problems seem to arise (here as in
>several other cases, e.g., "magic words") from
>areas so peripheral to Lojban as to be
>effectively ignorable. Here the problems seem to
>be mainly borrowings from other languages, names
>and fu'ivla. The one is a clearly demarcated
>area which can be dealt with in a couple of words
>("anything goes here") and the other — which
>should probably be treated in the same way — is
>meant to be a very minor matter,
>
I agree. I intended to keep fu'ivla a fairly limited space with people
using la'o and names for borrowings that aren't easily adapted into
Lojban phonology and morphology.

On the other hand ...
JCB spent a LOT of time trying to make his version of fu'ivla adaptable
to the importation of Linnean names as borrowings, even to the extent of
breaking TLI Loglan by the use of doubled letters and h in ways not
consistent with any other aspect of the language history. Consistency
with JCB's policy would make me more sympathetic to Pierre's extensive
efforts at pushing the limits of Lojban borrowing for the same purpose
EXCEPT that I realize that what ended up was a broken misfit to the rest
of Loglan and I don't want that mistake repeated. The language should
NOT be dominated by fu'ivla, and people should use the clumsier methods
of borrowing until/unless it is shown that a word really needs to be
borrowed. I once saw an example showing Chinese text including Linneans
in Romanized text in the middle of the Chinese characters, showing how
uncompromising they are with the language design. Lojban should do the
same rather than break the language to make borrowing "easier" but less
intuitive.

>to be used when we can't come up with a Lojban expression in real
>time or when we want to give a discourse some
>local color. Not, in short, about permanent
>additions to Lojban that need rafsi and
>combinatorial devices of that sort at all.
>(Fu'ivla were probably a major error in judgment
>to begin with, since their role is more
>efficiently performed by existing devices like
>non-lojban quotations.
>
I would agree, except that JCB was correct in realizing that the market
for Lojban is mostly technical people, and technical people use a lot of
jargon which is best dealt with as fu'ivla. Non-Lojban quotation for
frequently used jargon would make the language ridiculous. But the key
should be "frequently used", and words should see that frequent use or
occur in ways that make non-lojban quotation especially clumsy (e.g. any
time a borrowing is to be the selbri of a sentence, I think it needs to
be a Lojban word and not a converted quotation of non-Lojban).

The second exception is my concession to the battle over culture words.
Accepting that the set of culture words in the language is inherently
biased, we need ways to make maximally useful culture words for the
cultures that are not given gismu. This was made evident when two of
our earliest skilled Lojbanists were Finnish and Bulgarian. It was to
deal with this problem that we stretched to allow rafsi fu'ivla as an
extension to gismu space.

All other sorts of compounding can be adequately dealt with using zei,
until and unless we find a large number of instances of zei compounds
that fit some pattern, at which point we should deal with the problem at
that time and not before. I rejected JCB's breaking of the language to
allow name and lerfu lujvo (e.g. a lujvo for "X-ray").


> But, given that we have
>them, they, hairs from the tip of the tail,
>should not be used to wag the whole
>morphophonemic dog.)
>
>
Except for the matters I noted, I agree.

>But, if you must do
>this stuff, do it both ways: don't cut things off
>because this messes with some other plan, deal with
>what one might actually want to borrow, and don't
>go looking for trouble which cannot arise from
>real language cases.
>
I tend to agree, but part of the problem is that we don't really have
that much expertise in non-IndoEuropean languages and what sorts of
things someone might want to borrow from and what problems might arise.
I thus can see the sense in testing the algorithm for robustness if
someone tries to borrow something unexpected, even though I don't favor
stretching the language to make more things borrowable without
mangling. There simply is no way that Lojban will adequately allow
borrowing from click languages %^)

> (On the first point, it
>should be noted that cmene as now constituted
>leave out at least one major device for creating
>names in other languages, which probably should
>find room somewhere if we are going to allow
>borrowings in any systematic way: descriptive
>sentences like "They are afraid of even his
>horses.")
>
>
Explicitly dealt with long ago. You can put "la" on any descriptive
selbri. Ignoring the "even" in the above (which is one of those
translation problems I never saw agreement on)
la terpa be le xirma
is perfectly acceptable Lojban.

lojbab



posts: 1912


> I intended to keep fu'ivla a fairly limited space with people
> using la'o and names for borrowings that aren't easily adapted into
> Lojban phonology and morphology.

I don't see fu'ivla competing with la'o and cmevla at all.
The competitors of fu'ivla are lujvo, whereas la'o and cmevla
compete between themselves in a different area.

For example, in trying to find a word for "crocodile", we may
choose between the fu'ivla {krokodilo}, {resprkrokodilo}
or some lujvo {***respa}, but no cmevla or la'o would be
appropriate. {la krokodail} or {la'o gy crocodile gy} would
most likely refer to a person named Crocodile.

Similarly, for the name "Enrique", we can choose between
for example {la .enrikes.} or {la'o sy Enrique sy}, but
making a fu'ivla {.enrike} would be silly.

So I don't really see fu'ivla having much to do with cmevla
or la'o.

> > (On the first point, it
> >should be noted that cmene as now constituted
> >leave out at least one major device for creating
> >names in other languages, which probably should
> >find room somewhere if we are going to allow
> >borrowings in any systematic way: descriptive
> >sentences like "They are afraid of even his
> >horses.")
> >
> Explicitly dealt with long ago. You can put "la" on any descriptive
> selbri. Ignoring the "even" in the above (which is one of those
> translation problems I never saw agreement on)
> la terpa be le xirma
> is perfectly acceptable Lojban.

"They are fraid of even his horses" is more like
{la prenu poi lo ke'a xirma ji'a sai cu se terpa},
i.e. not one who is afraid of horses but one whose
horses even cause fear in others.

mu'o mi'e xorxes






posts: 2388



> John E Clifford wrote:
>
> >This is all very pretty and pretty well nigh
> >pointless. To be sure, Lojban (and Loglan
> >before it) claimed that every speech string
> could
> >be broken automatically and unambiguously and
> >correctly into lexical units.
> >
> I never understood it to be merely a claim, but
> rather as a key design
> requirement. As with some others of this sort,
> speech-stream
> resolvability was a highlight of the Jun 1960
> Sci Am article, and thus
> is a key component of what the public image of
> Loglan/Lojban always will be.

Yes, it was a design parameter, but, as noted,
not one ever actually tested for. JCB — and the
rest of us, most of the time — simply intuited
that it was working — or not.

> >That claim was
> >made originally about a much simpler language
> >(without fu'ivla and, indeed, without lujvo in
> >the current sense). In that context it was
> >fairly obviously true and never put to the
> test.
> >It has been carried over through numerous
> changes
> >in the language, still untested but now much
> less
> >obviously true.
> >
> We have to *make* it true, even to the extent
> of restricting the
> language design in those added areas like
> fu'ivla and experimental cmavo.

Again, not something ever tested for until very
recently, when it was discovered that it didn't
work very well at all.

> >The present exercise seems to be
> >aimed at making it true of the current
> language
> >by fiat,
> >
> Yes.
>
> > without regard to what the language is
> >about nor how it is constructed.
> >
> The language should have been constructed so as
> to make it true. It was
> at the forefront of JCB's mind during GMR and
> in mine and Nora's and
> Gary's and Tommy's in defining the new version.
>
> >As a result,
> >decisions about language structure are being
> made
> >apparently on the basis of whether they fit
> into
> >some intended algorithm and algorithms are
> being
> >rejected on the basis of conceived possible
> >counterexamples without regard to their
> reality.
> >
> >
> This is one reason why I would prefer to keep
> fu'ivla more restricted in
> their usage and adaptability, so as to not lead
> to playing games with
> the morphology merely to make them more useful.

I am now inclined to think it is a reason — and
more come later in this essay — to do away with
fuhivla altogether and introduce borrowings in an
entirely different way, akin to quotations but
syntactically more flexible. That is, demarcated
items that are not looked at internally but just
taken as a lump, with (virtually) anything
allowed within the demarcations.

> >The main problems seem to arise (here as in
> >several other cases, e.g., "magic words") from
> >areas so peripheral to Lojban as to be
> >effectively ignorable. Here the problems seem
> to
> >be mainly borrowings from other languages,
> names
> >and fu'ivla. The one is a clearly demarcated
> >area which can be dealt with in a couple of
> words
> >("anything goes here") and the other — which
> >should probably be treated in the same way --
> is
> >meant to be a very minor matter,
> >
> I agree. I intended to keep fu'ivla a fairly
> limited space with people
> using la'o and names for borrowings that aren't
> easily adapted into
> Lojban phonology and morphology.
>
> On the other hand ...
> JCB spent a LOT of time trying to make his
> version of fu'ivla adaptable
> to the importation of Linnean names as
> borrowings, even to the extent of
> breaking TLI Loglan by the use of doubled
> letters and h in ways not
> consistent with any other aspect of the
> language history. Consistency
> with JCB's policy would make me more
> sympathetic to Pierre's extensive
> efforts at pushing the limits of Lojban
> borrowing for the same purpose
> EXCEPT that I realize that what ended up was a
> broken misfit to the rest
> of Loglan and I don't want that mistake
> repeated. The language should
> NOT be dominated by fu'ivla, and people should
> use the clumsier methods
> of borrowing until/unless it is shown that a
> word really needs to be
> borrowed. I once saw an example showing
> Chinese text including Linneans
> in Romanized text in the middle of the Chinese
> characters., showing how
> uncompromising they are with the language
> design. Lojban should do the
> same rather than break the language to make
> borrowing "easier" but less
> intuitive.

Amen. Although "easy and more intuitive" would
be nice — with "easy" being the first to go.

> >to be used when we can't come up with a Lojban expression in real
> >time or when we want to give a discourse some
> >local color. Not, in short, about permanent
> >additions to Lojban that need rafsi and
> >combinatorial devices of that sort at all.
> >(Fu'ivla were probably a major error in judgment
> >to begin with, since their role is more
> >efficiently performed by existing devices like
> >non-lojban quotations.
> >
> I would agree, except that JCB was correct in
> realizing that the market
> for Lojban is mostly technical people, and
> technical people use a lot of
> jargon which is best dealt with as fu'ivla.
> Non-Lojban quotation for
> frequently used jargon would make the language
> ridiculous. But the key
> should be "frequently used", and words should
> see that frequent use or
> occur in ways that make non-lojban quotation
> especially clumsy (e.g. any
> time a borrowing is to be the selbri of a
> sentence, I think it needs to
> be a Lojban word and not a converted quotation
> of non-Lojban).

The solution is not to permanentize borrowing but
to simplify indigenous language expansion. If
we need jargon, create Lojban jargon — perhaps
based on borrowed forms, but fully Lojban
internally.

> The second exception is my concession to the
> battle over culture words.
> Accepting that the set of culture words in the
> language is inherently
> biased, we need ways to make maximally useful
> culture words for the
> cultures that are not given gismu. This was
> made evident when two of
> our earliest skilled Lojbanists were Finnish
> and Bulgarian. It was to
> deal with this problem that we stretched to
> allow rafsi fu'ivla as an
> extension to gismu space.

This seems to be an example of what I take to be
the way to deal with innovations with staying
power (though I don't see the culture words
getting a lot of actual use, just political
correctness).

> All other sorts of compounding can be
> adequately dealt with using zei,
> until and unless we find a large number of
> instances of zei compounds
> that fit some pattern, at which point we should
> deal with the problem at
> that time and not before. I rejected JCB's
> breaking of the language to
> allow name and lerfu lujvo (eg a lujvo for
> "X-ray)
>
>
> > But, given that we have
> >them, they, hairs from the tip of the tail,
> >should not be used to wag the whole
> >morphophonemic dog.)
> >
> >
> Except for the matters I noted, I agree.
>
> >But, if you must do
> >this stuff, do it both ways: don't cut things off
> >because this messes with some other plan, deal with
> >what one might actually want to borrow, and don't
> >go looking for trouble which cannot arise from
> >real language cases.
> >
> I tend to agree, but part of the problem is that we don't really have
> that much expertise in non-IndoEuropean languages and what sorts of
> things someone might want to borrow from and what problems might arise.
> I thus can see the sense in testing the algorithm for robustness if
> someone tries to borrow something unexpected, even though I don't favor
> stretching the language to make more things borrowable without
> mangling. There simply is no way that Lojban will adequately allow
> borrowing from click languages %^)

A good argument for placing no limits on what can
go into borrowings. At the worst, Lojban
pronunciation will be buffered (we don't seem to
talk about that much any more), the doubly
articulated stops getting separated
articulations, for example — just as they do in
English and most non-African languages, and as
"simple" compounds do in rigorously CV languages.

>> (On the first point, it
>>should be noted that cmene as now constituted
>>leave out at least one major device for creating
>>names in other languages, which probably should
>>find room somewhere if we are going to allow
>>borrowings in any systematic way: descriptive
>>sentences like "They are fraid of even his
>>horses.")
>>
>>
>Explicitly dealt with long ago. You can put "la"
>on any descriptive
>selbri. Ignoring the "even" in the above (which
>is one of those
>translation problems I never saw agreement on)
>la terpa be le xirma
>is perfectly acceptable Lojban.

Dealt with and screwed up. The result is the old
denigrating English version "Afraid of horses"
rather than the original which has the named's
horses the object of fear to everyone else, not
other horses the object of fear to the named.
How would the real name go into Lojban? The same
discussion which came up with {la terpa be le
xirma} failed to come up with this one; the
nearest was something like "behorsed by what is
feared by all" but "behorsed" didn't come very
easily.




posts: 2388


wrote:

>
> --- Bob LeChevalier wrote:
> > I intended to keep fu'ivla a fairly limited
> space with people
> > using la'o and names for borrowings that
> aren't easily adapted into
> > Lojban phonology and morphology.
>
> I don't see fu'ivla competing with la'o and
> cmevla at all.
> The competitors of fu'ivla are lujvo, whereas
> la'o and cmevla
> compete between themselves in a different area.
>
> For example, in trying to find a word for
> "crocodile", we may
> choose between the fu'ivla {krokodilo},
> {resprkrokodilo}
> or some lujvo {***respa}, but no cmevla or la'o
> would be
> appropriate. {la krokodail} or {la'o gy
> crocodile gy} would
> most likely refer to a person named Crocodile.
>
> Similarly, for the name "Enrique", we can
> choose between
> for example {la .enrikes.} or {la'o sy Enrique
> sy}, but
> making a fu'ivla {.enrike} would be silly.
>
> So I don't really see fu'ivla having much to do
> with cmevla
> or la'o.

The similarity is just that they all arise from
trying to deal with non-Lojban in Lojban. For
both cmevla and fuhivla the attempt is to put
some Lojban controls on this dealing. {la'o}
leaves the original language pretty much intact
but demarcates its occurrence, as cmevla and
fuhivla do not (at least not nearly as clearly).
Clearly, cmevla and fuhivla fall into different
grammatical categories and thus do not conflict,
although, in fact, the two categories are
interdefinable: we can reduce sumti to predicates
and predicates to sumti within standard Lojban.
So, in fact, simplification and clarification in
one area can easily bring simplicity and
clarity in both.

> > > (On the first point, it
> > >should be noted that cmene as now
> constituted
> > >leave out at least one major device for
> creating
> > >names in other languages, which probably
> should
> > >find room somewhere if we are going to allow
> > >borrowings in any systematic way:
> descriptive
> > >sentences like "They are fraid of even his
> > >horses.")
> > >
> > Explicitly dealt with long ago. You can put
> "la" on any descriptive
> > selbri. Ignoring the "even" in the above
> (which is one of those
> > translation problems I never saw agreement
> on)
> > la terpa be le xirma
> > is perfectly acceptable Lojban.
>
> "They are fraid of even his horses" is more
> like
> {la prenu poi lo ke'a xirma ji'a sai cu se
> terpa},
> i.e. not one who is afraid of horses but one
> whose
> horses even cause fear in others.
>
Well "even whose horses," but surely better. Not
very snappy though.


On Monday 21 February 2005 09:52, Jorge "Llambías" wrote:
> For example, in trying to find a word for "crocodile", we may
> choose between the fu'ivla {krokodilo}, {resprkrokodilo}
> or some lujvo {***respa}, but no cmevla or la'o would be
> appropriate. {la krokodail} or {la'o gy crocodile gy} would
> most likely refer to a person named Crocodile.

You're forgetting the type-1 and type-2 fu'ivla, which are respectively {me
la'o zoi Crocodylus. zoi} and {me la krokodilos.}.

phma
--
Without glasses, I can't even distinguish smells...
-Les Perles de la médecine


posts: 1912


> On Monday 21 February 2005 09:52, Jorge "Llambías" wrote:
> > For example, in trying to find a word for "crocodile", we may
> > choose between the fu'ivla {krokodilo}, {resprkrokodilo}
> > or some lujvo {***respa}, but no cmevla or la'o would be
> > appropriate. {la krokodail} or {la'o gy crocodile gy} would
> > most likely refer to a person named Crocodile.
>
> You're forgetting the type-1 and type-2 fu'ivla, which are respectively {me
> la'o zoi Crocodylus. zoi} and {me la krokodilos.}.

But they are not equivalent. If {la krokodilos} refers to a person
called "Crocodeelos", then {me la krokodilos} means "x1 is Crocodeelos".
Not the same as {krokodilo}.

I would take {me la'o zoi. Crocodylus .zoi.} to be "x1 is the species
Crocodylus". We could use {danlu be la'o zoi. Crocodylus .zoi.}
instead of {krokodilo}, but that's an indirect use of la'o.

mu'o mi'e xorxes






posts: 1912


> The similarity is just that they all arise feom
> trying to deal with non-Lojban in Lojban. For
> both cmevla and fuihivla the attempt is to put
> some Lojban controls on this dealing. {la'o}
> leaves the original language pretty much intact
> but demarcates its occurrence, as cmevla and
> fuhivla do not (at least not nearly as clearly).

la'o and zoi demarcate non-Lojban, yes.

cmevla and fu'ivla are Lojban as much as cmavo,
gismu and lujvo. That's why they have to comply
with Lojban morphology.

> > > > (On the first point, it
> > > >should be noted that cmene as now
> > constituted
> > > >leave out at least one major device for
> > creating
> > > >names in other languages, which probably
> > should
> > > >find room somewhere if we are going to allow
> > > >borrowings in any systematic way:
> > descriptive
> > > >sentences like "They are fraid of even his
> > > >horses.")
> >
> > {la prenu poi lo ke'a xirma ji'a sai cu se
> > terpa},
> > i.e. not one who is afraid of horses but one
> > whose
> > horses even cause fear in others.
> >
> Well "even whose horses," but surely better. Not
> very snappy though.

Just as snappy as the corresponding description.

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > The similarity is just that they all arise
> feom
> > trying to deal with non-Lojban in Lojban.
> For
> > both cmevla and fuihivla the attempt is to
> put
> > some Lojban controls on this dealing. {la'o}
> > leaves the original language pretty much
> intact
> > but demarcates its occurrence, as cmevla and
> > fuhivla do not (at least not nearly as
> clearly).
>
> la'o and zoi demarcate non-Lojban, yes.
>
> cmevla and fu'ivla are Lojban as much as cmavo,
> gismu and lujvo. That's why they have to comply
> with Lojban morphology.

And therein lies the problem. Foreign words are
not Lojban and are inevitably screwed over in
trying to make them such. Once the concept --
and its word -- are naturalized to Lojban they
can become perfectly normal basic Lojban; until
then it is simpler to leave them foreign, given
that Lojban has fairly strict phonetic
constraints.


> > > > > (On the first point, it
> > > > >should be noted that cmene as now
> > > constituted
> > > > >leave out at least one major device for
> > > creating
> > > > >names in other languages, which probably
> > > should
> > > > >find room somewhere if we are going to
> allow
> > > > >borrowings in any systematic way:
> > > descriptive
> > > > >sentences like "They are fraid of even
> his
> > > > >horses.")
> > >
> > > {la prenu poi lo ke'a xirma ji'a sai cu se
> > > terpa},
> > > i.e. not one who is afraid of horses but
> one
> > > whose
> > > horses even cause fear in others.
> > >
> > Well "even whose horses," but surely better.
> Not
> > very snappy though.
>
> Just as snappy as the corresponding
> description.
>
In English, I suppose you meant. Not quite true,
as a count shows, but the goal is something like
the original, which {terpa le xirma} at least
approached were it not otherwise totally
misguided.


posts: 1912


> > cmevla and fu'ivla are Lojban as much as cmavo,
> > gismu and lujvo. That's why they have to comply
> > with Lojban morphology.
>
> And therein lies the problem. foreign words are
> not Lojban and are inevitably screwed over in
> trying to make them such.

cmevla and fu'ivla are not foreign words in Lojban.

> Once the concept --
> and its word — are naturalized to Lojaban they
> can become perfectly normal basic Lojban, until
> then it is simpler to leave them foreign, given
> that Lojban has fairly strict phonetic
> constraints.

As long as they remain foreign, they are not cmevla
or fu'ivla (at least not morphological fu'ivla, i.e.
those forms with general Lojban morphological constraints,
that end in a vowel, have penultimate stress and cannot
break as cmavo or lujvo).
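
A rough Python sketch of two of the constraints just listed (ending in a vowel, and not falling apart into cmavo) may make the distinction concrete. It is only an illustration under loud assumptions: the function names are made up here, the cmavo pattern is deliberately simplified, and stress, {y}, and the test for breaking as a lujvo are not modelled at all, so it is in no way the PEG grammar under discussion.

```python
import re

VOWELS = "aeiou"
CONSONANTS = "bcdfgjklmnprstvxz"

# One cmavo-shaped chunk (simplified assumption): an optional consonant,
# then vowels, optionally continued through apostrophes, e.g. "la'o".
_CMAVO_CHUNK = rf"[{CONSONANTS}]?[{VOWELS}]+(?:'[{VOWELS}]+)*"
_ALL_CMAVO = re.compile(rf"(?:{_CMAVO_CHUNK})+")


def normalize(word: str) -> str:
    """Lower-case the word and drop pause dots and commas."""
    return word.lower().replace(".", "").replace(",", "")


def ends_in_vowel(word: str) -> bool:
    """True if the last letter is a vowel (cmevla end in a consonant)."""
    w = normalize(word)
    return bool(w) and w[-1] in VOWELS


def breaks_into_cmavo(word: str) -> bool:
    """True if the whole word is just a run of cmavo-shaped chunks."""
    return _ALL_CMAVO.fullmatch(normalize(word)) is not None


def rough_shape(word: str) -> str:
    """Hypothetical coarse label; lujvo and stress checks are NOT done."""
    if not ends_in_vowel(word):
        return "cmevla-shaped"
    if breaks_into_cmavo(word):
        return "cmavo-compound"
    return "brivla-candidate"


if __name__ == "__main__":
    for w in ["krokodilo", "krokodilos.", "lenu", ".enrike"]:
        print(w, "->", rough_shape(w))
```

A word that survives both checks is still only a candidate: telling a fu'ivla from a gismu or lujvo needs the rest of the morphology, which this sketch leaves out.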

fu'ivla are in most respects like gismu, since gismu too
were based on foreign words for their creation.

The main differences are:

1- most gismu were based on five foreign words, whereas most
fu'ivla are based on just one (but there are exceptions in
both cases).
2- gismu have a much more restricted morphological space to
accommodate the source word forms, and more words to accommodate
in one form, so the distortions are much bigger. (In fact most
source words are unrecognizable in most gismu.)
3- gismu is a practically closed class, whereas fu'ivla is
an open one.

But other than their origin, once they are created they work
in all respects in the same way. They both differ from lujvo
in that they don't have an internal structure.

> > > > {la prenu poi lo ke'a xirma ji'a sai cu se
> > > > terpa},
> > >
> > > Not
> > > very snappy though.
> >
> > Just as snappy as the corresponding
> > description.
> >
> In English, I suppose you meant.

No, I mean in Lojban:

{la prenu poi lo ke'a xirma ji'a sai cu se terpa}
vs. {le prenu poi lo ke'a xirma ji'a sai cu se terpa}.

If you can think of a snappier description in Lojban,
you automatically get a snappier name.

mu'o mi'e xorxes








posts: 162

John E Clifford wrote:
> --- Bob LeChevalier <lojbab@lojban.org> wrote:
>>
> Yes, it was a design parameter, but, as noted,
> not one ever actually tested for. JCB — and the
> rest of us, most of the time — simply intuited
> that it was working — or not.

I'm not sure what sort of testing went into GMR, but I thought that it
was more than intuition. I think he got sloppy when he added in Linnean
borrowings though.

>>We have to *make* it true, even to the extent
>>of restricting the
>>language design in those added areas like
>>fu'ivla and experimental cmavo.
>
> Again, not something ever tested for until very
> recently, when it was discovered that it didn't
> work very well at all.

It does, provided that people don't try to push the limits. Remember
that adding type IV fu'ivla was expected to be something unusual, based
on usage, and would not be ad-hoc, and would not be allowed to push the
limits. Type III fu'ivla, with the rafsi on the front, don't break
anything, so far as I know.

>>This is one reason why I would prefer to keep
>>fu'ivla more restricted in
>>their usage and adaptability, so as to not lead
>>to playing games with
>>the morphology merely to make them more useful.
>
>
> I am now inclined to think it is a reason — and
> more come later in this essay — to do away with
> fuhivla altogether and introduce borrowings in an
> entirely different way, akin to quotations but
> syntactically more flexible.

We already have that in the form of la'o quotes. That was always
intended to be the primary mode of ad-hoc borrowing, but no one ever
used it, preferring to make Type IIIs, and in Pierre's case Type IVs
rampantly, and without especial regard for whether they broke the
restrictions that were built into the morphology based on algorithmic
needs. Usage decided against what you and I might prefer (I'll admit
that I did not use la'o much either unless the borrowing was from a
language/word whose phonology would be mangled by Lojbanizing it.)

I think people want to be able to use borrowings in tanru (and in some
cases in lujvo), and while the language has constructions that allow a
tanru usage for la'o, few people stop to puzzle them out.

> That is, demarcated
> items that are not looked at internally but just
> taken as a lump, with (virtually) anything
> allowed within the demarcations.

Precisely what la'o does. No changes needed, just a willingness to use
what is built into the language.

> The solution is not to permanentize borrowing but
> to simplify indigeneous language expansion. If
> we need jargon, create Lojban jargon — perhaps
> based on borrowed forms, but fully Lojban
> internally.

I'd love that, except that it goes against the grain of those who want
semantically analytical lujvo. Metaphorical lujvo would be the most
likely way to create Lojban-internal jargon, possibly adding on a
classifier-rafsi like is done for type III fu'ivla. But those methods
don't stand a chance when the primary borrowings being created are words
for the 20 million plus Linnean names and culture words and the
occasional bit of computer terminology.

>>The second exception is my concession to the
>>battle over culture words.
>>Accepting that the set of culture words in the
>>language is inherently
>>biased, we need ways to make maximally useful
>>culture words for the
>>cultures that are not given gismu. This was
>>made evident when two of
>>our earliest skilled Lojbanists were Finnish
>>and Bulgarian. It was to
>>deal with this problem that we stretched to
>>allow rafsi fu'ivla as an
>>extension to gismu space.
>
> This seems to be an example of what I take to be
> the way to deal with innovations with staying
> power (though I don't see the culture words
> getting a lot of actual use, just political
> correctness).

I think that they get more use than any other fu'ivla. More time is
spent making fu'ivla than actually using them, I suspect.

>>>names in other languages, which probably should
>>>find room somewhere if we are going to allow
>>>borrowings in any systematic way: descriptive
>>>sentences like "They are fraid of even his
>>>horses.")
>>
>>Explicitly dealt with long ago. You can put "la"
>>on any descriptive
>>selbri. Ignoring the "even" in the above (which
>>is one of those
>>translation problems I never saw agreement on)
>>la terpa be le xirma
>>is perfectly acceptable Lojban.
>
> Dealt with and screwed up. The result is the old
> denigrating English version "Afraid of horses"
> rather than the original which has the named's
> horses the object of fear to everyone else, not
> other horses the object of fear to the named.

That just means that I was too lazy to translate the description
properly, not that we can't translate it properly. I did not carefully
analyze the original, and so missed that the role of the person being
named was not to be afraid.

la prenu poi le ke'a xirma cu banzu lenu selte'a
Person-whose-horses-are-sufficient-to-evoke-fear

You could be more explicit than the original and say "le ke'a xirma ja
ke'a" - either he or his horses suffice to cause fear - which I did not
read into your original until you explained it.

> How would the real name go into Lojban? The same
> discussion which came up with {la terpa be le
> xirma}

I just came up with it while typing that message; I didn't discuss it,
and alas I didn't think much about the expression to realize it was a
triviality like "Dances-With-Wolves" (which of course is easy to
translate literally, but difficult to capture connotations). Maybe
someone else made the same error I did in some prior discussion, but I
know of no such discussion.

>failed to come up with this one; the
> nearest was something like "behorsed by what his
> feared by all" but "behorsed" didn't come very
> easily.

If you want the "all" in there, which was not evident from the "They" in
English, add "da'ada" to the end (evokes fear in all-but-one i.e. himself).

The real weakness is that Lojban probably needs more UI3 discursives (or
previously semantically-analyzed brivla) for things like "even", "mere",
"just", along with the po'o "only". There are those who want these
logically analytical, and those who want a short form that saves the
analysis. Lacking the analysis in a form short enough to say without
thinking, people simply skip trying to express it and make errors. I
think this is one of the real shortcomings of the language at present;
words that are common in other languages have no easy expression in
Lojban, and few are competent to analyze the logical implications of the
words they are trying to translate.

Having failed to get cmavo for them, at this point I would try for
brivla and not try to make them dikyjvo, but coming up with a list, and
a good pattern for making them as lujvo is something I don't feel
linguistically or logically competent to do.

lojbab



posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > > cmevla and fu'ivla are Lojban as much as
> cmavo,
> > > gismu and lujvo. That's why they have to
> comply
> > > with Lojban morphology.
> >
> > And therein lies the problem. foreign words
> are
> > not Lojban and are inevitably screwed over in
> > trying to make them such.
>
> cmevla and fu'ivla are not foreign words in
> Lojban.

Well, we are playing with an ambiguity here,
"foreign words in Lojban." They are Lojban words
by fiat, but are meant to represent foreign words
under a variety of arbitrary restrictions
(arbitrary with relation to the original words,
systematic within Lojban). As such, they clearly
complicate Lojban and often result in
unrecognizable versions of the original word. In
both ways they lower the value of Lojban below
what it might have if it simply allowed foreign
words -- marked as such -- to be used freely. To
be sure, occasionally such a word would become so
commonly needed as to suggest that it -- or some
version of it -- become a nativized word. But
those occasions are going to be rare (after a
short initial period of getting the current word
stock up to speed) and not worth complicating the
rest of the system for.

> > Once the concept --
> > and its word — are naturalized to Lojaban
> they
> > can become perfectly normal basic Lojban,
> until
> > then it is simpler to leave them foreign,
> given
> > that Lojban has fairly strict phonetic
> > constraints.
>
> As long as they remain foreign, they are not
> cmevla
> or fu'ivla (at least not morphological fu'ivla,
> i.e.
> those forms with general Lojban morphological
> constraints,
> that end in a vowel, have penultimate stress
> and cannot
> break as cmavo or lujvo).

This is, of course, precisely what I am
suggesting be changed (well, not precisely, but
near enough). That is, as with most languages,
Lojban could have a means of using foreign
expressions in Lojban syntax without having to
make them over into Lojban words. There are
already a variety of — unduly complex — devices
for this in Lojban but they are rarely used since
there seems to be a perverse pleasure in creating
fuhivla horrors to obscure largely unneeded
foreign words.

> fu'ivla are in most respects like gismu, since
> gismu too
> were based on foreign words for their creation.
>
> The main differences are:
>
> 1- most gismu were based on five foreign words,
> whereas most
> fu'ivla are based on just one (but there are
> exceptions in
> both cases).
> 2- gismu have a much more restricted
> morphological space to
> accomodate the source word forms, and more
> words to accomodate
> in one form, so the distortions are much
> bigger. (In fact most
> source words are unrecognizable in most gismu.)
> 3- gismu is a practically closed class, whereas
> fu'ivla is
> an open one.
>
> But other than their origin, once they are
> created they work
> in all respects in the same way. They both
> differ from lujvo
> in that they don't have an internal structure.

Relevance? The current situation is clear, as are
the difficulties that ensue from it. That does
not speak to the desirability of changing that
situation nor the excesses that have given rise
to it.

> > > > > {la prenu poi lo ke'a xirma ji'a sai cu
> se
> > > > > terpa},
> > > >
> > > > Not
> > > > very snappy though.
> > >
> > > Just as snappy as the corresponding
> > > description.
> > >
> > In English, I suppose you meant.
>
> No, I mean in Lojban:
>
> {la prenu poi lo ke'a xirma ji'a sai cu se
> terpa}
> vs. {le prenu poi lo ke'a xirma ji'a sai cu se
> terpa}.
>
> If you can think of a snappier description in
> Lojban,
> you automatically get a snappier name.
>



posts: 162

Jorge Llambías wrote:
> --- John E Clifford wrote:
>>>cmevla and fu'ivla are Lojban as much as cmavo,
>>>gismu and lujvo. That's why they have to comply
>>>with Lojban morphology.
>>
>>And therein lies the problem. foreign words are
>>not Lojban and are inevitably screwed over in
>>trying to make them such.
>
> cmevla and fu'ivla are not foreign words in Lojban.

In one sense, correct, which is why they should be made carefully.
Adhockery should suffice for communication but not be considered a
full-fledged part of the language, and we don't have enough usage to
warrant going beyond adhockery in most cases of cmevla and fu'ivla.

On the other hand, in another sense they are still foreign words in
Lojban, in the same sense that "deja vu", "esprit de corps", "ad hoc"
and other borrowings aren't really English. They haven't really lost
their foreign quality, and indeed are often italicized in English to
emphasize this. By comparison, brassiere has become a true borrowing,
since it apparently doesn't have the same meaning in the French from
which it was borrowed - it has been fully Anglicized as to meaning. I
use "ad hoc" as a true borrowing as well, as one can see by my use of
"adhockery" in this message, which uses a pattern of word formation
permissible for real English words. But that is not standard English,
so "ad hoc" is not yet a standard English word.

The corresponding process for Lojban would be when a fu'ivla is
sufficiently well-defined so as to have a non-trivial multi-place place
structure - something whose meaning goes beyond the process which
created it. Then and only then can we clearly say that a fu'ivla is as
much a Lojban word as a gismu, since gismu were in fact designed to have
meaning only loosely tied to the specific denotation of the words from
which they were derived.

>>Once the concept --
>>and its word — are naturalized to Lojaban they
>>can become perfectly normal basic Lojban, until
>>then it is simpler to leave them foreign, given
>>that Lojban has fairly strict phonetic
>>constraints.
>
> As long as they remain foreign, they are not cmevla
> or fu'ivla (at least not morphological fu'ivla, i.e.
> those forms with general Lojban morphological constraints,
> that end in a vowel, have penultimate stress and cannot
> break as cmavo or lujvo).
>
> fu'ivla are in most respects like gismu, since gismu too
> were based on foreign words for their creation.

I disagree because of the lack of care in defining them as predicates,
and the lack of consideration for their being useful in the ways that
gismu are designed to be useful.

> The main differences are:
>
> 1- most gismu were based on five foreign words, whereas most
> fu'ivla are based on just one (but there are exceptions in
> both cases).
> 2- gismu have a much more restricted morphological space to
> accomodate the source word forms, and more words to accomodate
> in one form, so the distortions are much bigger. (In fact most
> source words are unrecognizable in most gismu.)
> 3- gismu is a practically closed class, whereas fu'ivla is
> an open one.

Those are the morphological differences, which stem trivially from
how the words were made. The reason why gismu are in a class unto
themselves is more than morphological. They are considered the
fundamental roots of the language from which most other true-Lojban
words should be built. They were designed to have predicate semantics
with multiple places, and not merely be nouns, verbs, or adjectives.
They were often chosen specifically because they would be useful in
tanru or lujvo, and needed short forms because of the implications of
that usefulness.

fu'ivla tend to retain the grammatical nature of the words that they
were borrowed from. If the word coiner bothers to make a place
structure for the fu'ivla, it will be because he can copy a pattern of a
similar word, e.g. x1 is a XXXX of species/breed x2.

> But other than their origin, once they are created they work
> in all respects in the same way. They both differ from lujvo
> in that they don't have an internal structure.

But gismu differ because they have real place structures, in which
considerable time was invested to make them useful, and it was intended
that they be usable in multiple ways. fu'ivla that are coined ad hoc
simply to fill a need don't get that investment, and words coined
systematically, like the animal and plant species, have only slightly
more thought put into them.

By making a word from 6 (not 5) source languages, we commit to the word
having a meaning independent of any one language. A fu'ivla that is not
similarly made doesn't have that effort at independent meaning.

lojbab




posts: 1912


> John E Clifford wrote:
> > Again, not something ever tested for until very
> > recently, when it was discovered that it didn't
> > work very well at all.
>
> It does, provided that people don't try to push the limits. Remember
> that adding type IV fu'ivla was expected to be something unusual, based
> on usage, and would not be ad-hoc, and would not be allowed to push the
> limits. Type III fu'ivla with the rafsi on the front, don't break
> anything, so far as I know.

Whatever brokenness are you two talking about?

Neither type III, nor type IV fu'ivla break anything in
Lojban with respect to ambiguity in parsing of the speech
stream. All we've been discussing is how permissive we
should be with respect to consonant clusters and consonant
vowels in lojbanizing words, but there is no question of
any ambiguities arising in any case.

> We already have that in the form of la'o quotes. That was always
> intended to be the primary mode of ad-hoc borrowing, but no one ever
> used it, preferring to make Type IIIs, and in Pierre's case Type IVs
> rampantly, and without especial regard for whether they broke the
> restrictions that were built into the morphology based on algorithmic
> needs.

Nonsense. Pierre's fu'ivla follow CLL's rules very carefully and don't
break any restrictions imposed on the morphology.

> Usage decided against what you and I might prefer (I'll admit
> that I did not use la'o much either unless the borrowing was from a
> language/word whose phonology would be mangled by Lojbanizing it.)

There is hardly any usage of fu'ivla so I don't know what you
base this conclusion of what usage has decided on.

> > That is, demarcated
> > items that are not looked at internally but just
> > taken as a lump, with (virtually) anything
> > allowed within the demarcations.
>
> Precisely what la'o does. No changes needed, just a willingness to use
> what is built into the language.

{la'o} is for names though, not for general words. You could eventually
use {la'e zoi} for general words.

> More time is
> spent making fu'ivla than actually using them, I suspect.

Indeed. And that's not a bad thing, if you think about it.
That means fu'ivla are being created in the right categories
(such as animal names) where it is useful to have complete
lists. People don't tend to create them on-the-fly.

mu'o mi'e xorxes








posts: 1912


> They are Lojban words
> by fiat, but are meant to represent foreign words
> under a variety of arbitrary restrictions
> (arbitrary with relation to the original words,
> systematic within Lojban).

Apparently we have very different ideas of what
fu'ivla are. For me a word like {krokodilo} is the
ordinary Lojban word for English "crocodile" or
Spanish "cocodrilo". Obviously they all share the
same etymology, but that does not make them any less
Lojban/English/Spanish respectively.

> As such, they clearly
> complicate Lojban and often result in
> unrecognizable versions of the original word.

The original word need not be recognizable. You may
not even know what language it was borrowed from. In
order to use or understand it, all you need to know
is its Lojban meaning.

> In
> both ways they lower the value of Lojban below
> what it might have if it simply allowed foreign
> words — marked as such — to be used freely.

Not in my book. Lojban with embedded la'o and zoi
quoted strings when the quoted stuff is not relevantly
foreign looks decidedly ugly to me.

> There are
> already a variety of — unduly complex — devices
> for this in Lojban but they are rarely used since
> there seems to be a perverse pleasure in creating
> fuhivla horrors to obscure largely unneeded
> foreign words.

Are you sure foreign quotes are used less often than
fu'ivla? Have you taken any statistics from some corpus?
My impression is the opposite, but I haven't done any
serious research on the matter either.

> > But other than their origin, once they are
> > created they work
> > in all respects in the same way. They both
> > differ from lujvo
> > in that they don't have an internal structure.
>
> Relevance? The current situation is clear, as are
> the difficulties that ensue from it. That does
> not speak to the desirability of changing that
> situation nor the excesses that have given rise
> to it.

What difficulties? What excesses?

mu'o mi'e xorxes






posts: 149

Jorge Llambías scripsit:

> Similarly, for the name "Enrique", we can choose between
> for example {la .enrikes.} or {la'o sy Enrique sy}, but
> making a fu'ivla {.enrike} would be silly.

I don't see that .enrike is so much more ludicrous than merko,
particularly if it has a somewhat broader sense than "x1 is an
Enrique". It would not be absurd to use the fu'ivla _stelere_
'Steller's' in the tanru stelere blacpi, stelere datka,
stelere xasybakni. Similarly, the straightforward way to say
"Gibbs free energy" is probably me la gibz. -free nejni, but
using gibizi isn't silly. (I don't know how to say "available"
in Lojban, short of gubni.)

> "They are afraid of even his horses" is more like
> {la prenu poi lo ke'a xirma ji'a sai cu se terpa},
> i.e. not one who is afraid of horses but one whose
> horses even cause fear in others.

It would be nice to see an interlinear gloss of the original
Lakota name.

--
John Cowan cowan@ccil.org www.reutershealth.com www.ccil.org/~cowan
Reversing the apostolic precept to be all things to all men, I usually [before
Darwin] defended the tenability of the received doctrines, when I had to do
with the evolutionists; and stood up for the possibility of evolution among
the orthodox — thereby, no doubt, increasing an already current, but quite
undeserved, reputation for needless combativeness. --T. H. Huxley


posts: 1912


> Jorge Llambías wrote:
> >
> > cmevla and fu'ivla are not foreign words in Lojban.
>
> In one sense, correct, which is why they should be made carefully.

Yes, of course.

> Adhockery should suffice for communication but not be considered a
> full-fledged part of the language, and we don't have enough usage to
> warrant going beyond adhockery in most cases of cmevla and fu'ivla.

Not sure what you mean by that.

> On the other hand, in another sense they are still foreign words in
> Lojban, in the same sense that "deja vu", "esprit de corps", "ad hoc"
> and other borrowings aren't really English. They haven't really lost
> their foreign quality, and indeed are often italicized in English to
> emphasize this.

The way English deals with borrowings is different from the way
Spanish deals with borrowings. And they both differ with respect
to how Lojban deals with borrowings. For me at least, type-IV
fu'ivla have lost all their foreignness, at least to the same
degree as gismu.

> The corresponding process for Lojban would be when a fu'ivla is
> sufficiently well-defined so as to have a non-trivial multi-place place
> structure - something whose meaning goes beyond the process which
> created it. Then and only then can we clearly say that a fu'ivla is as
> much a Lojban word as a gismu, since gismu were in fact designed to have
> meaning only loosely tied to the specific denotation of the words from
> which they were derived.

Of course.

Most existing fu'ivla have clear (and trivial) place structure
"x1 is a XXX of type/species x2".

> > fu'ivla are in most respects like gismu, since gismu too
> > were based on foreign words for their creation.
>
> I disagree because of the lack of care in defining them as predicates,
> and the lack of consideration for their being useful in the ways that
> gismu are designed to be useful.

That's more a characteristic of the word creator than of the word
itself. I don't think you can say all fu'ivla are carelessly defined
or without consideration for their usefulness.

> fu'ivla tend to retain the grammatical nature of the words that they
> were borrowed from.

Nouns, you mean? Adjectives and verbs are much more rarely
borrowed than nouns, of course. {fu'ivla} tend to acquire the
place structure of similar existing words. For example, the
place structure of animals and plants is very regular in
gismu, if we except two or three weird cases.

> If the word coiner bothers to make a place
> structure for the fu'ivla, it will be because he can copy a pattern of a
> similar word, e.g. x1 is a XXXX of species/breed x2.

Yes. That's one of the most frequent fu'ivla place structures.

> > But other than their origin, once they are created they work
> > in all respects in the same way. They both differ from lujvo
> > in that they don't have an internal structure.
>
> But gismu differ because they have real place structures, in which
> considerable time was invested to make them useful, and it was intended
> that they be usable in multiple ways. fu'ivla that are coined ad hoc
> simply to fill a need don't get that investment, and words coined
> systematically like the animal and plant species have only slightly
> more thought put into them.

And so? {krokodilo} works just like {cinfo}, whatever the processes
that led to their creation.

> By making a word from 6 (not 5) source languages, we commit to the word
> having a meaning independent of any one language. A fu'ivla that is not
> similarly made doesn't have that effort at independent meaning.

{fu'ivla} don't need to borrow the exact same meaning of the word
they are based on either.

mu'o mi'e xorxes







posts: 1912


> Jorge Llambías scripsit:
> > Similarly, for the name "Enrique", we can choose between
> > for example {la .enrikes.} or {la'o sy Enrique sy}, but
> > making a fu'ivla {.enrike} would be silly.
>
> I don't see that .enrike is so much more ludicrous than merko,
> particularly if it has a somewhat broader sense than "x1 is an
> Enrique".

If it meant "x1 is Enriquean", it would be closer to {merko}.
But such adjectives are more often based on people's last names;
there are too many Enriques with different characteristics
for Enriquean to be very meaningful as a general word.

On the other hand, I would favour merging CMENE with BRIVLA
so that cmene can be used as predicates, but I would retain the
distinction of cmevla (words ending in consonant) for individual
contentless names.

> It would not be absurd to use the fu'ivla _stelere_
> 'Steller's' in the tanru stelere blacpi, stelere datka,
> stelere xasybakni.

Agreed. But it would mean "x1 is Steller's in aspect x2" or
some such, not "x1 is the person named Steller".

> Similarly, the straightforward way to say
> "Gibbs free energy" is probably me la gibz. -free nejni, but
> using gibizi isn't silly.

That's {gi bi zi} though. {gibzi} would be an experimental gismu.
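
As a rough illustration of why {gibizi} falls apart while {gibzi} holds together: a brivla needs a consonant cluster, and a string without one is just a run of cmavo. The Python sketch below is a toy, not the PEG morphology; the function name and the regular expressions are made up for this example, and it ignores y, stress, commas and cmene entirely.

```python
import re

CONSONANTS = "bcdfgjklmnprstvxz"
VOWELS = "aeiou"

# Two adjacent consonants: the minimum a brivla form must contain.
CLUSTER = re.compile(f"[{CONSONANTS}]{{2}}")
# Simplified cmavo shape: optional consonant, then vowels optionally joined by '.
CMAVO = re.compile(f"[{CONSONANTS}]?[{VOWELS}](?:'?[{VOWELS}])*")


def split_if_no_cluster(text: str) -> list[str]:
    """Return [text] if it has a consonant cluster, else its cmavo pieces."""
    if CLUSTER.search(text):
        return [text]  # possibly a brivla; a real checker would test further
    return CMAVO.findall(text)


print(split_if_no_cluster("gibizi"))  # no cluster, so it splits
print(split_if_no_cluster("gibzi"))   # has the cluster "bz", so it stays whole
```

Under these assumptions {gibizi} comes out as the three pieces gi, bi, zi, while {gibzi} is left as a single candidate brivla.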

>(I don't know how to say "available"
> in Lojban, short of gubni.)

{gubni} seems to work. Or {selpi'oka'e}, "usable".

mu'o mi'e xorxes








posts: 1912


> >(I don't know how to say "available"
> > in Lojban, short of gubni.)
>
> {gubni} seems to work. Or {selpi'oka'e}, "usable".

Err, I meant {selplika'e}. {selpi'oka'e} is some
capability with respect to pianos, not sure exactly
what.

mu'o mi'e xorxes








posts: 1912


I have written an informal description of the PEG morphology
algorithm (especially for Pierre and Nora, who wanted to see
something more than the bare formal grammar rules). You can
read it starting from here:
<http://www.lojban.org/tiki/tiki-index.php?page=Informal+description+of+the+PEG+morphology+algorithm>
Informal description of the PEG morphology algorithm

If some parts need more clarification, please tell me.

(BTW, I have removed cmene-rafsi and cmavo-rafsi from the
morphology, given the underwhelming reception they got.
I still kept Pierre's fu'ivla rafsi and my general brivla
rafsi however, because I think they are useful and blend
much better with the rest. With the removal of cmavo-rafsi
I now allow V'y and y'V clusters in cmavo forms, which are
explicitly mentioned in CLL.)

I'm quite satisfied with the permissible consonant cluster
restrictions as implemented. I'm not yet very happy with
medial clusters, and I still have my doubts about vowel
clusters in general.

mu'o mi'e xorxes






posts: 162

Jorge Llambías wrote:
> Whatever brokenness are you two talking about?
>
> Neither type III, nor type IV fu'ivla break anything in
> Lojban with respect to ambiguity in parsing of the speech
> stream. All we've been discussing is how permissive we
> should be with respect to consonant clusters and consonant
> vowels in lojbanizing words, but there is no question of
> any ambiguities arising in any case.

If there were no ambiguities in the rules, there would be no need to
discuss. If there are ambiguities in the rules, then it cannot be said
that any particular word follows the rules, unless it follows the rules
under any and all possible interpretations.

>>We already have that in the form of la'o quotes. That was always
>>intended to be the primary mode of ad-hoc borrowing, but no one ever
>>used it, preferring to make Type IIIs, and in Pierre's case Type IVs
>>rampantly, and without especial regard for whether they broke the
>>restrictions that were built into the morphology based on algorithmic
>>needs.
>
> Nonsense. Pierre's fu'ivla follow CLL's rules very carefully and don't
> break any restrictions imposed on the morphology.

There's more to the language than morphology, and thus Pierre's fu'ivla
do not follow the rules at all.

p62 "Stage IV fu'ivla ... are used where a fu'ivla has become so common
or so important that it must be made as short as possible."

I've seen no evidence based on frequency or importance that there are
yet any stage IV fu'ivla needed in the language at all. If there are,
then they would be cultural words for people in the Lojban community,
like Finnish for Veijo and Bulgarian for Ivan (who has I believe
expressed a strong preference for using la'o so probably not him).

>>Usage decided against what you and I might prefer (I'll admit
>>that I did not use la'o much either unless the borrowing was from a
>>language/word whose phonology would be mangled by Lojbanizing it.)
>
> There is hardly any usage of fu'ivla so I don't know what you
> base this conclusion of what usage has decided on.

I'm using the broad sense of "usage" to include word-coining by Pierre,
and discussions about the language. If there has been little usage, you
confirm my statement that there should be no stage IV fu'ivla.

>>> That is, demarcated
>>> items that are not looked at internally but just
>>> taken as a lump, with (virtually) anything
>>> allowed within the demarcations.
>>
>>Precisely what la'o does. No changes needed, just a willingness to use
>>what is built into the language.
>
> {la'o} is for names though, not for general words.

mela'o if you need it as a brivla, unless "me" has ceased to be useful
for the conversion of sumti to selbri. Most of the time, these things
seem to be used as sumti (exception being the culture words which are
often modifiers in tanru) and hence can remain as la'o.

>> More time is
>>spent making fu'ivla than actually using them, I suspect.
>
> Indeed. And that's not a bad thing, if you think about it.

I disagree. I can think of no language which adds borrowings except as
an ad hoc necessity when an internal word isn't obvious. Borrowing
should first of all be ad hoc. I'd go for systematically making culture
words for people in the community on the expectation that they would
want to use them, based on the culture words arguments. I'm not hot on
adding the plants and animal words at this stage because we don't have
any usage criteria on when to start and stop, and there are millions of
species. It becomes arbitrary which ones to add, and it stunts the
internal growth of the language by discouraging the creation of lujvo
for animals and plants.

> That means fu'ivla are being created in the right categories
> (such as animal names) where it is useful to have complete
> lists.

Why is it "useful" when there is admittedly no "usage"?

>People don't tend to create them on-the-fly.

They should be using la'o, and they wouldn't have to - they wouldn't
even need to know the Linnean name.

And if they want to make an ad hoc word, the guidelines say to make Type
III.

lojbab




posts: 162

Jorge Llambías wrote:
>>In
>>both ways they lower the value of Lojban below
>>what it might have if it simply allowed foreign
>>words — marked as such — to be used freely.
>
> Not in my book. Lojban with embedded la'o and zoi
> quoted strings when the quoted stuff is not relevantly
> foreign looks decidedly ugly to me.

Then make lujvo.

It's supposed to be ugly, so as to encourage people to make native
words. Otherwise we might as well call the language Anglan or maybe
Linneanlan because we are just encoding the native words of some other
language.

>>Relevance? The current situation is clear, as are
>>the difficulties that ensue from it. That does
>>not speak to the desirability of changing that
>>situation nor the excesses that have given rise
>>to it.
>
> What difficulties? What excesses?

Making Type IVs where there is no usage.

lojbab




posts: 1912


> Jorge Llambías wrote:
> > Whatever brokenness are you two talking about?
> >
> > Neither type III, nor type IV fu'ivla break anything in
> > Lojban with respect to ambiguity in parsing of the speech
> > stream. All we've been discussing is how permissive we
> > should be with respect to consonant clusters and consonant
> > vowels in lojbanizing words, but there is no question of
> > any ambiguities arising in any case.
>
> If there were no ambiguities in the rules, there would be no need to
> discuss.

There are several options among unambiguous sets of rules, so there
is a need to discuss.

For example, whether or not the vowel cluster "aa" is allowed in
fu'ivla, the rules will be unambiguous. Should we allow it?
(Currently, the PEG morphology does allow it.)
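
To make the point concrete, here is a toy Python sketch. It is not the actual PEG rules: the names and the `allow_any_pair` switch are invented for illustration, and the strict set is the list of diphthongs given in CLL (leaving out iy and uy, which CLL restricts to cmene). Either setting gives exactly one answer per input, so choosing between them is a question of permissiveness, not of ambiguity.

```python
VOWELS = set("aeiou")

# Diphthongs listed in CLL; iy and uy (cmene only) are deliberately left out.
CLL_DIPHTHONGS = {
    "ai", "ei", "oi", "au",
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
}


def vowel_pair_allowed(pair: str, allow_any_pair: bool) -> bool:
    """Decide whether an unseparated vowel pair is accepted.

    allow_any_pair is a hypothetical switch: False keeps to the CLL list,
    True also accepts pairs such as "aa".
    """
    if len(pair) != 2 or not set(pair) <= VOWELS:
        return False
    return allow_any_pair or pair in CLL_DIPHTHONGS


for permissive in (False, True):
    print(permissive, vowel_pair_allowed("aa", permissive))
```

Whichever value of the switch is chosen, the function is deterministic; the open question is only which value we want.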

> If there are ambiguities in the rules, then it cannot be said
> that any particular word follows the rules, unless it follows the rules
> under any and all possible interpretations.

pc seems to think, and you seemed to agree, that there was some
problem with proving the unambiguity of the algorithm. There is
no such problem. The only issues at stake are about how permissive
the rules should be, but in no case does that affect the
unambiguity of the algorithm.

> There's more to the language than morphology, and thus Pierre's fu'ivla
> do not follow the rules at all.

We are basically discussing morphology here.

> p62 "Stage IV fu'ivla ... are used where a fu'ivla has become so common
> or so important that it must be made as short as possible."
>
> I've seen no evidence based on frequency or importance that there are
> yet any stage IV fu'ivla needed in the language at all. If there are,
> then they would be cultural words for people in the Lojban community,
> like Finnish for Veijo and Bulgarian for Ivan (who has I believe
> expressed a strong preference for using la'o so probably not him).

We still have to decide on a morphology algorithm. The preference
or dispreference of fu'ivla over other possibilities is a totally
separate issue.

> > {la'o} is for names though, not for general words.
>
> mela'o if you need it as a brivla, unless "me" has ceased to be useful
> for the conversion of sumti to selbri.

You're missing my point. Suppose {bunre} didn't exist and for some
weird reason we wanted to borrow "brown" from English.
{me la'o gy brown gy} won't do at all. That means "x1 is/are the
one(s) named Brown", it does not at all mean "x1 is (of the color) brown".
For that we could use {me la'e zoi gy brown gy}, but not
{me la'o gy brown gy}.

> Most of the time, these things
> seem to be used as sumti (exception being the culture words which are
> often modifiers in tanru) and hence can remain as la'o.

{la'o} does not just give a sumti, it gives a *name*, with no
semantic content.

> Borrowing
> should first of all be ad hoc. I'd go for systematically making culture
> words for people in the community on the expectation that they would
> want to use them, based on the culture words arguments. I'm not hot on
> adding the plants and animal words at this stage because we don't have
> any usage criteria on when to start and stop, and there are millions of
> species. It becomes arbitrary which ones to add, and it stunts the
> internal growth of the language by discouraging the creation of lujvo
> for animals and plants.

That's all very well, but we still need an official morphology algorithm
if we want to make the claims that we usually make about Lojban. The
policy of when to make fu'ivla or not is a separate issue.

>
> > That means fu'ivla are being created in the right categories
> > (such as animal names) where it is useful to have complete
> > lists.
>
> Why is it "useful" when there is admittedly no "usage"?

Because when I need a word for an animal that I don't talk very
often about, I know where I can find a good suggestion, and I know
that at least some others will be using the same word.

> >People don't tend to create them on-the-fly.
>
> They should be using la'o, and they wouldn't have to - they wouldn't
> even need to know the Linnean name.

But if they are going to use English words, why don't they just speak
in English in the first place?

mu'o mi'e xorxes






posts: 1912


> Jorge Llambías wrote:
> > Lojban with embedded la'o and zoi
> > quoted strings when the quoted stuff is not relevantly
> > foreign looks decidedly ugly to me.
>
> Then make lujvo.

I do. I usually prefer lujvo over fu'ivla. We still need to settle
on the morphology rules for fu'ivla.

mu'o mi'e xorxes






posts: 162

Jorge Llambías wrote:
>>Adhockery should suffice for communication but not be considered a
>>full-fledged part of the language, and we don't have enough usage to
>>warrant going beyond adhockery in most cases of cmevla and fu'ivla.
>
> Not sure what you mean by that.

People should make up words as they need to use them. Only from
observing patterns of usage of made up words should there be a basis for
people to be coining type IV fu'ivla.

>>On the other hand, in another sense they are still foreign words in
>>Lojban, in the same sense that "deja vu", "esprit de corps", "ad hoc"
>>and other borrowings aren't really English. They haven't really lost
>>their foreign quality, and indeed are often italicized in English to
>>emphasize this.
>
> The way English deals with borrowings is different from the way
> Spanish deals with borrowings. And they both differ with respect
> to how Lojban deals with borrowings. For me at least, type-IV
> fu'ivla have lost all their foreignness, at least to the same
> degree as gismu.

Which is precisely why they shouldn't be being made. They ARE foreign
words because they aren't gismu or lujvo, which are the native word
forms. They aren't on the gismu list, and hence are not part of the
expected common vocabulary of the community. If they have a place
structure, it isn't officially documented and can only be defined based
on copying a pattern from a real Lojban word.

If they are as gismu, then the list of roots that must be commonly
learned to be minimally competent in the language grows with the size of
the "fu'ivla list" that doesn't actually exist because in fact no such
list has been created and even quasi-baselined. Anyone who uses a Type
IV fu'ivla thus has no reason to expect that he will be understood, and
in fact I almost never do understand writings that include them and
usually won't even try.

>>The corresponding process for Lojban would be when a fu'ivla is
>>sufficiently well-defined so as to have a non-trivial multi-place place
>>structure - something whose meaning goes beyond the process which
>>created it. Then and only then can we clearly say that a fu'ivla is as
>>much a Lojban word as a gismu, since gismu were in fact designed to have
>>meaning only loosely tied to the specific denotation of the words from
>>which they were derived.
>
> Of course.
>
> Most existing fu'ivla have clear (and trivial) place structure
> "x1 is a XXX of type/species x2".

which means that they are really just names. There is no semantics in
that place structure - it is just a category within a categorical
hierarchy. You could use that place structure for a plant, an animal, a
computer chip, or anything else that can be classified to at least two
levels (if not two levels then you only need one place). It meets the
minimal requirements of the language but it doesn't really *mean*
anything. If I see a word with that place structure, like your
krokodilo, I have no idea what a krokodilo is EXCEPT by recourse to my
knowledge of what the word means in the language it was borrowed from.

Which is why it is still a borrowing no matter what shape the word has
in Lojban. Knowing the place structure doesn't enlighten you any more
than knowing the foreign language word told you in the first place.

When it acquires a denotation that is specific to Lojban, connotations
that are internal to the language, a usage history, THEN it is a real
Lojban word. pavyseljirna are more real in Lojban than krokodilo even
though I've never seen one in real life, because the word has seen
in-language use. It means something (that which everyone seeks after
and never finds, perhaps because no one knows how to say they are
seeking one %^)

Does a krokodilo include or exclude a cayman or an alligator? Only if
one knows the intricacies of the biological classification system could
one know what the Linnean words being borrowed mean. And knowing that
krokodilo has some sort of Esperantic significance, though I don't know
what it is, does that significance also apply to Lojban? It will if an
Esperantist who only knows the word through that language learns the
word in Lojban.

>>>fu'ivla are in most respects like gismu, since gismu too
>>>were based on foreign words for their creation.
>>
>>I disagree because of the lack of care in defining them as predicates,
>>and the lack of consideration for their being useful in the ways that
>>gismu are designed to be useful.
>
> That's more a characteristic of the word creator than of the word
> itself. I don't think you can say all fu'ivla are carelessly defined
> or without consideration for their usefulness.

If they are created as Type IVs before they have been used as Types
I-III, then they have been made with lack of consideration, including
lack of consideration for the rest of us who have no basis on which to
decide if the words are worth learning. Even for the gismu that people
think are useless, there is the limited rationale that the set is
closed, so you aren't learning that many useless words.

>>If the word coiner bothers to make a place
>>structure for the fu'ivla, it will be because he can copy a pattern of a
>>similar word, e.g. x1 is a XXXX of species/breed x2.
>
> Yes. That's one of the most frequent fu'ivla place structures.

And it is meaningless as I said, apart from the meaning in the
borrowed-from language.

>>But gismu differ because they have real place structures, in which
>>considerable time was invested to make them useful, and it was intended
>>that they be usable in multiple ways. fu'ivla that are coined ad hoc
>>simply to fill a need don't get that investment, and words coined
>>systematically like the animal and plant species have only slightly
>>more thought put into them.
>
> And so? {krokodilo} works just like {cinfo}, whatever the processes
> that led to their creation.

No it doesn't, because by making cinfo a gismu, we *invited* it to be
loaded with internal Lojban meaning. (probably hasn't yet arisen, of
course).

Lojban will devise internally its concept of whether a "cmana cinfo"
makes any sense or whether the word should be mlatrpuma or whatever word
Pierre concocts, but making the word a gismu invites the possibility of
"cmana cinfo", whereas we aren't inclined to say "cmana me la'o ly.
panthera leo ly." because the correct form is la'o ly. puma concolor
ly. and those who have a clue about Linnean names know that the puma is
not a kind of lion.

Except in English and any other language that has made that tanru.

>>By making a word from 6 (not 5) source languages, we commit to the word
>>having a meaning independent of any one language. A fu'ivla that is not
>>similarly made doesn't have that effort at independent meaning.
>
> {fu'ivla} don't need to borrow the exact same meaning of the word
> they are based on either.

Then they shouldn't be being made as fu'ivla, since the morphological
category of fu'ivla *presumes* that they are words for concepts or things
that occur in some other language. (CLL page 61 first paragraph about
fu'ivla).

lojbab




posts: 1912


> Anyone who uses a Type
> IV fu'ivla thus has no reason to expect that he will be understood, and
> in fact I almost never do understand writings that include them and
> usually won't even try.

Have you tried reading Robin's {la nicte cadzu}? I think he has
used one fu'ivla so far, in something like 11000 words.

> If I see a word with that place structure, like your
> krokodilo, I have no idea what a krokodilo is EXCEPT by recourse to my
> knowledge of what the word means in the language it was borrowed from.

You can always ask the speaker or someone else to explain, too. That's
how one learns a language.

> Does a krokodilo include or exclude a cayman or an alligator?

I wouldn't know, and it wouldn't much matter to me because I probably
couldn't tell them apart.

> Only if
> one knows the intricacies of the biological classification system could
> one know what the Linnean words being borrowed mean. And knowing that
> krokodilo has some sort of Esperantic significance, though I don't know
> what it is,

krokodili is what Esperantists do when they speak among themselves
in a language other than Esperanto. Bringing that to Lojban, we are
now crocodiling, because we're using English instead of Lojban.

> does that significance also apply to Lojban?

I doubt it. Even metaphorically, the place structure would be wrong.

> It will if an
> Esperantist who only knows the word through that language learns the
> word in Lojban.

I doubt it very much. I don't think Esperantists confuse the two
meanings that much.

> If they are created as Type IVs before they have been used as Types
> I-III, then they have been made with lack of consideration, including
> lack of consideration for the rest of us who have no basis on which to
> decide if the words are worth learning.

Speak for yourself. As for me, I am very grateful for the taxonomy lists.

> > And so? {krokodilo} works just like {cinfo}, whatever the processes
> > that led to their creation.
>
> No it doesn't, because by making cinfo a gismu, we *invited* it to be
> loaded with internal Lojban meaning. (probably hasn't yet arisen, of
> course).

{krokodilo} is equally open to acquiring internal Lojban meaning.

> Lojban will devise internally its concept of whether a "cmana cinfo"
> makes any sense or whether the word should be mlatrpuma or whatever word
> Pierre concocts, but making the word a gismu invites the possibility of
> "cmana cinfo", whereas we aren't inclined to say "cmana me la'o ly.
> panthera leo ly." because the correct form is la'o ly. puma concolor
> ly. and those who have a clue about Linnean names know that the puma is
> not a kind of lion.

The same applies to {krokodilo} and related animals.

mu'o mi'e xorxes





posts: 162

Jorge Llambías wrote:
> --- Bob LeChevalier wrote:
>
>>Jorge Llambías wrote:
>>
>>>Whatever brokenness are you two talking about?
>>>
>>>Neither type III, nor type IV fu'ivla break anything in
>>>Lojban with respect to ambiguity in parsing of the speech
>>>stream. All we've been discussing is how permissive we
>>>should be with respect to consonant clusters and consonant
>>>vowels in lojbanizing words, but there is no question of
>>>any ambiguities arising in any case.
>>
>>If there were no ambiguities in the rules, there would be no need to
>>discuss.
>
> There are several options among unambiguous sets of rules, so there
> is a need to discuss.

There is one language, so having several options makes that one language
ambiguous. That one language has no one morphology algorithm because it
has no one set of rules and by Lojbanic standards, that is a "problem".

> For example, whether or not the vowel cluster "aa" is allowed in
> fu'ivla, the rules will be unambiguous. Should we allow it?

It isn't in the set of vowel pairs on p33-35 of CLL, so I have no idea
why it should. It has no defined Lojbanic pronunciation, except by
inserting an apostrophe.

(In one of your posts you say you allow y'V and V'y in cmavo despite
those being explicitly limited to names in the section listing them.
Similar issue. The lists of permitted usages are the core of the
language. If we got sloppy in our wording of what is permitted in
fu'ivla and experimental cmavo, we should tighten up the wording, not
loosen the rules.)

>>If there are ambiguities in the rules, then it cannot be said
>>that any particular word follows the rules, unless it follows the rules
>>under any and all possible interpretations.
>
> pc seems to think, and you seemed to agree, that there was some
> problem with proving the unambiguity of the algorithm.

I think both of us are using "problem" in the broader sense of being
something that is argued over as opposed to settled.

> There is
> no such problem. The only issues at stake are about how permissive
> the rules should be, but in no case does that affect the
> unambiguity of the algorithm.

How can one prove that?

>>There's more to the language than morphology, and thus Pierre's fu'ivla
>>do not follow the rules at all.
>
> We are basically discussing morphology here.

Morphological decisions shouldn't be made in isolation based merely on
whether something can "work" or not. The morphology is a piece of the
language design and should be consistent with the rest. The
morphological forms embed a philosophy of language, perhaps one not
carefully specified and perhaps one that people disagree with, but it
still exists. Changing a piece of the language design without
considering the role of that piece in the whole language will lead to
bad design.

>>p62 "Stage IV fu'ivla ... are used where a fu'ivla has become so common
>>or so important that it must be made as short as possible."
>>
>>I've seen no evidence based on frequency or importance that there are
>>yet any stage IV fu'ivla needed in the language at all. If there are,
>>then they would be cultural words for people in the Lojban community,
>>like Finnish for Veijo and Bulgarian for Ivan (who has I believe
>>expressed a strong preference for using la'o so probably not him).
>
> We still have to decide on a morphology algorithm. The preference
> or dispreference of fu'ivla over other possibilities is a totally
> separate issue.

But words made in violation of that rule on p62 are not proper Lojban
words, no matter how perfect their morphology is. pc and I are arguing
what the language should be, not what the morphology should be. The
morphology should be slave to the language design, and not vice versa.

>>>{la'o} is for names though, not for general words.
>>
>>mela'o if you need it as a brivla, unless "me" has ceased to be useful
>>for the conversion of sumti to selbri.
>
> You're missing my point. Suppose {bunre} didn't exist and for some
> weird reason we wanted to borrow "brown" from English.
> {me la'o gy brown gy} won't do at all. That means "x1 is/are the
> one(s) named Brown", it does not at all mean "x1 is (of the color) brown".
> For that we could use {me la'e zoi gy brown gy}, but not
> {me la'o gy brown gy}.

Which is why we added in Type III fu'ivla, which has a purpose to
specifically answer that problem, as stated at the bottom of page 61.

If I want to borrow a color word, I immediately make it Type III -
skarnmauve, skarnturkoise because I know that color words are especially
likely to run into that sort of ambiguity in meaning.

Meanwhile, type IV fu'ivla reintroduce the same problem, because without
the classifier, no one knows what a krokodilo is any more than they
would know what a la'o gy crocodile gy is. It might be the name of that
Australian character in the movie Crocodile Dundee.

??krokodilo: x1 is a corny Australian macho movie character of
species/breed x2

>>Most of the time, these things
>>seem to be used as sumti (exception being the culture words which are
>>often modifiers in tanru) and hence can remain as la'o.
>
> {la'o} does not just give a sumti, it gives a *name*, with no
> semantic content.

Linnean names are "names" and thus appropriately quoted with la'o (and
indeed this is explicitly stated in the section on la'o). Color names
are also names. Just because they aren't proper nouns, doesn't make
them non-names.

All borrowings give "names" with no semantic content INTERNAL to the
language. Thus Type IV fu'ivla should be reserved for words that have
achieved enough usage that an internal meaning might have developed.

>> Borrowing
>>should first of all be ad hoc. I'd go for systematic making culture
>>words for people in the community on the expectation that they would
>>want to use them, based on the culture words arguments. I'm not hot on
>>adding the plants and animal words at this stage because we don't have
>>any usage criteria on when to start and stop, and there are millions of
>>species. It becomes arbitrary which ones to add, and it stunts the
>>internal growth of the language by discouraging the creation of lujvo
>>for animals and plants.
>
> That's all very well, but we still need an official morphology algorithm
> if we want to make the claims that we usually make about Lojban. The
> policy of when to make fu'ivla or not is a separate issue.

The morphology should encourage adhering to the policy.

>> >People don't tend to create them on-the-fly.
>>
>>They should be using la'o, and they wouldn't have to - they wouldn't
>>even need to know the Linnean name.
>
> But if they are going to use English words, why don't they just speak
> in English in the first place?

If what they are saying uses that many words that are English, then
indeed they might as well put the whole thing in zoi quotes, or stick to
English.

lojbab




posts: 149

Jorge Llambías scripsit:

> For example, whether or not the vowel cluster "aa" is allowed in
> fu'ivla, the rules will be unambiguous. Should we allow it?
> (Currently, the PEG morphology does allow it.)

I haven't weighed in before thanks to work problems followed by health
problems, hopefully now both O.K. for a while.

On the matter of diphthongs, I feel very strongly that:

1) Lojban has precisely sixteen diphthongs, ai ei oi au ia ie ii io iu
ua ue ui uo uu iy uy. Any words containing other diphthongs are errors
and should be corrected.

2) The first four are freely usable in all word types except gismu,
where they are excluded by construction.

3) The last two are freely usable in cmevla, and may be used in other
word types if needed for some morphological purpose, but should not be
freely usable.

I am undecided about the appropriate restrictions on the remaining ten
diphthongs (iV and uV). The safest restriction (that is, the least
threatening to pronounceability) would be to restrict them to initial
position only. I do not know how many attested cmevla and fu'ivla this
would outlaw, and would like to find out; if the number is not too large,
I would favor it.

--
John Cowan <cowan@ccil.org>
http://www.reutershealth.com http://www.ccil.org/~cowan
.e'osai ko sarji la lojban.
Please support Lojban! http://www.lojban.org


posts: 149

Jorge Llambías scripsit:

> > It would not be absurd to use the fu'ivla _stelere_
> > 'Steller's' in the tanru stelere blacpi, stelere datka,
> > stelere xasybakni.
>
> Agreed. But it would mean "x1 is Steller's in aspect x2" or
> some such, not "x1 is the person named Steller".

The point is (I think this is right) that the Steller of Steller's
sea cow is a different person from the Steller of Steller's eider
and Steller's jay, so it really is only the name they have in common.

If the example is wrong, at any rate "Rosta orthography" and "Rosta
tensioners" are invented by different Rostas.

> > Similarly, the straightforward way to say
> > "Gibbs free energy" is probably me la gibz. -free nejni, but
> > using gibizi isn't silly.
>
> That's {gi bi zi} though. {gibzi} would be an experimental gismu.

Boy howdy, my sense of the morphology is degraded these days.
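
To illustrate the {gibizi} versus {gibzi} point, here is a minimal Python sketch. It checks letter shapes only (CVCCV or CCVCV for a gismu, a plain string of CV syllables for cmavo) and ignores the restrictions on which consonant pairs are permitted, so it over-accepts. The function names are hypothetical and not taken from any existing parser.

```python
import re

C = "bcdfgjklmnprstvxz"   # Lojban consonant letters
V = "aeiou"               # vowels that may appear in gismu

# Shape test only: CVCCV or CCVCV. Permissible-cluster rules are NOT checked.
GISMU_SHAPE = re.compile(
    rf"^(?:[{C}][{V}][{C}][{C}][{V}]|[{C}][{C}][{V}][{C}][{V}])$"
)
# A word made of nothing but CV syllables contains no consonant cluster.
CV_ONLY = re.compile(rf"^(?:[{C}][{V}])+$")


def has_gismu_shape(word):
    """True if the word has one of the two five-letter gismu shapes."""
    return GISMU_SHAPE.match(word) is not None


def split_cv_cmavo(word):
    """Split a cluster-free CV...CV string into CV cmavo; [] if it is not one."""
    if CV_ONLY.match(word) is None:
        return []
    return [word[i:i + 2] for i in range(0, len(word), 2)]


print(has_gismu_shape("gibzi"))    # gismu shape (experimental gismu)
print(split_cv_cmavo("gibizi"))    # falls apart into gi, bi, zi
```

With no consonant cluster to hold it together, {gibizi} can only be read as three cmavo, which is why it cannot serve as a fu'ivla.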

--
Business before pleasure, if not too bloomering long before.
--Nicholas van Rijn
John Cowan <cowan@ccil.org>
http://www.ccil.org/~cowan http://www.reutershealth.com


posts: 1912



> > For example, whether or not the vowel cluster "aa" is allowed in
> > fu'ivla, the rules will be unambiguous. Should we allow it?
>
> It isn't in the set of vowel pairs on p33-35 of CLL, so I have no idea
> why it should. It has no defined Lojbanic pronunciation, except by
> inserting an apostrophe.

OK, then you would make the pairs on p33-35 global restrictions?
No cmene or fu'ivla with them? I would be quite fine with that.
But then CLL has {kulnrkorea} on p64. If we allow {ea} in
fu'ivla, why not {aa}?

> (In one of your posts you say you allow y'V and V,y in cmavo despite
> those being explicitly limited to names in the section listing them.

p35: 'Vowel pairs involving "y" appear only in Lojbanized names.
They could appear in cmavo (structure words), but only ".y'y." is
so used.'

What does "they could appear in cmavo" mean? I take it to mean that
something like {la'y} is morphologically a cmavo but that no actual
cmavo uses that form.

> Similar issue. The lists of permitted usages are the core of the
> language. If we got sloppy in our wording of what is permitted in
> fu'ivla and experimental cmavo, we should tighten up the wording, not
> loosen the rules.

I would be very happy with that. In fact, tightening the rules
is exactly what I'm proposing.

> > pc seems to think, and you seemed to agree, that there was some
> > problem with proving the unambiguity of the algorithm.
>
> I think both of us are using "problem" in the broader sense of being
> something that is argued over as opposed to settled.

I think he had something else in mind, but I'm sure he can
explain himself.

> > The only issues at stake are about how permissive
> > the rules should be, but in no case does that affect the
> > unambiguity of the algorithm.
>
> How can one prove that?

PEG grammars are unambiguous by construction, so whatever we
settle on, we know it will be unambiguous.
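
A rough sketch of why that holds: in a PEG, choice is ordered, so at a given position a rule either fails or returns exactly one result, and alternatives after the first success are never tried. The tiny combinators below are hypothetical helpers written for this illustration, not the code of the PEG morphology page.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the end position or None.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Match the literal string s."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parts: Parser) -> Parser:
    """Match each part in order; fail if any part fails."""
    def parse(text, pos):
        for p in parts:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*alts: Parser) -> Parser:
    """Ordered choice: the first alternative that succeeds wins."""
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse


long_first = choice(lit("ai"), lit("a"))    # like  X <- a i / a
short_first = choice(lit("a"), lit("ai"))   # like  X <- a / a i

print(long_first("ai", 0))    # consumes both letters
print(short_first("ai", 0))   # consumes only "a"; "ai" is never tried
```

The two rule orders give different results, but each one gives a single result. So the open question is which rules we want, not whether the chosen set will be unambiguous.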

> Morphological decisions shouldn't be made in isolation based merely on
> whether something can "work" or not. The morphology is a piece of the
> language design and should be consistent with the rest.

Indeed! Inconsistent rules (like forbidding the "mz" cluster)
that have no possible reasonable justification are really ugly.

> >>>{la'o} is for names though, not for general words.
> >>
> >>mela'o if you need it as a brivla, unless "me" has ceased to be useful
> >>for the conversion of sumti to selbri.
> >
> > You're missing my point. Suppose {bunre} didn't exist and for some
> > weird reason we wanted to borrow "brown" from English.
> > {me la'o gy brown gy} won't do at all. That means "x1 is/are the
> > one(s) named Brown", it does not at all mean "x1 is (of the color) brown".
> > For that we could use {me la'e zoi gy brown gy}, but not
> > {me la'o gy brown gy}.
>
> Which is why we added in Type III fu'ivla, which has a purpose to
> specifically answer that problem, as stated at the bottom of page 61.

You are still missing the point, it seems to me. {me la'o gy brown gy}
can never mean "x1 is brown in color", not because of any polysemy
of "brown" but because la'o is for names.

> If I want to borrow a color word, I immediately make it Type III -
> skarnmauve, skarnturkoise because I know that color words are especially
> likely to run into that sort of ambiguity in meaning.

That's fine. {turkoise} could also work as a fu'ivla for that
(although I would consider basing a borrowing for a color on an English
word very bad form). But {me la'o gy turquoise gy} could never
mean "x1 is of color turquoise", because it means "x1 is/are the one(s)
named Turquoise".

> Meanwhile, type IV fu'ivla reintroduce the same problem, because without
> the classifier, no one knows what a krokodilo is any more than they
> would know what a la'o gy crocodile gy is. It might be the name of that
> Australian character in the movie Crocodile Dundee.
>
> ??krokodilo: x1 is a corny Australian macho movie character of
> species/breed x2

In my experience lojbanists have more common sense than that, but
yes, in principle it could mean anything. {skarnturkoise} could
also be the name of some strange foodstuff for all I know, because
rafsi classifiers are only helpful hints, not definitory.

> > {la'o} does not just give a sumti, it gives a *name*, with no
> > semantic content.
>
> Linnean names are "names" and thus appropriately quoted with la'o (and
> indeed this is explicitly stated in the section on la'o.

Yes, but then you need to say {danlu be la'o ...}, not just
{me la'o ...}.

> Color names
> are also names. Just because they aren't proper nouns, doesn't make
> them non-names.

But we don't want to say "x1 is/are the one(s) named Turquoise", we
want to say that x1 has a certain color. We don't want the named
abstraction in x1.

> All borrowings give "names" with no semantic content INTERNAL to the
> language. Thus Type IV fu'ivla should be reserved for words that have
> achieved enough usage that an internal meaning might have developed.

That sounds like a catch-22.

mu'o mi'e xorxes








posts: 1912


> On the matter of diphthongs, I feel very strongly that:
>
> 1) Lojban has precisely sixteen diphthongs, ai ei oi au ia ie ii io iu
> ua ue ui uo uu iy uy. Any words containing other diphthongs are errors
> and should be corrected.

By "diphthong" you mean any two vowels in a row? "ae" is not
a diphthong in Spanish, it is two syllables. If allowed in Lojban
words, it would also be two syllables, I don't think anyone disputes
that.

> 2) The first four are freely usable in all word types except gismu,
> where they are excluded by construction.

Yes. But are they usable next to each other? Are {.aiais.}, {praiai},
{laiai} morphologically acceptable words?

> 3) The last two are freely usable in cmevla, and may be used in other
> word types if needed for some morphological purpose, but should not be
> freely usable.

"freely usable" includes things like {.iaiys.}? {.aiys.}?

> I am undecided about the appropriate restrictions on the remaining ten
> diphthongs (iV and uV). The safest restriction (that is, the least
> threatening to pronounceability) would be to restrict them to initial
> position only. I do not know how many attested cmevla and fu'ivla this
> would outlaw, and would like to find out; if the number is not too large,
> I would favor it.

{niuiork} would be one. Then things like {kolombias}. We would have
to require things like {nu'i'ork} and {kolombi'as}. There are a few
fu'ivla that use them too, but not many. In jbovlaste I find
only {mianma}, {nargrkaria}.

{tropaiolo} and {smacrkobaiu} don't really have them because they
are {ai,o} and {ai,u}, but we need to decide whether we allow these
diphthong+vowel combinations too.
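
To get a rough count of what an "initial position only" rule would outlaw, something like the following Python sketch could be run over a word list. It assumes that vowel letters are paired from the left within each run of vowels (which gives the {ai,o} and {ai,u} readings above) and that "initial" means the pair starts the word. Both are assumptions of this sketch, and the function names are hypothetical.

```python
import re

VOWELS = "aeiouy"
# The ten glide+vowel pairs (iV and uV).
RISING = {"ia", "ie", "ii", "io", "iu", "ua", "ue", "ui", "uo", "uu"}


def left_pairs(word):
    """Yield (index, pair) for vowel letters paired from the left in each run."""
    for run in re.finditer(rf"[{VOWELS}]+", word):
        letters, start = run.group(), run.start()
        # An unpaired final vowel of an odd-length run is skipped.
        for i in range(0, len(letters) - 1, 2):
            yield start + i, letters[i:i + 2]


def noninitial_rising(word):
    """Return the iV/uV pairs that do not start the word."""
    word = word.strip(".")
    return [pair for idx, pair in left_pairs(word)
            if pair in RISING and idx > 0]


for w in ["niuiork", "kolombias", "mianma", "nargrkaria",
          "tropaiolo", "smacrkobaiu"]:
    print(w, noninitial_rising(w))
```

Under these assumptions the first four words are flagged, while {tropaiolo} and {smacrkobaiu} are not, since their vowel runs pair as ai+o and ai+u.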

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- Bob LeChevalier wrote:
> > Jorge Llambías wrote:
> > > Whatever brokenness are you two talking about?
> > >
> > > Neither type III, nor type IV fu'ivla break anything in
> > > Lojban with respect to ambiguity in parsing of the speech
> > > stream. All we've been discussing is how permissive we
> > > should be with respect to consonant clusters and consonant
> > > vowels in lojbanizing words, but there is no question of
> > > any ambiguities arising in any case.
> >
> > If there were no ambiguities in the rules, there would be no need to
> > discuss.
>
> There are several options among unambiguous sets of rules, so there
> is a need to discuss.
>
> For example, whether or not the vowel cluster "aa" is allowed in
> fu'ivla, the rules will be unambiguous. Should we allow it?
> (Currently, the PEG morphology does allow it.)

> > If there are ambiguities in the rules, then it cannot be said
> > that any particular word follows the rules, unless it follows the rules
> > under any and all possible interpretations.
>
> pc seems to think, and you seemed to agree, that there was some
> problem with proving the unambiguity of the algorithm. There is
> no such problem. The only issues at stake are about how permissive
> the rules should be, but in no case does that affect the
> unambiguity of the algorithm.


Since there is admittedly no algorithm at the
moment, there is of course no ambiguity in the
algorithm. But there is no algorithm and, after
more than 500 pages on these issues, no clear
likelihood there will be one. This looks to be a
problem and one that seems on inspection to be
rooted in the (pointless since they are all just
brivla) brouhaha about fuhivla and lujvo and, to
a lesser extent, in cmevla. How much easier to
say "anything goes" and leave it to nature to
provide the parameters - on those rare
occasions when something is actually needed.

> > There's more to the language than morphology, and thus Pierre's fu'ivla
> > do not follow the rules at all.
>
> We are basically discussing morphology here.
>
> > p62 "Stage IV fu'ivla ... are used where a fu'ivla has become so common
> > or so important that it must be made as short as possible."
> >
> > I've seen no evidence based on frequency or importance that there are
> > yet any stage IV fu'ivla needed in the language at all. If there are,
> > then they would be cultural words for people in the Lojban community,
> > like Finnish for Veijo and Bulgarian for Ivan (who has I believe
> > expressed a strong preference for using la'o so probably not him).
>
> We still have to decide on a morphology algorithm. The preference
> or dispreference of fu'ivla over other possibilities is a totally
> separate issue.

Except that the muckery of deciding what to allow
in fuhivla complicates a fairly simple situation
by trying to find the right procrustean bed for
all words foreign, rather than just taking them
as they come.

> > > {la'o} is for names though, not for general words.
> >
> > mela'o if you need it as a brivla, unless "me" has ceased to be useful
> > for the conversion of sumti to selbri.
>
> You're missing my point. Suppose {bunre} didn't exist and for some
> weird reason we wanted to borrow "brown" from English.
> {me la'o gy brown gy} won't do at all. That means "x1 is/are the
> one(s) named Brown", it does not at all mean "x1 is (of the color) brown".
> For that we could use {me la'e zoi gy brown gy}, but not
> {me la'o gy brown gy}.
I suspect that there are — and surely could
easily be — simpler versions of this. But even
if there are not, the rarity of the forms makes
the length appropriate (and encourages the
continuing rarity).

> > Most of the time, these things
> > seem to be used as sumti (exception being the culture words which are
> > often modifiers in tanru) and hence can remain as la'o.
>
> {la'o} does not just give a sumti, it gives a *name*, with no
> semantic content.
>
> > Borrowing
> > should first of all be ad hoc. I'd go for systematic making culture
> > words for people in the community on the expectation that they would
> > want to use them, based on the culture words arguments. I'm not hot on
> > adding the plants and animal words at this stage because we don't have
> > any usage criteria on when to start and stop, and there are millions of
> > species. It becomes arbitrary which ones to add, and it stunts the
> > internal growth of the language by discouraging the creation of lujvo
> > for animals and plants.
>
> That's all very well, but we still need an official morphology algorithm
> if we want to make the claims that we usually make about Lojban. The
> policy of when to make fu'ivla or not is a separate issue.

As noted, allowing full borrowings well marked
simplifies the algorithm enormously, since it
gives blocks of "whatever."

> >
> > > That means fu'ivla are being created in the right categories
> > > (such as animal names) where it is useful to have complete
> > > lists.
> >
> > Why is it "useful" when there is admittedly no "usage"?
>
> Because when I need a word for an animal that I don't talk very
> often about, I know where I can find a good suggestion, and I know
> that at least some others will be using the same word.

How do you know the latter unless the word is
already an established usage, in which case it is
not your wisest option to make it up? Making
fuhivla is remarkably random (even when you end
up with a legitimate case); direct borrowing is
safer.

> > >People don't tend to create them on-the-fly.
> >
> > They should be using la'o, and they wouldn't have to - they wouldn't
> > even need to know the Linnean name.
>
> But if they are going to use English words, why don't they just speak
> in English in the first place?
>
Er, that was Lojbab's point more or less. If
they want to talk Lojban, then they don't go
borrowing from some other language, even in a
thoroughly screwed-up way. Early Loglan usage
(well, of the 1976 sort) was almost all Loglish
or even Englan and that does not seem to be a
situation we want to encourage nowadays (nor
then, but it was harder to fight then); indeed,
we want to actively discourage it.


posts: 1912


> Since there is admittedly no algorithm at the
> moment, there is of course no ambiguity in the
> algorithm.

There are several candidate unambiguous algorithms.
One of them you can find in the page this is a discussion
of. The rest can be easily constructed therefrom.

What we don't have is an *official* algorithm yet,
but plenty of candidates.

> But there is no algorithm and, after
> more than 500 pages on these issues, no clear
> likelihood there will be one.

I'm fairly certain we will vote one in eventually,
in a couple of weeks or in a couple of months.

> This looks to be a
> problem and one that seems on inspection to be
> rooted in the (pointless since they are all just
> brivla) brouhaha about fuhivla and lujvo and, to
> a lesser extent, in cmevla.

The remaining significant issues are about what clusters
should be admissible, nothing harder than that.

> How much easier to
> say "anything goes" and leave it to nature to
> provide the parameters - on those rare
> occasions when something is actually needed.

That would be easier, but not quite Lojbanic.

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > Since there is admittedly no algorithm at the
> > moment, there is of course no ambiguity in the
> > algorithm.
>
> There are several candidate unambiguous algorithms.
> One of them you can find in the page this is a discussion
> of. The rest can be easily constructed therefrom.
>
> What we don't have is an *official* algorithm yet,
> but plenty of candidates.

As noted, candidates aren't elected and at the
present rate (the 500+ pages so far) the chances
of an election seem remote. The remaining issues
are mainly the ones that were there at the
beginning and are not only unresolved but without
any criteria for resolution — other than brute
force, of course. Which is the worst possible
way to do language design, though often what has
been used.

> > But there is no algorithm and, after
> > more than 500 pages on these issues, no clear
> > likelihood there will be one.
>
> I'm fairly certain we will vote one in eventually,
> in a couple of weeks or in a couple of months.
>
> > This looks to be a
> > problem and one that seems on inspection to be
> > rooted in the (pointless since they are all just
> > brivla) brouhaha about fuhivla and lujvo and, to
> > a lesser extent, in cmevla.
>
> The remaining significant issues are about what clusters
> should be admissible, nothing harder than that.

Except, as noted, that these have been the issues
that have generated the previous 500 pages.

> > How much easier to
> > say "anything goes" and leave it to nature to
> > provide the parameters - on those rare
> > occasions when something is actually needed.
>
> That would be easier, but not quite Lojbanic.

Ah, you do get the point! ("Paralojbanic things
are dealt with in paralojbanic ways" in case you
really didn't.)


posts: 1912


> The remaining issues
> are mainly the ones that were there at the
> beginning and are not only unresolved but without
> any criteria for resolution — other than brute
> force, of course.

I'm a bit more optimistic.

I'm already quite satisfied with the proposed solution for
permissible initial clusters. Other suggestions are of course
always welcome.

I will be happy with any definition on vowels that doesn't treat
cmene/fu'ivla/cmavo differently (except perhaps for 'y', which is
not permitted in fu'ivla). I tend towards permissiveness here, but
I can live with any globally applied rule.

I will be unhappy with any definition on consonant medial clusters,
because we are unfortunately stuck with syllabic consonants (because
of the bloody type-III fu'ivla). But Pierre and I have worked out
something we can both live with and so I expect something along those
lines will be approved once Nora has had a chance to review it.

mu'o mi'e xorxes






posts: 149

Jorge Llambías scripsit:

> By "diphthong" you mean any two vowels in a row? "ae" is not
> a diphthong in Spanish, it is two syllables. If allowed in Lojban
> words, it would also be two syllables, I don't think anyone disputes
> that.

I do. I mean that pairs of vowel letters (paired from the left) must
be one of those sixteen only. It also is the case that all of these
are pronounced as vowel+glide or glide+vowel, but that is phonology
rather than morphology.

> Are {.aiais.}, {praiai}, {laiai} morphologically acceptable words?

Yes, since vowels pair from the left.
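
As a sketch of how "pair from the left, each pair must be one of the sixteen" could be checked mechanically, consider the Python below. It assumes that apostrophes, commas, and consonants break a vowel run, and that the last vowel of an odd-length run is simply left unpaired. Both are assumptions of this illustration, and `bad_pairs` is a hypothetical name.

```python
import re

# The sixteen pairs listed earlier in the thread.
DIPHTHONGS = {
    "ai", "ei", "oi", "au",
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
    "iy", "uy",
}


def bad_pairs(word):
    """Return the left-paired vowel pairs that are not among the sixteen."""
    bad = []
    for run in re.findall(r"[aeiouy]+", word):
        # Pair letters 0-1, 2-3, ...; a leftover final vowel stays unpaired.
        for i in range(0, len(run) - 1, 2):
            pair = run[i:i + 2]
            if pair not in DIPHTHONGS:
                bad.append(pair)
    return bad


for w in [".aiais.", "praiai", "laiai", "kulnrkorea", "kulnrkore'a"]:
    print(w, bad_pairs(w))
```

On this reading {.aiais.}, {praiai} and {laiai} pass because each run splits into ai+ai, whereas {kulnrkorea} is rejected for its "ea" pair and {kulnrkore'a} is accepted.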

> > 3) The last two are freely usable in cmevla, and may be used in other
> > word types if needed for some morphological purpose, but should not be
> > freely usable.
>
> "freely usable" includes things like {.iaiys.}? {.aiys.}?

Yes.

> {niuiork} would be one. Then things like {kolombias}. We would have
> to require things like {nu'i'ork} and {kolombi'as}. There are a few
> fu'ivla that use them too, but not many. In jbovlaste I find
> only {mianma}, {nargrkaria}.

Most people who live here say "nu,iork" and "nu,iok" in varying mixtures;
I myself am not native, and say "nu,iork" always; "niu,iork" is a decidedly
foreign term. I take it also that "kolOmbi'as" is better than
"kolombi'as." But these are incidental.

I guess I have no problem with iV and uV preceded by a vowel (pair)
either: "cevni selmipri be fi le iaias. mesygri", e.g.

> {tropaiolo} and {smacrkobaiu} don't really have them because they
> are {ai,o} and {ai,u}, but we need to decide whether we allow these
> diphthong+vowel combinations too.

I think we do.

from another email in this thread

> OK, then you would make the pairs on p33-35 global restrictions?
> No cmene or fu'ivla with them? I would be quite fine with that.
> But then CLL has {kulnrkorea} on p64. If we allow {ea} in
> fu'ivla, why not {aa}?

I would make them global restrictions, and now think kulnrkorea should
be kulnrkore'a.

> > (In one of your posts you say you allow y'V and V,y in cmavo despite
> > those being explicitly limited to names in the section listing them.
>
> p35: 'Vowel pairs involving "y" appear only in Lojbanized names.
> They could appear in cmavo (structure words), but only ".y'y." is
> so used.'
>
> What does "they could appear in cmavo" mean? I take it to mean that
> something like {la'y} is morphologically a cmavo but that no actual
> cmavo uses that form.

That's what I meant.

> Indeed! Inconsistent rules (like forbidding the "mz" cluster)
> that have no possible reasonable justification are really ugly.

The justification is empirical, not rational. You may say the
experimental technique was lousy (it was), but criticizing it on
the ground of insufficient rationality misses the target.

> You are still missing the point, it seems to me. {me la'o gy brown gy}
> can never mean "x1 is brown in color", not because of any polysemy
> of "brown" but because la'o is for names.

I agree: it means x1 is Brown, not that x1 is brown.

> > All borrowings give "names" with no semantic content INTERNAL to the
> > language. Thus Type IV fu'ivla should be reserved for words that have
> > achieved enough usage that an internal meaning might have developed.
>
> That sounds like a catch-22.

Not really. It's elliptical for "if a Type III has become popular
and is felt to be clunky, coin a Type IV as a convenient abbreviation
for it; otherwise, don't coin Type IVs."

--
John Cowan www.ccil.org/~cowan cowan@ccil.org www.reutershealth.com
Police in many lands are now complaining that local arrestees are insisting
on having their Miranda rights read to them, just like perps in American TV
cop shows. When it's explained to them that they are in a different country,
where those rights do not exist, they become outraged. --Neal Stephenson


posts: 1912


> Jorge Llambías scripsit:
> > By "diphthong" you mean any two vowels in a row? "ae" is not
> > a diphthong in Spanish, it is two syllables. If allowed in Lojban
> > words, it would also be two syllables, I don't think anyone disputes
> > that.
>
> I do. I mean that pairs of vowel letters (paired from the left) must
> be one of those sixteen only. It also is the case that all of these
> are pronounced as vowel+glide or glide+vowel, but that is phonology
> rather than morphology.

That's relevant to the morphology though, because the morphology
needs to count syllables.

Would you allow {iea} and {ieai}, given that each pair paired
from the left is permissible?

> > > 3) The last two are freely usable in cmevla, and may be used in other
> > > word types if needed for some morphological purpose, but should not be
> > > freely usable.
> >
> > "freely usable" includes things like {.iaiys.}? {.aiys.}?
>
> Yes.

What about {iya}? {iay}?

> > {niuiork} would be one. Then things like {kolombias}. We would have
> > to require things like {nu'i'ork} and {kolombi'as}. There are a few
> > fu'ivla that use them too, but not many. In jbovlaste I find
> > only {mianma}, {nargrkaria}.
>
> Most people who live here say "nu,iork" and "nu,iok" in varying mixtures;
> I myself am not native, and say "nu,iork" always;

But that would be pronounced "Nwee-orck" in Lojban (commas are
indicative only, in this case they indicate incorrectly the
syllable break).

> "niu,iork" is a decidedly
> foreign term.

You could use {nuuiork} as a better approximation.

> I take it also that "kolOmbi'as" is better than
> "kolombi'as." But these are incidental.
>
> I guess I have no problem with iV and uV preceded by a vowel (pair)
> either: "cevni selmipri be fi le iaias. mesygri", e.g.

So your rule would be: any number of diphthongs in a row, possibly
with a final vowel, and the first diphthong can be glide+vowel
only initially. Is that right?


> from another email in this thread
>
> > Indeed! Inconsistent rules (like forbidding the "mz" cluster)
> > that have no possible reasonable justification are really ugly.
>
> The justification is empirical, not rational. You may say the
> experimental technique was lousy (it was), but criticizing it on
> the ground of insufficient rationality misses the target.

That was JCB's justification, but Lojban's is the unreasonable
"that's what JCB did". Lojban does not follow Loglan's morphology
in all the details, many things were changed, so not changing
that particular restriction was, IMO, unreasonable. That's the one
that sticks out the most among the global consonant restrictions.
The second worst is ntc/ndj/nts/ndz.

mu'o mi'e xorxes






posts: 162

Jorge Llambías wrote:
>>Which is why we added in Type III fu'ivla, which has a purpose to
>>specifically answer that problem, as stated at the bottom of page 61.
>
> You are still missing the point, it seems to me. {me la'o gy brown gy}
> can never mean "x1 is brown in color", not because of any polysemy
> of "brown" but because la'o is for names.

CLL says otherwise, with a reason...
It may be contradictory, but CLL explicitly says that la'o ... and la
.... are Type I and Type II borrowings. The reason why is that all
borrowings are ultimately names for concepts or things, and indeed all
*words* are names for concepts or things.


> That's fine. {turkoise} could also work as a fu'ivla for that
> (although I would consider basing a borrowing for a color on an English
> word very bad form). But {me la'o gy turquoise gy} could never
> mean "x1 is of color turquoise", because it means "x1 is/are the one(s)
> named Turquoise".

All words ultimately are names - names for things, names for concepts.
(I believe And Rosta took an even stronger stance on this concept - I
think he wanted to entirely eliminate the distinction between names and
brivla). The name of the color turquoise is "turquoise" in English.
The real weakness of using a mela'o borrowing is that it has a fixed and
almost meaningless place structure that is applicable to all words that
are names. But most borrowings are used precisely according to that
place-structure - i.e. as "names".

>>Meanwhile, type IV fu'ivla reintroduce the same problem, because without
>>the classifier, no one knows what a krokodilo is any more than they
>>would know what a la'o gy crocodile gy is. It might be the name of that
>>Australian character in the movie Crocodile Dundee.
>>
>>??krokodilo: x1 is a corny Australian macho movie character of
>>species/breed x2
>
> In my experience lojbanists have more common sense than that, but
> yes, in principle it could mean anything. {skarnturkoise} could
> also be the name of some strange foodstuff for all I know, because
> rafsi classifiers are only helpful hints, not definitory.

They aren't definitory, but they shouldn't be misleading like that.

>>>{la'o} does not just give a sumti, it gives a *name*, with no
>>>semantic content.
>>
>>Linnean names are "names" and thus appropriately quoted with la'o (and
>>indeed this is explicitly stated in the section on la'o).
>
> Yes, but then you need to say {danlu be la'o ...}, not just
> {me la'o ...}.

Why? What else does me la'o mean?

>>Color names
>>are also names. Just because they aren't proper nouns, doesn't make
>>them non-names.
>
> But we don't want to say "x1 is/are the one(s) named Turquoise", we
> want to say that x1 has a certain color. We don't want the named
> abstraction in x1.

All things of the color turquoise can be "named" "turquoise".

lojbab




posts: 162

> By "diphthong" you mean any two vowels in a row? "ae" is not
> a diphthong in Spanish, it is two syllables. If allowed in Lojban
> words, it would also be two syllables, I don't think anyone disputes
> that.

Then you should be writing it "a,e" so we know what you are talking about.

There was an alternate orthography that allowed omission of the
close-commas for JCB fans, but so far as I know, no one has chosen to
use it. If they do, it should be an all-or-nothing thing, not a matter of
picking which rules you want to use.

>>2) The first four are freely usable in all word types except gismu,
>>where they are excluded by construction.
>
> Yes. But are they usable next to each other? Are {.aiais.}, {praiai},
> {laiai} morphologically acceptable words?

They require close-commas, paired from the left, according to CLL.


>>I am undecided about the appropriate restrictions on the remaining ten
>>diphthongs (iV and uV). The safest restriction (that is, the least
>>threatening to pronounceability) would be to restrict them to initial
>>position only. I do not know how many attested cmevla and fu'ivla this
>>would outlaw, and would like to find out; if the number is not too large,
>>I would favor it.
>
>
> {niuiork} would be one. Then things like {kolombias}. We would have
> to require things like {nu'i'ork} and {kolombi'as}. There are a few

nu,iork and kolombi,as

> fu'ivla that use them too, but not many. In jbovlaste I find
> only {mianma}, {nargrkaria}.
>
> {tropaiolo} and {smacrkobaiu} don't really have them because they
> are {ai,o} and {ai,u}, but we need to decide whether we allow these
> diphthong+vowel combinations too.

close-comma-marked vowel-pairings have been explicitly permitted in
names since TLI Loglan days (and indeed I notice that we have examples
saying that any name with three or more vowels must be paired from the
left). I would prefer to require explicit marking so that someone
doesn't accidentally end up with a string that if paired from the left
gives a non-Lojban pair. Whether they are allowed elsewhere besides
names is, I think, subject to debate - I don't think we explicitly
permit it.

lojbab



posts: 162

John Cowan wrote:
>>You are still missing the point, it seems to me. {me la'o gy brown gy}
>>can never mean "x1 is brown in color", not because of any polysemy
>>of "brown" but because la'o is for names.
>
> I agree: it means x1 is Brown, not that x1 is brown.

What's the difference, really?

lojbab



posts: 162

Jorge Llambías wrote:

>>>{niuiork} would be one. Then things like {kolombias}. We would have
>>>to require things like {nu'i'ork} and {kolombi'as}. There are a few
>>>fu'ivla that use them too, but not many. In jbovlaste I find
>>>only {mianma}, {nargrkaria}.
>>
>>Most people who live here say "nu,iork" and "nu,iok" in varying mixtures;
>>I myself am not native, and say "nu,iork" always;
>
>
> But that would be pronounced "Nwee-orck" in Lojban (commas are
> indicative only, in this case they indicate incorrectly the
> syllable break).

Close-commas force a syllable break. They cannot "incorrectly indicate"
a syllable break.

What you may be confusing is the rule that says that
na,iork.
and
nai,ork.

are considered the same word because a close-comma is not indicative of
a distinct word. But it IS indicative of intended pronunciation, all
the more so because *"nui,ork" should be invalid, while "nu,iork" is valid.


>>"niu,iork" is a decidedly
>>foreign term.
>
> You could use {nuuiork} as a better approximation.

Only if you allow uu diphthong after n, which John and I oppose.

>>The justification is empirical, not rational. You may say the
>>experimental technique was lousy (it was), but criticizing it on
>>the ground of insufficient rationality misses the target.
>
> That was JCB's justification, but Lojban's is the unreasonable
> "that's what JCB did". Lojban does not follow Loglan's morphology
> in all the details, many things were changed, so not changing
> that particular restriction was, IMO, unreasonable.

Actually, we reconsidered all of JCB's restrictions, and we had two
speakers skilled in multiple languages in on the discussion (Tommy and
Gary). But when in doubt, we stuck with JCB's rules as the default.

lojbab




posts: 1912


> All words ultimately are names - names for things, names for concepts.
> (I believe And Rosta took an even stronger stance on this concept - I
> think he wanted to entirely eliminate the distinction between names and
> brivla).

I would be in favour of eliminating the grammatical distinction
between them, so that {lojbab} would be a predicate meaning
"x1 is/are the one(s) named 'lojbab'". But there would still
be two different predicates for:

"x1 is brown in colour" and "x1 is the one named Brown",
say {bunre} and {braun}.

> The name of the color turquoise is "turquoise" in English.

Yes but that's not the name of things of that color, and not
everything named Turquoise need be turquoise. (It may be
the name of a person, for example.)

{me la'o gy Turquoise gy} applies to a person named Turquoise.
{skarnturko} applies to things of a certain bluish green or
greenish blue color. They are not equivalent predicates by a
long shot.

> > But we don't want to say "x1 is/are the one(s) named Turquoise", we
> > want to say that x1 has a certain color. We don't want the named
> > abstraction in x1.
>
> All things of the color turquoise can be "named" "turquoise".

Yes, and anything not of that color can be named that too. But
predicates are a different matter. Saying that something is
turquoise is different from saying that it is named "Turquoise".
{la'o} only works for saying the latter. An appropriate brivla
will let you say the former.

mu'o mi'e xorxes






posts: 1912


> > If allowed in Lojban
> > words, it would also be two syllables, I don't think anyone disputes
> > that.
>
> Then you should be writing it "a,e" so we know what you are talking about.

p32: 'Commas are never required: no two Lojban words differ solely
because of the presence or placement of a comma.'

The book also says:
p32 'it is always legal to use the apostrophe sound in pronouncing
a comma'

but we've established that that is not correct, because it would
cause ambiguities.

> nu,iork and kolombi,as

are the same words as {nuiork} and {kolombias}.

> close-comma-marked vowel-pairings have been explicitly permitted in
> names since TLI Loglan days (and indeed I notice that we have examples
> saying that any name with three or more vowels must be paired from the
> left.

*Must*. Therefore {nui,ork} is the only choice for that triple.


mu'o mi'e xorxes






posts: 1912


> What you may be confusing is the rule that says that
> na,iork.
> and
> nai,ork.
>
> are considered the same word because a close-comma is not indicative of
> a distinct word. But it IS indicative of intended pronunciation, all
> the more so because *"nui,ork" should be invalid, while "nu,iork" is valid.

Can the same word be both valid and invalid?

> >
> > You could use {nuuiork} as a better approximation.
>
> Only if you allow uu diphthong after n, which John and I oppose.

{nauiork} then.

mu'o mi'e xorxes






posts: 2388

There seems to be a certain philosophic confusion
going on here:



> Jorge Llambías wrote:
> >>Which is why we added in Type III fu'ivla, which has a purpose to
> >>specifically answer that problem, as stated at the bottom of page 61.
> >
> > You are still missing the point, it seems to me. {me la'o gy brown gy}
> > can never mean "x1 is brown in color", not because of any polysemy
> > of "brown" but because la'o is for names.
>
> CLL says otherwise, with a reason...
> It may be contradictory, but CLL explicitly says that la'o ... and la
> ... are Type I and Type II borrowings. The reason why is that all
> borrowings are ultimately names for concepts or things, and indeed all
> *words* are names for concepts or things.

Right, more or less. "brown" is the name of the
color brown, "run" is the name of the action
running and so on. But they are not thereby the
name of things which *are* brown or which *do*
run. Now, it may be, for example, that among its
many shifting meanings {me} means "x1 is among the
things to which x2 (a name) applies." This still
does not help much since what follows in the
examples is not a name but the name of a name, as
in {mi se cmene la'o gy John gy}, so the {me}
construction would turn out to apply correctly
only to names of the color brown.
As usual at this point I suggest scrapping all of
this and going to borrowing words directly, if at
all, into a cmene/brivla category whose role is
determined by the rest of the sentence ({doi} and
{la} and the like or {le} and other brivla).
And, of course, do it at all only under duress.

> > That's fine. {turkoise} could also work as a fu'ivla for that
> > (although I would consider basing a borrowing for a color on an English
> > word very bad form). But {me la'o gy turquoise gy} could never
> > mean "x1 is of color turquoise", because it means "x1 is/are the one(s)
> > named Turquoise".
>
> All words ultimately are names - names for things, names for concepts.

But we want to talk about things other than
concepts and things (in this sense)
and expressions like {la'o gy turquoise gy} are
not ways to talk about them.

> (I believe And Rosta took an even stronger stance on this concept - I
> think he wanted to entirely eliminate the distinction between names and
> brivla). The name of the color turquoise is "turquoise" in English.
> The real weakness of using a mela'o borrowing is that it has a fixed and
> almost meaningless place structure that is applicable to all words that
> are names. But most borrowings are used precisely according to that
> place-structure - i.e. as "names".
>
> >>Meanwhile, type IV fu'ivla reintroduce the same problem, because without
> >>the classifier, no one knows what a krokodilo is any more than they
> >>would know what a la'o gy crocodile gy is. It might be the name of that
> >>Australian character in the movie Crocodile Dundee.
> >>
> >>??krokodilo: x1 is a corny Australian macho movie character of
> >>species/breed x2
> >
> > In my experience lojbanists have more common sense than that, but
> > yes, in principle it could mean anything. {skarnturkoise} could
> > also be the name of some strange foodstuff for all I know, because
> > rafsi classifiers are only helpful hints, not definitory.
>
> They aren't definitory, but they shouldn't be misleading like that.
>
> >>>{la'o} does not just give a sumti, it gives a *name*, with no
> >>>semantic content.

Actually, it is the name of a name with only
naming semantic content.

> >>Linnean names are "names" and thus appropriately quoted with la'o (and
> >>indeed this is explicitly stated in the section on la'o).
> >
> > Yes, but then you need to say {danlu be la'o ...}, not just
> > {me la'o ...}.
>
> Why? What else does me la'o mean?
>
> >>Color names
> >>are also names. Just because they aren't proper nouns, doesn't make
> >>them non-names.
> >
> > But we don't want to say "x1 is/are the one(s) named Turquoise", we
> > want to say that x1 has a certain color. We don't want the named
> > abstraction in x1.
>
> All things of the color turquoise can be "named" "turquoise".



posts: 2388


wrote:

>
> --- Robert LeChevalier wrote:
> > All words ultimately are names - names for things, names for concepts.
> > (I believe And Rosta took an even stronger stance on this concept - I
> > think he wanted to entirely eliminate the distinction between names and
> > brivla).
>
> I would be in favour of eliminating the grammatical distinction
> between them, so that {lojbab} would be a predicate meaning
> "x1 is/are the one(s) named 'lojbab'". But there would still
> be two different predicates for:
>
> "x1 is brown in colour" and "x1 is the one named Brown",
> say {bunre} and {braun}.
>
> > The name of the color turquoise is "turquoise" in English.
>
> Yes but that's not the name of things of that color, and not
> everything named Turquoise need be turquoise. (It may be
> the name of a person, for example.)
>
> {me la'o gy Turquoise gy} applies to a person named Turquoise.

Well, assuming (usually erroneously) that {me} has
some fixed meaning, it should mean "x1 is an
instance of the word 'turquoise'." I think your
change is more useful (certainly than "is a part
of the word"). Still not a very good solution.

> {skarnturko} applies to things of a certain bluish green or
> greenish blue color. They are not equivalent predicates by a
> long shot.
>
> > > But we don't want to say "x1 is/are the one(s) named Turquoise", we
> > > want to say that x1 has a certain color. We don't want the named
> > > abstraction in x1.
> >
> > All things of the color turquoise can be "named" "turquoise".
>
> Yes, and anything not of that color can be named that too. But
> predicates are a different matter. Saying that something is
> turquoise is different from saying that it is named "Turquoise".
> {la'o} only works for saying the latter. An appropriate brivla
> will let you say the former.
>
> mu'o mi'e xorxes



posts: 1912


> > {me la'o gy Turquoise gy} applies to a person
> > named Turquoise.
>
> Well, assuming (usually erroneously) that {me} has
> some fixed meaning, it should mean "x1 is an
> instance of the word 'turquoise'." I think your
> change is more useful (certainly than "is a part
> of the word"). Still not a very good solution.

No, no, la'o gy Turquoise gy is not the word
"turquoise", it's the one *named* "turquoise".
{zoi} is for quoting foreign words/texts.
{la'o} is for referring to someone or something
named with a foreign word/text. {la'o} is like {la},
{zoi} like {zo}.

mu'o mi'e xorxes








posts: 381

In a message dated 2005-02-22 4:03:20 PM Eastern Standard Time,
Llambíasjjllambias2000@yahoo.com.ar writes:


> > Most people who live here say "nu,iork" and "nu,iok" in varying mixtures;
> > I myself am not native, and say "nu,iork" always;
>
> But that would be pronounced "Nwee-orck" in Lojban (commas are
> indicative only, in this case they indicate incorrectly the
> syllable break).
>

That sounds ludicrous. The comma determines (i.e., forces) the syllable
break; it is not a mere indicator of one. Otherwise the comma has no
function.

stevo

posts: 1912


> In a message dated 2005-02-22 4:03:20 PM Eastern Standard Time,
> Llambíasjjllambias2000@yahoo.com.ar writes:
>
> > > Most people who live here say "nu,iork" and "nu,iok" in varying mixtures;
> > > I myself am not native, and say "nu,iork" always;
> >
> > But that would be pronounced "Nwee-orck" in Lojban (commas are
> > indicative only, in this case they indicate incorrectly the
> > syllable break).
>
> That sounds ludicrous. The comma determines (i.e., forces) the syllable
> break, not as a mere indicator of a syllable break. Otherwise the comma has
> no function.

It doesn't have any function from the point of view of the morphology:
it is never required, and it can never change a word.

Its only possible function is as a guide for the reader on where
the syllable breaks occur, but the syllable breaks are already
determined by another rule (pairing from the left), so the comma
can do no more than show that more clearly. If it is used against
that rule, it is simply misleading.
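
To make the pairing-from-the-left reading concrete, here is a minimal Python sketch. It only illustrates the claim above that syllable breaks in a vowel string follow from pairing and that a comma cannot alter the word. The function name `pair_from_left` is hypothetical, and treating just a, e, i, o, u as vowels (leaving out y) is a simplifying assumption; the sketch makes no judgment about whether the resulting pairs are permitted diphthongs.

```python
import re

# Assumption of this sketch: only these five letters count as vowels.
VOWELS = "aeiou"


def pair_from_left(word):
    """Return the word with commas showing vowel pairs grouped from the left."""
    # Commas never distinguish words, so any commas already present are dropped.
    bare = word.replace(",", "")

    def split_run(match):
        run = match.group(0)
        # Cut the vowel run into chunks of two, starting at its left edge.
        pairs = [run[i:i + 2] for i in range(0, len(run), 2)]
        return ",".join(pairs)

    # Apply the pairing to every maximal run of vowels in the word.
    return re.sub("[" + VOWELS + "]+", split_run, bare)


if __name__ == "__main__":
    for form in ["naiork", "nu,iork", "nuiork", "nauiork", "kolombias"]:
        print(form, "->", pair_from_left(form))
```

Under this reading `nu,iork` and `nuiork` both come out as `nui,ork`, which is exactly the form whose validity is disputed in this thread.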

mu'o mi'e xorxes






posts: 162

Jorge Llambías wrote:
> I would be in favour of eliminating the grammatical distinction
> between them, so that {lojbab} would be a predicate meaning
> "x1 is/are the one(s) named 'lojbab'". But there would still
> be two different predicates for:
>
> "x1 is brown in colour" and "x1 is the one named Brown",
> say {bunre} and {braun}.
>
>>The name of the color turquoise is "turquoise" in English.
>
> Yes but that's not the name of things of that color,

Of course it is. The name of anything is whatever I call it. I call
turquoise things "turquoise". I don't capitalize it, because this isn't
German.

> and not
> everything named Turquoise need be turquoise. (It may be
> the name of a person, for example.)

True. But for nonce use, which is what we are talking about, it is good
enough to say, "this is a concept that is named in some other language
'turquoise'"

> {me la'o gy Turquoise gy} applies to a person named Turquoise.

No. It applies to anything named "Turquoise", which includes the
English language concept "turquoise". If that concept isn't called
"turquoise" we would use some other way of expressing it, which would be
the name instead.

A name is just a word. So is an undefined nonce fu'ivla. They are both
words to name a concept, though the name is a different part of
speech. This is the exact inverse of the "Afraid-of-his-horses" thread
where a description is the same as a name. la turkoise = la'o gy.
turquoise gy. = thing described by the name turquoise

JCB used this in ancient days thereby justifying the addition of "me"
with "me la kraisler karce" which is exactly the same semantic
construction as me la'o gy. turquoise gy. karce
or "ti me la kraisler" in parallel to "ti me la'o gy. turquoise gy."

"This is a Chrysler" means "this is something for which 'Chrysler' names
a predicate pertaining to that which is being referred to".

In short, when working at the non-level of abstraction that is
borrowing, the Alice in Wonderland differences between names, what
things are called, and what they are, collapse to a singularity.

> {skarnturko} applies to things of a certain bluish green or
> greenish blue color.

No it doesn't. At best, given the prefix, it applies to things
pertaining in some unspecified way to a color NAME which is Lojbanized
as "turko". As a nonce-fu'ivla, the word has no defined place
structure, other than some association with the name for a concept
"turko". If we had some sort of dikyfu'ivla thing going whereby the
prefix rafsi made a specific kind of claim about the referent (i.e. if
we use skarn- as a prefix then we know that the word is a color, and not
just a term related to colors in some way), then we might be able to
make rules defining place structures. But we can't because borrowing is
potentially open-ended, and we need the classifiers more importantly to
classify the name of the term, not to define its place structure.

> They are not equivalent predicates by a long shot.

At the level of abstraction being used by borrowing, they are. The
English word "turquoise" is not a brivla - it is the adjectival name of
a family of concepts. This is more clear for the English word "run"
which is the name of a family of both noun and verb concepts, not
necessarily having a lot to do with each other. "la'o gy run gy" is
essentially equivalent to the English word "run", i.e. it is some
referent of the string "zoi gy run gy", i.e. "la'e zoi gy run gy"

>>>But we don't want to say "x1 is/are the one(s) named Turquoise", we
>>>want to say that x1 has a certain color. We don't want the named
>>>abstraction in x1.
>>
>>All things of the color turquoise can be "named" "turquoise".
>
> Yes, and anything not of that color can be named that too.

Yep.

> But
> predicates are a different matter. Saying that something is
> turquoise is different from saying that it is named "Turquoise".
> {la'o} only works for saying the latter. An appropriate brivla
> will let you say the former.

A nonce borrowing says NOTHING in Lojban because it isn't really a
Lojban word, and has no Lojban "predicate" meaning. It is merely a
place holder for the name for a concept which we are metaphorically
using as a predicate hoping that our listener who is familiar with the
other-language can grok our intended place structure from the context.

lojbab




posts: 162

Jorge Llambías wrote:
> --- John E Clifford wrote:
>
>>>{me la'o gy Turquoise gy} applies to a person
>>>named Turquoise.
>>
>>Well, assuming (usually erroneously) that {me} has
>>some fixed meaning, it should mean "x1 is an
>>instance of the word 'turquoise'." I think your
>>change is more useful (certainly than "is a part
>>of the word"). Still not a very good solution.
>
>
> No, no, la'o gy Turquoise gy is not the word
> "turquoise", it's the one *named* "turquoise".
> {zoi} is for quoting foreign words/texts.
> {la'o} is for referring to someone or something
> named with a foreign word/text. {la'o} is like {la},
> {zoi} like {zo}.

Or in other words, as I said elsewhere, "la'o" is "la'e zoi".





posts: 162

Jorge Llambías wrote:
> --- Bob LeChevalier wrote:
>
>>>If allowed in Lojban
>>>words, it would also be two syllables, I don't think anyone disputes
>>>that.
>>
>>Then you should be writing it "a,e" so we know what you are talking about.
>
>
> p32: 'Commas are never required: no two Lojban words differ solely
> because of the presence or placement of a comma.'
>
> The book also says:
> p32 'it is always legal to use the apostrophe sound in pronouncing
> a comma'
>
> but we've established that that is not correct, because it would
> cause ambiguities.
>
>
>>nu,iork and kolombi,as
>
>
> are the same words as {nuiork} and {kolombias}.

Yes, except that the latter are invalid words without the comma.

>>close-comma-marked vowel-pairings have been explicitly permitted in
>>names since TLI Loglan days (and indeed I notice that we have examples
>>saying that any name with three or more vowels must be paired from the
>>left.
>
> *Must*. Therefore {nui,ork} is the only choice for that triple.

False.

Using my alternate choice which is valid either way
"na,iork" = "nai,ork" even though they are pronounced differently
"nu,iork" would be the same word as "nui,ork" if the latter was an
allowed pronunciation in Lojban.

The reason for the doubletalk was for those people who were offended by
the close-comma. I wanted close-commas to be always used except when no
ambiguity was possible. Others wanted to leave them off except when
disambiguation was necessary. In the final analysis since we were
talking only about Lojbanized names, which are themselves only an
approximation to the source language, I said "fine, let it work both
ways", so that it did not matter whether you used the comma or not - it
would be the same word however you pronounced it. But it still had to
be a legal pronunciation. Thus "nai,ai,aim" is the same word as
"na,ia,ia,im" and it doesn't hurt if you leave the commas out, but that
doesn't make "nu'iu,iu,im" the same as the invalid by John's
understanding "nui,ui,uim".

JCB did more or less the same thing that I did, when he said that words
like "nui" could be pronounced "nu,i" or as a single syllable and they
would be the same word. We rejected that alternation except in names
because of the penultimate stress rule. It was too easy to imagine a
lujvo such as "bralui" which could be pronounced "bra,LU,i" or "BRA,lui"
and that was simply gross abuse of a listener. In Lojbanized names,
which are only semi-Lojban, ambiguous syllabification is less of a concern
- if you are concerned, you choose a name that is not ambiguous in
syllabification.

We also figured to make the apostrophe and comma interchangeable IN
NAMES, but that got poorly worded on p32 because as with most of the
rules, we simply did not consider the effects of anything we said on
fu'ivla, which have always been an afterthought.

Nora has said on the side, but not had time to write, that she wants
even more restrictions on vowel strings than we now have, and she has
convinced me. She does not want any situation where consecutive
identical vowels occur, even when one of them is part of a diphthong.
Thus she doesn't like the currently legal possibility of "au,u" which
she considers as bad as "u,u" (or "a,a" which we recently talked about).
How to word that, she hasn't yet figured out.

The ultimate goal is to have Lojbanized names be as flexible as possible
while still preserving Lojban pronounceability with no stretching of a
Lojbanist's capability to pronounce things beyond those things that are
part of the regular language. Since I think of fu'ivla as merely
another kind of name, per our other discussion, though having the
grammar of a brivla, I would be inclined to use similar rules for Type
III fu'ivla as for names, possibly even with ambiguous stress since they
are intended to be made on the nonce and we want there to be no obstacles.

For Type IV fu'ivla, I would be much stricter - those words have the
time to be made with thought and thus should be able to avoid any sort
of pitfall, approximating the rules for lujvo and cmavo in
restrictiveness based on pronunciation. Thus, even if we have at the
moment made "iglu" work as a Type IV fu'ivla, I would rather see it
banned because the question kept coming up whether it really worked
under all conditions, and the rethinking was painful. The rules for
wordmaking should be as painless as possible.

lojbab




posts: 162

Jorge Llambías wrote:
> --- Bob LeChevalier wrote:
>
>>What you may be confusing is the rule that says that
>>na,iork.
>>and
>>nai,ork.
>>
>>are considered the same word because a close-comma is not indicative of
>>a distinct word. But it IS indicative of intended pronunciation, all
>>the more so because *"nui,ork" should be invalid, while "nu,iork" is valid.
>
> Can the same word be both valid and invalid?

You are losing the distinction between wordform and word.

The invalid wordform is invalid, and therefore is not a "word". If you
use "a" instead of "u", you get two wordforms but it is only one word.

The same can be said of current lujvo:

carmymau
carmyzmadu
camymau
camyzmadu
are all equally valid wordforms - they are one and the same "word"

*cammau
*camzmadu

are invalid wordforms for the same word, even though if not for the
phonology limitations, we could break them down just as easily.
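
The difference between the valid and invalid wordforms above can be checked mechanically, at least in part. The following Python sketch tests only two consonant restrictions: doubled consonants, and the "mz" cluster mentioned earlier in this thread. Presumably these are what rule out *cammau and *camzmadu; that attribution, the function name `forbidden_pairs`, and the decision to ignore every other cluster rule are assumptions of this sketch, which is not a full morphology check.

```python
# The 17 Lojban consonant letters.
CONSONANTS = set("bcdfgjklmnprstvxz")


def forbidden_pairs(wordform):
    """List adjacent consonant pairs that break the two rules checked here."""
    bad = []
    for first, second in zip(wordform, wordform[1:]):
        if first in CONSONANTS and second in CONSONANTS:
            # Rule 1 (assumed cause for *cammau): no doubled consonant.
            # Rule 2 (assumed cause for *camzmadu): the "mz" cluster is banned.
            if first == second or first + second == "mz":
                bad.append(first + second)
    return bad


if __name__ == "__main__":
    forms = ["carmymau", "carmyzmadu", "camymau", "camyzmadu",
             "cammau", "camzmadu"]
    for form in forms:
        print(form, forbidden_pairs(form))
```

The four forms with the y hyphen pass because the y separates the offending consonants, while the two unhyphenated forms each report one forbidden pair.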

"nai,ork" and "na,iork" are two valid wordforms for the same word, so
you can leave off the comma and cause no problem. Omitting the comma
means that it will be pronounced the first way, and that is the same
word as the second.

"nu,iork" is valid and *"nui,ork" is invalid, so you cannot omit the
comma unless you are using the alternate orthography.


>>>You could use {nuuiork} as a better approximation.
>>
>>Only if you allow uu diphthong after n, which John and I oppose.
>
>
> {nauiork} then.

that would be valid, and it would be the same name whether it is
nau,iork
na,ui,ork
na,u,iork
nau,i,ork
or
na,u,i,ork

And given the interpretation that was intended for the p32 rule, you
could replace any or all of the commas by apostrophe and it would STILL
be the same name.
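
The equivalence described above amounts to comparing names after removing commas. Here is a minimal Python sketch of that idea; the helper names `canonical` and `same_name` are hypothetical, and the `apostrophe_as_comma` flag models only the intended reading of the p32 rule for names, which is disputed elsewhere in this thread.

```python
def canonical(form, apostrophe_as_comma=False):
    """Reduce a written name to a comma-free form for comparison."""
    if apostrophe_as_comma:
        # Assumed reading of the p32 rule, for names only:
        # an apostrophe is interchangeable with a comma.
        form = form.replace("'", ",")
    return form.replace(",", "")


def same_name(a, b, apostrophe_as_comma=False):
    """True when two written forms count as one name under this model."""
    return (canonical(a, apostrophe_as_comma)
            == canonical(b, apostrophe_as_comma))


if __name__ == "__main__":
    forms = ["nau,iork", "na,ui,ork", "na,u,iork", "nau,i,ork", "na,u,i,ork"]
    print(all(same_name(f, "nauiork") for f in forms))
    print(same_name("na'u,iork", "nauiork", apostrophe_as_comma=True))
```

The sketch says nothing about whether each form is a legal pronunciation; it only captures the claim that comma placement does not create a different name.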

But we never intended this sort of wishywashiness to be used in "real
Lojban" words, i.e. brivla, as opposed to Lojbanizations of other
languages. Thus I would tolerate as much of the above as can be made to
work in Type III fu'ivla which are nonce Lojbanizations, but none of it
in Type IVs which should be carefully made and thus can be more restrictive.

lojbab




posts: 1912


> Jorge Llambías wrote:
> > --- John E Clifford wrote:
> >
> >>>{me la'o gy Turquoise gy} applies to a person
> >>>named Turquoise.
> >>
> >>Well, assuming (usually erroneously) that {me} has
> >>some fixed meaning, it should mean "x1 is an
> >>instance of the word 'turquoise'." I think your
> >>change is more useful (certainly than "is a part
> >>of the word"). Still not a very good solution.
> >
> >
> > No, no, la'o gy Turquoise gy is not the word
> > "turquoise", it's the one *named* "turquoise".
> > {zoi} is for quoting foreign words/texts.
> > {la'o} is for referring to someone or something
> > named with a foreign word/text. {la'o} is like {la},
> > {zoi} like {zo}.
>
> Or in other words, as I said elsewhere, "la'o" is "la'e zoi".

No, not the same thing. {la'e zoi} uses the meaning of the
quoted word. {la'o} only uses the word as a label.

mu'o mi'e xorxes







posts: 162

John E Clifford wrote:
> There seems to be a certain philosophic confusion
> going on here:
> --- Robert LeChevalier <lojbab@lojban.org> wrote:

> Right, more or less. "brown" is the name of the
> color brown, "run" is the name of the action
> running and so on. But they are not thereby the
> name of things which *are* brown or which *do*
> run.

Of course they are. A name is something that someone calls something.
I have heard real English speakers who are talking to people address a
guy in a brown shirt as "Hey, you, Brown, move to the left." It is a
little more problematic with verbs because English grammar gets in the
way, but the Amerind languages apparently would have no problem.

> Now, it may be, for example, that among its
> many shifting meanings {me}, "x1 is among the
> things to which x2 (a name) applies." This still
> does not help much since what follows in the
> examples is not a name but the name of a name, as
> in {mi se cmene la'o gy John gy},

no
la'o gy John gy se cmene zoi gy John gy

> As usual at this point I suggest scrapping all of
> this and go to borrowing words directly — if at
> all — into a cmene/brivla category whose role is
> determined by the rest of the sentence ({doi} and
> {la} and the like or {le} and other brivla).

That was more or less the purpose of adding la'o to the language, but
disambiguation required that we call it a sumti and use me to convert it
to a brivla (and other converters as necessary) for grammatical purposes.

Somehow or another, Jorge got the idea that it was primarily intended
for names, but it was added as part of the fu'ivla stages design.

>>All words ultimately are names - names for
>>things, names for concepts.
>
> But we want to talk about things other than
> concepts and things (in this sense)

I don't know what else there is - I added "concepts" to my formulation
to refer to any referent of a word that wasn't precisely a "thing".

> and expressions like {la'o gy turquoise gy} are
> not ways to talk about them.

When they are borrowings not yet incorporated into the language, they
are *all* names for concepts or things. There is no semantics other
than what is transferred from the source language, and the Lojbanization
of that semantic transference is completely ad hoc. Borrowed words mean
whatever we want them to mean so long as they communicate. Which is
precisely the same thing we can say about names: you can use whatever
name you want for something so long as you communicate your referent.

lojbab



posts: 1912


> Jorge Llambías wrote:
> >>nu,iork and kolombi,as
> > are the same words as {nuiork} and {kolombias}.
>
> Yes, except that the latter are invalid words without the comma

That's not what CLL says, and that was not the consensus here
when this was discussed a few weeks ago.

Currently, the PEG parser pays no attention to commas, it will
not reject {nuiork} or {kolombias} as invalid.

> The ultimate goal is to have Lojbanized names be as flexible as possible
> while still preserving Lojban pronunciability with no stretching of a
> Lojbanist's capability to pronounce things beyond those things that are
> part of the regular language.

That's why cmene should not have any special rules beyond requiring
a final consonant. Anything else is stretching a Lojbanist's
capability to pronounce things.

> Since I think of fu'ivla as merely
> another kind of name, per our other discussion, though having the
> grammar of a brivla, I would be inclined to use similar rules for Type
> III fu'ivla as for names, possibly even with ambiguous stress since they
> are intended to be made on the nonce and we want there to be no obstacles.

Stress plays no role in cmene, but we can't allow ambiguous stress
in fu'ivla, even Type III. That should be out of the question. Otherwise
we run into questions like {brodrkAuti}, is that {bro,dr,KA,u ti} or
{bro,dr,KAU,ti}? I refuse to introduce even more complex rules to deal
with such things.

> For Type IV fu'ivla, I would be much stricter - those words have the
> time to be made with thought and thus should be able to avoid any sort
> of pitfall, approximating the rules for lujvo and cmavo in
> restrictiveness based on pronunciation. Thus, even if we have at the
> moment made "iglu" work as a Type IV fu'ivla, I would rather see it
> banned because the question kept coming up whether it really worked
> under all conditions, and the rethinking was painful. The rules for
> wordmaking should be as painless as possible.

{.iglu} doesn't contain anything that doesn't appear in cmavo
and gismu, so it should not cause any problems with pronunciation.

mu'o mi'e xorxes






posts: 1912



> Somehow or another, Jorge got the idea that it was primarily intended
> for names, but it was added as part of the fu'ivla stages design.

From the ma'oste:

la'o ZOI the non-Lojban named
delimited non-Lojban name; the resulting quote sumti is treated as a name

Maybe that put me on the wrong track?

mu'o mi'e xorxes






posts: 1912


> You are losing the distinction between wordform and word

So your position is that commas should have an impact on the
morphology? i.e. the presence/absence of a comma could change
a wordform from valid to invalid or vice versa?

That would mean making the morphology more complicated, but it
can be done. First of course we need to figure out all the necessary
comma rules that would be involved.

> But we never intended this sort of wishywashiness to be used in "real
> Lojban" words, i.e. brivla, as opposed to Lojbanizations of other
> languages. Thus I would tolerate as much of the above as can be made to
> work in Type III fu'ivla which are nonce Lojbanizations, but none of it
> in Type IVs which should be carefully made and thus can be more restrictive.

At one point I had a separate section of the morphology for
type III fu'ivla, so that syllabic consonants were allowed only
in them, but currently all morphological fu'ivla follow the same
rules. I tend to prefer simpler global rules over specific detailed
complications for each separate word class.

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > > {me la'o gy Turquoise gy} applies to a person
> > > named Turquoise.
> >
> > Well, assuming (usually erroneously) that {me} has
> > some fixed meaning, it should mean "x1 is an
> > instance of the word "turquoise." I think your
> > change is more useful (certainly than "is a part
> > of the word"). Still not a very good solution.
>
> No, no, la'o gy Turquoise gy is not the word
> "turquoise", it's the one *named* "turquoise".
> {zoi} is for quoting foreign words/texts.
> {la'o} is for referring to someone or something
> named with a foreign word/text. {la'o} is like {la},
> {zoi} like {zo}.

Sorry; the text on the list is remarkably
unclear, and the word clearly is a part of the
quotation group. So, meeting someone with such a
name one says {coi la'o gy turquoise gy} -- or,
since {la'o} now seems to behave like {la}, {coi
gy turquoise gy} or (since {gy} is clearly tied
to {la'o}) {coi turquoise}. At some point this
looks like the existing version of what I
proposed earlier though slightly messier and
restricted to names.



posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > > {me la'o gy Turquoise gy} applies to a person
> > > named Turquoise.
> >
> > Well, assuming (usually erroneously) that {me} has
> > some fixed meaning, it should mean "x1 is an
> > instance of the word "turquoise." I think your
> > change is more useful (certainly than "is a part
> > of the word"). Still not a very good solution.
>
> No, no, la'o gy Turquoise gy is not the word
> "turquoise", it's the one *named* "turquoise".
> {zoi} is for quoting foreign words/texts.
> {la'o} is for referring to someone or something
> named with a foreign word/text. {la'o} is like {la},
> {zoi} like {zo}.
>

I suppose, then, that the introduction runs
something like {mi se cmene zoi gy turquoise gy}.


posts: 2388



> In a message dated 2005-02-22 4:03:20 PM Eastern Standard Time,
> Llambíasjjllambias2000@yahoo.com.ar writes:
>
>
> > > Most people who live here say "nu,iork" and "nu,iok" in varying mixtures;
> > > I myself am not native, and say "nu,iork" always;
> >
> > But that would be pronounced "Nwee-orck" in Lojban (commas are
> > indicative only, in this case they indicate incorrectly the
> > syllable break).
> >
>
> That sounds ludicrous. The comma determines (i.e., forces) the syllable
> break, not as a mere indicator of a syllable break. Otherwise the comma has no
> function.
>
> stevo

That is, no Lojban words differ only by a comma,
but a comma may distinguish one pronunciation of
the same word from another and presumably, when
used (outside discussions like this), indicates
the correct pronunciation. So, {nuiork} is the
same word as {nu,iork} but the second gives the
correct pronunciation. ??


posts: 1912


> So, meeting someone with such a
> name one says {coi la'o gy turquoise gy}

Yes.

> -- or,
> since {la'o} now seems to behave like {la},

No, it doesn't behave like {la} syntactically. It is like
{la} semantically in that it is used to refer to something
by its name.

>{coi
> gy turquoise gy} or (since {gy} is clearly tied
> to {la'o}) {coi turquoise}.

That won't work. If you want to use a selbri after {coi}
instead of a sumti you can say {coi me la'o gy turquoise gy}

mu'o mi'e xorxes







posts: 1912


> I suppose, then, that the introduction runs
> something like {mi se cmene zoi gy turquoise gy}.

Yes, exactly.

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- MorphemeAddict@wmconnect.com wrote:
> > In a message dated 2005-02-22 4:03:20 PM Eastern Standard Time,
> > Llambíasjjllambias2000@yahoo.com.ar writes:
> >
> > > > Most people who live here say "nu,iork" and "nu,iok" in varying mixtures;
> > > > I myself am not native, and say "nu,iork" always;
> > >
> > > But that would be pronounced "Nwee-orck" in Lojban (commas are
> > > indicative only, in this case they indicate incorrectly the
> > > syllable break).
> >
> > That sounds ludicrous. The comma determines (i.e., forces) the syllable
> > break, not as a mere indicator of a syllable break. Otherwise the comma has
> > no function.
>
> It doesn't have any function from the point of view of the morphology:
> it is never required, and it can never change a word.
>
> Its only possible function is as a guide for the reader on where
> the syllable breaks occur, but the syllable breaks are already
> determined by another rule (pairing from the left), so the comma
> can do no more than show that more clearly. If it is used against
> that rule, it is simply misleading.
>

On the evidence, this last claim is just false.
Hence the claim that syllable division is
determined by a prior rule (and so that commas
can be at best reinforcing) must also be wrong.
I assume that what the rule is is simply
"_lacking another indication_ vowel pairs
diphthong to the left if possible" and so on
(rather like the "stress on the penult" rule for
names, in fact). That the comma is omitted when
the syllabification is "well known" is another
problem, of course.


posts: 2388

While I agree with the point Lojbab is making
here, I really have to protest (in the usual,
"Jeez, why can't logical language people use
logical terminology right?" way) to the playing
around with the ambiguities in words like "name"
that is involved in his argument for the
position. The argument from the inherent
unspecificity of borrowings seems quite adequate
for the task (and the requirement to bring in
"names" merely an unfortunate artefact of the way
that {la'o} was defined and classified).



> Jorge Llambías wrote:
> > I would be in favour of eliminating the grammatical distinction
> > between them, so that {lojbab} would be a predicate meaning
> > "x1 is/are the one(s) named 'lojbab'". But there would still
> > be two different predicates for:
> >
> > "x1 is brown in colour" and "x1 is the one named Brown",
> > say {bunre} and {braun}.
> >
> >>The name of the color turquoise is "turquoise" in English.
> >
> > Yes but that's not the name of things of that color,
>
> Of course it is. The name of anything is whatever I call it. I call
> turquoise things "turquoise" I don't capitalize it, because this isn't
> german.
>
> > and not
> > everything named Turquoise need be turquoise. (It may be
> > the name of a person, for example.)
>
> True. But for nonce use, which is what we are talking about, it is good
> enough to say, "this is a concept that is named in some other language
> 'turquoise'"
>
> > {me la'o gy Turquoise gy} applies to a person named Turquoise.
>
> No. It applies to anything named "Turquoise", which includes the
> English language concept "turquoise". If that concept isn't called
> "turquoise" we would use some other way of expressing it, which would be
> the name instead.
>
> A name is just a word. So is an undefined nonce fu'ivla. They are both
> words to name a concept, though the name is a different part of
> speech. This is the exact inverse of the "Afraid-of-his-horses" thread
> where a description is the same as a name. la turkoise = la'o gy.
> turquoise gy. = thing described by the name turquoise
>
> JCB used this in ancient days thereby justifying the addition of "me"
> with "me la kraisler karce" which is exactly the same semantic
> construction as me la'o gy. turquoise gy. karce
> or "ti me la kraisler" in parallel to "ti me la'o gy. turquoise gy."
>
> "This is a Chrysler" means "this is something for which "Chrysler" names
> a predicate pertaining to that which is being referred to.
>
> In short, when working at the non-level of abstraction that is
> borrowing, the Alice in Wonderland differences between names, what
> things are called, and what they are, collapses to a singularity.
>
> > {skarnturko} applies to things of a certain bluish green or
> > greenish blue color.
>
> No it doesn't. At best, given the prefix, it applies to things
> pertaining in some unspecified way to a color NAME which is Lojbanized
> as "turko". As a nonce-fu'ivla, the word has no defined place
> structure, other than some association with the name for a concept
> "turko". If we had some sort of dikyfu'ivla thing going whereby the
> prefix rafsi made a specific kind of claim about the referent (i.e. if
> we use skarn- as a prefix then we know that the word is a color, and not
> just a term related to colors in some way), then we might be able to
> make rules defining place structures. But we can't because borrowing is
> potentially open-ended, and we need the classifiers more importantly to
> classify the name of the term, not to define its place structure.
>
> > They are not equivalent predicates by a long shot.
>
> At the level of abstraction being used by borrowing, they are. The
> English word "turquoise" is not a brivla - it is the adjectival name of
> a family of concepts. This is more clear for the English word "run"
> which is the name of a family of both noun and verb concepts, not
> necessarily having a lot to do with each other. "la'o gy run gy" is
> essentially equivalent to the English word "run", i.e. it is some
> referent of the string "zoi gy run gy", i.e. "la'e zoi gy run gy"
>
> >>>But we don't want to say "x1 is/are the one(s) named Turquoise", we
> >>>want to say that x1 has a certain color. We don't want the named
> >>>abstraction in x1.
> >>
> >>All things of the color turquoise can be "named" "turquoise".
> >
> > Yes, and anything not of that color can be named that too.
>
> Yep.
>
> > But
> > predicates are a different matter. Saying that something is
> > turquoise is different from saying that it is named "Turquoise".
> > {la'o} only works for saying the latter. An appropriate brivla
> > will let you say the former.
>
> A nonce borrowing says NOTHING in Lojban because it isn't really a
> Lojban word, and has no Lojban "predicate" meaning. It is merely a
> place holder for the name for a concept which we are metaphorically
> using as a predicate hoping that our listener who is familiar with the
> other-language can grok our intended place structure from the context.
>
> lojbab



On Wednesday 23 February 2005 07:42, Jorge "Llambías" wrote:
> --- MorphemeAddict@wmconnect.com wrote:
> > That sounds ludicrous. The comma determines (i.e., forces) the syllable
> > break, not as a mere indicator of a syllable break. Otherwise the comma
> > has no function.
>
> It doesn't have any function from the point of view of the morphology:
> it is never required, and it can never change a word.
>
> Its only possible function is as a guide for the reader on where
> the syllable breaks occur, but the syllable breaks are already
> determined by another rule (pairing from the left), so the comma
> can do no more than show that more clearly. If it is used against
> that rule, it is simply misleading.

According to chapter 3, {meiin} and {me,iin} are pronounced differently. But
since "no two Lojban words differ solely because of the presence or placement
of a comma," they are the same word.

The only time that the comma is morphologically tricky is when a brivla ends
with a diphthong that has a comma in it. The only word that I've pronounced
this way is {spatrxapio}, which I say either {spa,tr,xa,PI,o} or
{spa,TRXA,pio}. The proper syllabication is {spa,tr,XA,pio} according to
xorxes' syllable rules. camxes ignores the comma completely (which is fine
with me); valfendi does not.
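
The "pairing from the left" rule quoted above can be illustrated with a small sketch. This is a hypothetical helper, not code from camxes or valfendi. It assumes plain ASCII text, treats only a, e, i, o, u as vowels, and pairs the vowels of each run two at a time from the left without checking whether a pair is an acceptable diphthong.

```python
VOWELS = set("aeiou")


def pair_vowel_run(run):
    """Split a run of vowels into chunks of two, starting from the left."""
    return [run[i:i + 2] for i in range(0, len(run), 2)]


def mark_vowel_breaks(word):
    """Insert commas inside vowel runs only; consonants are left untouched."""
    word = word.replace(",", "")  # commas already present are ignored
    out = []
    i = 0
    while i < len(word):
        if word[i] in VOWELS:
            j = i
            while j < len(word) and word[j] in VOWELS:
                j += 1
            out.append(",".join(pair_vowel_run(word[i:j])))
            i = j
        else:
            out.append(word[i])
            i += 1
    return "".join(out)


print(mark_vowel_breaks("nu,iork"))     # nui,ork
print(mark_vowel_breaks("meiin"))       # mei,in
print(mark_vowel_breaks("spatrxapio"))  # spatrxapio
```

Under this reading {nu,iork} comes out with the break after "ui", and the final "io" of {spatrxapio} stays together as one syllable, which matches {spa,tr,XA,pio}.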

phma
--
Now I need a magnifier to find my eyeglasses!
-Les Perles de la médecine


posts: 1912


> According to chapter 3, {meiin} and {me,iin} are pronounced differently. But
> since "no two Lojban words differ solely because of the presence or placement
> of a comma," they are the same word.

In cmene, syllabification is always irrelevant, so from the point
of view of validating wordforms we can ignore the commas there.
Unless people want {nuiork} and {pier} to be classed as invalid
wordforms, and only {nu,iork} and {pi,er} as valid, but I don't
think that's a good idea.

> The only time that the comma is morphologically tricky is when a brivla ends
> with a diphthong that has a comma in it.

It could also cause problems in the penultimate syllable. If a
penultimate diphthong is split into two syllables, it could signal
the end of the brivla by itself and then the true last syllable
would drop off.

> The only word that I've pronounced
> this way is {spatrxapio}, which I say either {spa,tr,xa,PI,o} or
> {spa,TRXA,pio}. The proper syllabication is {spa,tr,XA,pio} according to
> xorxes' syllable rules. camxes ignores the comma completely (which is fine
> with me); valfendi does not.

This also causes doubts with words like {trio}, which would
be a valid fu'ivla if it had two syllables {tri,o}.

mu'o mi'e xorxes









posts: 14214

On Tue, Jan 04, 2005 at 09:49:35AM -0800, wikidiscuss@lojban.org wrote:
> The current morphology rules say this is a valid fu'ivla:
>
> tstststikptkptsrzgbdbgu

Not anymore, apparently. Nor is it a valid cmene.

I'd like to know what changed, because I would like something on
this level of ridiculousness as a name (of a non-human entity) in
the story I'm working on.

*grin*


-Robin

--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/


posts: 1912


> On Tue, Jan 04, 2005 at 09:49:35AM -0800, wikidiscuss@lojban.org wrote:
> > The current morphology rules say this is a valid fu'ivla:
> >
> > tstststikptkptsrzgbdbgu
>
> Not anymore, apparently. Nor is it a valid cmene.
>
> I'd like to know what changed, because I would like something on
> this level of ridiculousness as a name (of a non-human entity) in
> the story I'm working on.
>
> *grin*

For initial clusters, the only change has been that now
the "affricates" {ts}, {tc}, {dz} and {dj}, while allowed
as permissible initials themselves, cannot be combined with
anything else in an initial cluster. That's enough to
eliminate all indefinitely long initial clusters.
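
A rough sketch of why that restriction is enough, under some assumptions: an initial cluster is taken to be valid when every adjacent pair is one of the 48 permissible initial pairs from CLL chapter 3, and the new restriction is modelled as "no affricate pair inside a cluster longer than two". The function name and the `isolate_affricates` flag are made up for illustration and are not part of the PEG grammar.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")

# The permissible initial consonant pairs, as listed in CLL chapter 3.
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv".split()
)
AFFRICATES = {"ts", "tc", "dz", "dj"}


def is_initial_cluster(cluster, isolate_affricates=True):
    """True if every adjacent pair is a permissible initial pair.

    With isolate_affricates=True, an affricate pair may only stand alone,
    never as part of a longer cluster (the assumed form of the new rule).
    """
    if not cluster or any(c not in CONSONANTS for c in cluster):
        return False
    pairs = [cluster[i:i + 2] for i in range(len(cluster) - 1)]
    if any(p not in INITIAL_PAIRS for p in pairs):
        return False
    if isolate_affricates and len(cluster) > 2:
        return not any(p in AFFRICATES for p in pairs)
    return True


print(is_initial_cluster("tststst", isolate_affricates=False))  # True
print(is_initial_cluster("tststst"))                            # False
print(is_initial_cluster("ts"), is_initial_cluster("str"))      # True True
```

Without the restriction, {ts} and {st} chain into each other indefinitely. With it, no cluster of four or more consonants survives in this sketch, because the only pairs that could extend a three-consonant cluster are the affricates.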

Medial clusters are still somewhat up in the air as far
as I'm concerned, but the current rule (for maximum complexity)
is:

(syllabic consonant) (consonant syllabic) initial

for a maximum of seven consonants. For example {ambdmjmli}
should be accepted as a valid fu'ivla.
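
The maximum-complexity shape can be checked mechanically. The sketch below is only one reading of the rule: it assumes both parenthesised groups are optional, that the syllabic consonants are l, m, n, r, and that "initial" means one to three consonants whose adjacent pairs are permissible initial pairs, with affricates standing alone. It models only the shape stated here, not the full medial filter in camxes, and all names are illustrative.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
SYLLABIC = set("lmnr")
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv".split()
)
AFFRICATES = {"ts", "tc", "dz", "dj"}


def is_initial_cluster(cluster):
    """Adjacent pairs must be permissible initials; affricates stand alone."""
    if not cluster or any(c not in CONSONANTS for c in cluster):
        return False
    pairs = [cluster[i:i + 2] for i in range(len(cluster) - 1)]
    if any(p not in INITIAL_PAIRS for p in pairs):
        return False
    return len(cluster) <= 2 or not any(p in AFFRICATES for p in pairs)


def is_medial_cluster(cluster):
    """Match (syllabic consonant)? (consonant syllabic)? initial."""
    for with_first in (False, True):
        for with_second in (False, True):
            rest = cluster
            if with_first:
                if len(rest) < 2 or rest[0] not in SYLLABIC or rest[1] not in CONSONANTS:
                    continue
                rest = rest[2:]
            if with_second:
                if len(rest) < 2 or rest[0] not in CONSONANTS or rest[1] not in SYLLABIC:
                    continue
                rest = rest[2:]
            if is_initial_cluster(rest):
                return True
    return False


# {ambdmjmli}: the cluster between "a" and "i" splits as mb + dm + jml.
print(is_medial_cluster("mbdmjml"), len("mbdmjml"))  # True 7
print(is_medial_cluster("mbdmjmlr"))                 # False
```

Since the initial part is capped at three consonants in this reading, two optional groups of two plus an initial of three give the stated maximum of seven.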

cmene can start with medial clusters too.

mu'o mi'e xorxes









posts: 14214

Something seems to be broken.

javnrsoma

("soma rule"; made up) is a non-Lojban-word in camxes.

-Robin


posts: 1912


> Something seems to be broken.
>
> javnrsoma
>
> ("soma rule"; made up) is a non-Lojban-word in camxes.

Could you try again?

I'm not completely confident that the medial consonants filter
is working properly yet. I'm preparing a list of test-words to
add to the test suite so we can test it properly but first I'd
like to have some list discussion. (Once vowels are sorted out.)

mu'o mi'e xorxes







posts: 14214

On Wed, Feb 23, 2005 at 11:15:37AM -0800, Jorge Llambías wrote:
>
> --- Robin Lee Powell wrote:
> > Something seems to be broken.
> >
> > javnrsoma
> >
> > ("soma rule"; made up) is a non-Lojban-word in camxes.
>
> Could you try again?

Better. Thanks.

-Robin


Robert LeChevalier scripsit:

> >>You are still missing the point, it seems to me. {me la'o gy brown gy}
> >>can never mean "x1 is brown in color", not because of any polysemy
> >>of "brown" but because la'o is for names.
> >
> >I agree: it means x1 is Brown, not that x1 is brown.
>
> What's the difference, really?

It's the difference between being named "LeChevalier" and actually being
knighted.

--
John Cowan <jcowan@reutershealth.com> http://www.reutershealth.com
"But no living man am I! You look upon a woman. Eowyn I am, Eomund's daughter.
You stand between me and my lord and kin. Begone, if you be not deathless.
For living or dark undead, I will smite you if you touch him."


Pierre Abbat scripsit:

> > PEG takes y to be always unstressed.
>
> Even in cmene?

y is never stressed unless there is no other available vowel, as in
byfyg.

--
Newbies always ask: John Cowan
"Elements or attributes? http://www.ccil.org/~cowan
Which will serve me best?" http://www.reutershealth.com
Those who know roar like lions; jcowan@reutershealth.com
Wise hackers smile like tigers. --a tanka, or extended haiku


Jorge Llambías scripsit:

> No, not the same thing. {la'e zoi} uses the meaning of the
> quoted word. {la'o} only uses the word as a label.

la'e zoi gy. the red pony .gy could mean either a pony or a novel;
la'e does not nail down what type of referent is intended.

--
Do what you will, John Cowan
this Life's a Fiction jcowan@reutershealth.com
And is made up of http://www.reutershealth.com
Contradiction. --William Blake http://www.ccil.org/~cowan


posts: 1912


> Jorge Llambías scripsit:
>
> > No, not the same thing. {la'e zoi} uses the meaning of the
> > quoted word. {la'o} only uses the word as a label.
>
> la'e zoi gy. the red pony .gy could mean either a pony or a novel;
> la'e does not nail down what type of referent is intended.

Right. So {la'o} is a special case of {la'e zoi}, just as
{cmene} is a special case of {sinxa}.

mu'o mi'e xorxes






posts: 162

Jorge Llambías wrote:
>>Or in other words, as I said elsewhere, "la'o" is "la'e zoi".
>
>
> No, not the same thing. {la'e zoi} uses the meaning of the
> quoted word. {la'o} only uses the word as a label.

I'm stating intent. I have no idea what you and others have done in
actual usage. la'o was added for the purpose of type I fu'ivla, and we
considered that a name could be a "label" for the meaning of the name.
It doesn't have to be, but it can be. Since Lojban-as-designed made no
attempt to be semantically sophisticated, whatever distinction you are
seeing between the two was not one that I/we had in mind. A name has as
referent whatever the namer intends it to be, which can include the meaning
of the word, and in the case of la'o was specifically intended to do so.

If you want names to exclude the meaning of the names and be semantics
free labels, then I certainly will support adding cmavo that will do
what I intended (and understood) them to be/do.

lojbab




posts: 1912


> If you want names to exclude the meaning of the names and be semantics
> free labels, then I certainly will support adding cmavo that will do
> what I intended (and understood) them to be/do.

{le'o} and {lo'o} could have been used for the foreign word
equivalents of {le} and {lo}, in the way {la'o} corresponds
to {la}, but they're taken, so maybe {le'oi} and {lo'oi}
could be proposed for that as new members of ZOI:

lo'oi: the non-Lojban really-is
le'oi: the non-Lojban described-as
la'o: the non-Lojban named

And then:

loi'o: the non-Lojban mass really-is
lei'o: the non-Lojban mass described-as
lai'o: the non-Lojban mass named

lo'i'o: the non-Lojban set really-is
le'i'o: the non-Lojban set described-as
la'i'o: the non-Lojban set named

lo'e'o: the non-Lojban typical
le'e'o: the non-Lojban stereotypical

Better not. :|

mu'o mi'e xorxes







posts: 162

Jorge Llambías wrote:
> --- Bob LeChevalier wrote:
>
>>Jorge Llambías wrote:
>>
>>>>nu,iork and kolombi,as
>>>
>>>are the same words as {nuiork} and {kolombias}.
>>
>>Yes, except that the latter are invalid words without the comma

Clarifying, they are invalid words without the comma placed or implied
to be at a point where valid Lojban vowel breaks can occur. Thus the
only permissible interpretations of the string "nuiork" are "nU,iork"
and "nu,I,ork" and those two are DEFINED to be indistinguishable as
words, even though they have different stress and syllabification.
Because the two alternatives are considered identical words, we say
that the comma can be omitted.

> That's not what CLL says, and that was not the consensus here
> when this was discussed a few weeks ago.

I have no idea what was discussed, but I was not part of the consensus.

> Currently, the PEG parser pays no attention to commas, it will
> not reject {nuiork} or {kolombias} as invalid.

Fine. They are not invalid, but to pronounce them, you do have to stick
the commas in, and the rule of pairwise from the left does not override
the morphology restrictions on vowel pairs. If the PEG parser drops the
comma from "nu,iork", it doesn't matter which of the two valid ways
you stick the comma(s) back in.

>>The ultimate goal is to have Lojbanized names be as flexible as possible
>>while still preserving Lojban pronunciability with no stretching of a
>>Lojbanist's capability to pronounce things beyond those things that are
>>part of the regular language.
>
> That's why cmene should not have any special rules beyond requiring
> a final consonant. Anything else is stretching a Lojbanist's
> capability to pronounce things.

We seem to have differing understandings of what "no special rules" are.

>>Since I think of fu'ivla as merely
>>another kind of name, per our other discussion, though having the
>>grammar of a brivla, I would be inclined to use similar rules for Type
>>III fu'ivla as for names, possibly even with ambiguous stress since they
>>are intended to be made on the nonce and we want there to be no obstacles.
>
> Stress plays no role in cmene, but we can't allow ambiguous stress
> in fu'ivla, even Type III. That should be out of the question. Otherwise
> we run into questions like {brodrkAuti}, is that {bro,dr,KA,u ti} or
> {bro,dr,KAU,ti}? I refuse to introduce even more complex rules to deal
> with such things.

I'll buy that, if necessary, but then we should a) remove the rule that
says that the close comma is meaningless, since you've made it
meaningful, and b) use the more restrictive set of valid vowel pairings
- ai/au/ei/oi always allowed, ia/ie/ii/io/iu and ua/ue/ui/uo/uu allowed
only at the start of a syllable, iy and uy negotiably part of the latter
in wordforms that allow y. Any of the pairs allowed with apostrophe can
also occur with close-comma which is required when intended except in
alternate morphology, so "ea" as a pair always must be written "e,a" (or
"e'a")

Nora wants the added restriction, which it looks like you also want,
that a glide cannot separate two identical vowels, so both ai,i and ii,i
would not be permitted, and e'e may be allowed, but not e,e. I'll agree
in general but I'm not sure what this means for ai,ia.
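
The more restrictive set of vowel pairings proposed above can be written down as a small lookup. This is only a sketch of the proposal as worded in this message; the category labels and the function name are invented for illustration, and Nora's extra restriction on identical vowels is not modelled.

```python
# Pairs that may occur anywhere, per the proposal.
ALWAYS_ALLOWED = {"ai", "au", "ei", "oi"}
# Pairs allowed only at the start of a syllable, per the proposal.
SYLLABLE_INITIAL_ONLY = {
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
}
# Described as "negotiably" part of the syllable-initial group.
NEGOTIABLE = {"iy", "uy"}


def classify_vowel_pair(pair):
    """Return an illustrative label for a two-vowel sequence."""
    if pair in ALWAYS_ALLOWED:
        return "always allowed"
    if pair in SYLLABLE_INITIAL_ONLY:
        return "syllable-initial only"
    if pair in NEGOTIABLE:
        return "negotiable"
    return "needs comma or apostrophe"


for pair in ("ai", "io", "uy", "ea"):
    print(pair, "->", classify_vowel_pair(pair))
```

Any pair that falls through to the last label, such as "ea", would have to be written with a close-comma or an apostrophe ("e,a" or "e'a") under this proposal.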

>>For Type IV fu'ivla, I would be much stricter - those words have the
>>time to be made with thought and thus should be able to avoid any sort
>>of pitfall, approximating the rules for lujvo and cmavo in
>>restrictiveness based on pronunciation. Thus, even if we have at the
>>moment made "iglu" work as a Type IV fu'ivla, I would rather see it
>>banned because the question kept coming up whether it really worked
>>under all conditions, and the rethinking was painful. The rules for
>>wordmaking should be as painless as possible.
>
> {.iglu} doesn't contain anything that doesn't appear in cmavo
> and gismu, so it should not cause any problems with pronunciation.

The classical question was whether it passes morphologically, not
pronunciation, or whether for some following wordform XX..XX,
..igluXX..XX
can ambiguously break as
..i gluXX..XX

e.g. ".i glunanmu"

JCB argued that the stress on the "i" is sufficient to keep the "glu"
from attaching to the next word, and Nora and I and many others have
usually not been convinced.

lojbab




posts: 162

John E Clifford wrote:
> While I agree with the point Lojbab is making
> here, I really have to protest (in the usual,
> "Jeez, why can't logical language people use
> logical terminology right?" way) to the playing
> around with the ambiguities in words like "name"
> that is involved in his argument for the
> position. The argument from the inherent
> unspecificity of borrowings seems quite adequate
> for the task (and the requirement to bring in
> "names" merely an unfortunate artefact of the way
> that {la'o} was defined and classified).

If I've made myself clear to you as to the whats, whys and wherefores of
my position, feel free to tell us what it is using proper logical
terminology %^)

As the guy who designed a logical language more than a decade after
(essentially) flunking logic class, I can't pretend to be qualified.

lojbab



posts: 162

John Cowan wrote:
> Robert LeChevalier scripsit:
>
>>>>You are still missing the point, it seems to me. {me la'o gy brown gy}
>>>>can never mean "x1 is brown in color", not because of any polysemy
>>>>of "brown" but because la'o is for names.
>>>
>>>I agree: it means x1 is Brown, not that x1 is brown.
>>
>>What's the difference, really?
>
>
> It's the difference between being named "LeChevalier" and actually being
> knighted.

LeChevalier could mean that the referent is knighted, in French. So far
as I know, the knighted Jean would be called "Jean LeChevalier" just as
someone who had inherited that as a surname. (It is plausible that in
medieval times the surname was granted only to hereditary knights, but I
haven't gotten anywhere near that far back in my genealogy.)

lojbab



posts: 14214

On Wed, Feb 23, 2005 at 04:45:19PM -0500, Robert LeChevalier wrote:
> LeChevalier could mean that the referent is knighted, in French.
> So far as I know, the knighted Jean would be called "Jean
> LeChevalier" just as someone who had inherited that as a surname.

Sure, but small l, small c, and a space before the c.

-Robin, who is mostly ignoring this thread.


posts: 2388


<rlpowell@digitalkingdom.org> wrote:

> On Tue, Jan 04, 2005 at 09:49:35AM -0800,
> wikidiscuss@lojban.org wrote:
> > The current morphology rules say this is a
> valid fu'ivla:
> >
> > tstststikptkptsrzgbdbgu
>
> Not anymore, apparently. Nor is it a valid
> cmene.
>
> I'd like to know what changed, because I would
> like something on
> this level of ridiculousness as a name (of a
> non-human entity) in
> the story I'm working on.
>
> *grin*
>
So use it. It would at least finally give a
reason for actually dealing with such horrors.


posts: 1912


> >>
> >>>>nu,iork and kolombi,as
> >>>are the same words as {nuiork} and {kolombias}.
> >>
> >>Yes, except that the latter are invalid words without the comma
>
> Clarifying, they are invalid words without the comma placed or implied
> to be at a point where valid Lojban vowel breaks can occur. Thus the
> only permissible interpretations of the string "nuiork" are "nU,iork"
> and "nu,I,ork" and those two are DEFINED to be indistinguishable as
> words, even though they have different stress and syllabification.
> Because the two alternatives are considered identical words, we say
> that the comma can be omitted.

CLL doesn't just say that a comma can be omitted. It says it is
never required.

> > Currently, the PEG parser pays no attention to commas, it will
> > not reject {nuiork} or {kolombias} as invalid.
>
> Fine. They are not invalid, but to pronounce them, you do have to stick
> the commas in, and the rule of pairwise from the left does not override
> the morphology restrictions on vowel pairs. If the PEG parser drops the
> comma from "nu,iork", it doesn't matter which of the two valid ways
> you stick the comma(s) back in.

It doesn't drop them, it just pays no attention to them.
{nu,iork}, {nui,ork}, {n,u,i,o,r,k}, {,,,n,,ui,o,,r,k} are
all treated alike. Commas change nothing in the parse.
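
As a rough illustration of "commas change nothing in the parse", here is a minimal Python sketch. It models the behaviour by removing commas before comparison; that is only an equivalent way of picturing it, not how the PEG itself is written (the PEG does not drop commas, it simply never looks at them). The function name strip_commas is hypothetical.

```python
def strip_commas(word: str) -> str:
    """Return the word with every comma removed.

    Assumption: ignoring commas during the parse is modelled here
    as deleting them up front; the real PEG does not do it this way.
    """
    return word.replace(",", "")


# The four spellings mentioned above, all meant to be the same word.
variants = ["nu,iork", "nui,ork", "n,u,i,o,r,k", ",,,n,,ui,o,,r,k"]

# If commas are irrelevant, every variant collapses to one string.
normalized = {strip_commas(v) for v in variants}
```

Under this model any two spellings that differ only in comma placement are classified identically, which is the point being made.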

> > Stress plays no role in cmene, but we can't allow ambiguous stress
> > in fu'ivla, even Type III. That should be out of the question. Otherwise
> > we run into questions like {brodrkAuti}, is that {bro,dr,KA,u ti} or
> > {bro,dr,KAU,ti}? I refuse to introduce even more complex rules to deal
> > with such things.
>
> I'll buy that, if necessary, but then we should a) remove the rule that
> says that the close comma is meaningless, since you've made it
> meaningful,

Huh? How have I made it meaningful? The parser ignores all commas,
they're meaningless.

> and b) use the more restrictive set of valid vowel pairings
> - ai/au/ei/oi/ always allowed, ia/ie/ii/io/iu and ua/ue/ui/uo/uu allowed
> only at the start of a syllable, iy and uy negotiably part of the latter
> in wordforms that allow y. Any of the pairs allowed with apostrophe can
> also occur with close-comma which is required when intended except in
> alternate morphology, so "ea" as a pair always must be written "e,a" (or
> "e'a")

I don't like the idea of requiring commas, and making them equivalent
to apostrophes even less.

> Nora wants the added restriction, which it looks like you also want,
> that a glide cannot separate two identical vowels, so both ai,i and ii,i
> would not be permitted, and e'e may be allowed, but not e,e. I'll agree
> in general but I'm not sure what this means for ai,ia.

I don't particularly want any restriction for vowels, I would
be happy with allowing every sequence. What I do want is that if
we make some restriction, it should apply globally, to all words.

> > {.iglu} doesn't contain anything that doesn't appear in cmavo
> > and gismu, so it should not cause any problems with pronunciation.
>
> The classical question was whether it passes morphologically, not
> pronunciation, or whether for some following wordform XX..XX,
> .igluXX..XX
> can ambiguously break as
> .i gluXX..XX
>
> e.g. ".i glunanmu"
>
> JCB argued that the stress on the "i" is sufficient to keep the "glu"
> from attaching to the next word, and Nora and I and many others have
> usually not been convinced.

Whyever not? It is exactly the same situation as {kiglu}.

If the glottal stop was considered a consonant (one that could
not form clusters and could only appear at the beginning of
words), then {.iglu} would have gismu-form. There is nothing
problematic about it.

mu'o mi'e xorxes







posts: 2388



> Jorge Llambías wrote:
> >>Or in other words, as I said elsewhere,
> "la'o" is "la'e zoi".
> >
> >
> > No, not the same thing. {la'e zoi} uses the
> meaning of the
> > quoted word. {la'o} only uses the word as a
> label.
>
> I'm stating intent. I have no idea what you
> and others have done in
> actual usage. la'o was added for the purpose of
> type I fu'ivla, and we
> considered that a name could be a "label" for
> the meaning of the name.
> It doesn't have to be, but it can be. Since
> Lojban-as-designed made no
> attempt to be semantically sophisticated,
> whatever distinction you are
> seeing between the two was not one that I/we
> had in mind. A name has as
> referent whatever the namer intends to be,
> which can include the meaning
> of the word, and is the case of la'o was
> specifically intended to do so.
>
> If you want names to exclude the meaning of the
> names and be semantics
> free labels, then I certainly will support
> adding cmavo that will do
> what I intended (and understood) them to be/do.
>
This sort of stuff always gets my head spinning
so that I make mistakes when working through it.
Let's take a relatively clear case. {la'o gy Gone
with the Wind gy} is a name and refers to a
certain book about events around the Civil War.
{la'e la'o gy Gone with the Wind gy} refers to
whatever it is that the book refers to, probably
some of those events (there is a lot of leeway
here). {la'o gy turquoise gy} is a name and
refers to whatever it is that the person using the
name is using it to name (Grice would require
that that have something to do with turquoise, but
even then just what is pretty much up for grabs
and, since we are moving from one language to
another this restriction is considerably weakened
for, as in some Japanese ads, English words might
be used in Lojban just for their appeal as forms,
even in ignorance of their meaning). {la'e la'o
gy turquoise gy} refers to whatever turquoise
(the thing called "turquoise" by the user of the
{la'o} expression) refers to. And here the field
is pretty wide open: the stones are long life and
good luck and Lord knows what else; the concept
probably refers to the stones and the book refers
to geological processes and lapidary techniques
and the girl maybe stands for one hot night and a
round of clap. And so on. So then, {zoi gy
turquoise gy} refers to the word "turquoise" as a
linguistic or physical or ... object. {la'e zoi
gy turquoise gy} refers to whatever that word is
used to refer to and, since it is a name in
Lojban (we are assuming), whatever the person who
is using that name intends it to refer to. Which
seems to get back to what {la'o gy turquoise gy}
refers to.
I admit I had convinced myself otherwise the last
time I tried this and can't yet see where I went
wrong then, but this one feels more correct
throughout. I await some correction if this is
wrong.


posts: 162

Robin Lee Powell wrote:
> On Wed, Feb 23, 2005 at 04:45:19PM -0500, Robert LeChevalier wrote:
>
>>LeChevalier could mean that the referent is knighted, in French.
>>So far as I know, the knighted Jean would be called "Jean
>>LeChevalier" just as someone who had inherited that as a surname.
>
>
> Sure, but small l, small c, and a space before the c.

My understanding is that, in France, my family name was written all
manner of ways with and without the c and with and without the
capitalization, but in fact the capitalization is what indicates
knighthood and not merely being a horseman. My grandfather disowned one
son who dropped the Capitalized "Le" from the front of the name in part
because of the strength of the family tradition of its indication of
nobility. But I don't know more about the accuracy of the legend than that.

lojbab




posts: 1912


> I admit I had convinced myself otherwise the last
> time I tried this and can't yet see where I went
> wrong then, but this one feels more correct
> throughout. I await some correction if this is
> wrong.

For example:

la'o gy turquoise gy no'u lo mlatu cu blabi
Turquoise, the cat, is white.

ta me la'o gy turquoise gy gi'e nai ku'i me la'e zoi gy turquoise gy
She's Turquoise, but not turquoise.

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > I admit I had convinced myself otherwise the
> last
> > time I tried this and can't yet see where I
> went
> > wrong then, but this one feels more correct
> > throughout. I await some correction if this
> is
> > wrong.
>
> For example:
>
> la'o gy turquoise gy no'u lo mlatu cu blabi
> Turquoise, the cat, is white.
>
> ta me la'o gy turquoise gy gi'e nai ku'i me
> la'e zoi gy turquoise gy
> She's Turquoise, but not turquoise.
>
I'm not sure what this is meant to be. As a
counterexample it fails because it implies an
ambiguity in "turquoise" as imported into Lojban,
once as a name for a cat and once as a "name for
a color."
{lemi mlatu se cmene zoi gy turquoise gy ijenai
ku'i my me la'e zoi gy turquoise gy} either makes
the ambiguity overt or survives on some other
peculiarity of {me}.


posts: 1912


> I'm not sure what this is meant to be.

I'm trying to show that {me la'o gy turquoise gy}
is as bad a predicate for "x1 is turquoise" as
{me la blanu} would be for "x1 is blue".

{me la'e zoi gy turquoise gy} is not all that much better,
it is like {me la'e zo blanu}, but at least it allows
for "turquoise"/"blanu" not being taken as names.

The clear predicates are {blanu} and {skarnturko}
or similar. Something that does not involve names
or quoted words.

mu'o mi'e xorxes






Jorge Llambías scripsit:

> For example:
>
> la'o gy turquoise gy no'u lo mlatu cu blabi
> Turquoise, the cat, is white.
>
> ta me la'o gy turquoise gy gi'e nai ku'i me la'e zoi gy turquoise gy
> She's Turquoise, but not turquoise.

Your examples are compelling, but let's consider the example in CLL
chapter 19:

la'o dy. Goethe .dy. me la'o ly. Homo sapiens .ly.
Goethe is a Homo sapiens

Now that seems to use me la'o ... in the sense of "x1 is a Homo sapiens"
rather than "x1 is named 'Homo Sapiens'". Goethe was named 'Goethe',
not 'Homo sapiens'.

This may reflect the former (vague) use of {me} rather than anything
else, but it needs to be taken into account in clarifying exactly how
{la'o} works.

--
Not to perambulate John Cowan <jcowan@reutershealth.com>
the corridors http://www.reutershealth.com
during the hours of repose http://www.ccil.org/~cowan
in the boots of ascension. --Sign in Austrian ski-resort hotel


posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > I'm not sure what this is meant to be.
>
> I'm trying to show that {me la'o gy turquoise
> gy}
> is as bad a predicate for "x1 is turquoise" as
> {me la blanu} would be for "x1 is blue".
>
> {me la'e zoi gy turquoise gy} is not all that
> better,
> it is like {me la'e zo blanu}, but at least it
> allows
> for "turquoise"/"blanu" not being taken as
> names.
>
> The clear predicates are {blanu} and
> {skarnturko}
> or similar. Something that does not involve
> names
> or quoted words.
>
Of course, as soon as I sent this I noticed I had
forgotten the {cu} between {mlatu} and {se
cmene}. Thanks for not calling me on that.

As for the main point, it depends upon what {la'o
gy turquoise gy} — and, for that matter, {la
blanu} — name. And that is (up to social
conventions, which we don't know for Lojbanistan)
pretty much speaker's choice in the beginning.
Speaker declares that he is using a foreign word
-- without a Lojban content and with an
unspecified native content — as a name. For
what? Hopefully the context will clarify that
somewhat, but the possibilities are everywhere
from a meaningless label (as apparently is the
case with the cat) to some of the possible
objects associated semantically with the word at
home: a concept, a property (or relation), a
class of things (perhaps several more or less
separate classes even), events, social
institutions, and so on almost indefinitely. In
that sense, using a borrowed word is always a
risky business, but it has a couple of practical
advantages that may make it worthwhile to take
the risk and pay the price (in having to add
explanations to be understood): local color and
immediate access rather than having to
reconstruct the notion in Lojban. Once it has
been introduced as a name for something, {me la'o
gy turquoise gy} is going to get that thing (if
it is not a monopole) or some members of it (if
it is a monopole). And so will {me la'e zoi gy
turquoise gy}. These are bad predicates for "x1
is turquoise" (in any sense of that word: color,
stone, or whatever it may be) because we are not
given what the referent of "turquoise" is here in
such a way that we know whether it is a brivloid
intended rather than a cmenoid or conversely. As
you note, the same problem arises (in theory, at
least) with {me la blanu}, since {la blanu} can
be the name of the mass of blue things or of
blueness itself or of something that is in no way
blue, depending on the namer's choice.
As I have said elsewhere, the solution to this
problem (if it really is one — we are talking
about very occasional expressions that can always
be expanded in the case of confusion and so may
not be worth working over) is to make the
borrowings something other than names, whose
grammar makes for this line of difficulty.
Better they should be brivla, though still marked
as foreign (and thus free from any kind of
phonological restrictions). They can still serve
as names, of course, since brivla can go after
{la}, but otherwise they will have the general
meaning of "x1 is a sociosemantic relatum of the
quoted word in that word's native culture." This
is no clearer about the meaning in the crucial
case but clearer about the role that is intended
eventually (it also gets rid — one hopes — of
the overextended sense of "name," which, while
not wrong exactly is clearly misleading). We
could do this merely by changing the status of
{la'o} expressions, but there are reasons
(involving longer expressions, for example) for
doing something simpler (I suggested {iy...uy}
demarcators but that is just a frinstance of the
sort of thing to be desired).


posts: 1912


> Your examples are compelling, but let's consider the example in CLL
> chapter 19:
>
> la'o dy. Goethe .dy. me la'o ly. Homo sapiens .ly.
> Goethe is a Homo sapiens
>
> Now that seems to use me la'o ... in the sense of "x1 is a Homo sapiens"
> rather than "x1 is named 'Homo Sapiens'". Goethe was named 'Goethe',
> not 'Homo sapiens'.
>
> This may reflect the former (vague) use of {me} rather than anything
> else, but it needs to be taken into account in clarifying exactly how
> {la'o} works.

If we want to say what species Goethe belongs to, I think:

la'o dy. Goethe .dy. danlu la'o ly. Homo sapiens .ly.
Goethe is a (specimen of) Homo sapiens

would be more appropriate. With {me} we are saying that Goethe is
one of the referents of 'Homo sapiens', which might be more appropriate
in a context such as an interplanetary convention where Goethe was
the human representative, then when they ask for the position of
Homo sapiens on some issue, Goethe, i.e. Homo sapiens, would
offer their position. Then we could say {la'o dy. Goethe .dy.
cusku li'o} or {la'o ly. Homo sapiens .ly. cusku li'o}.

mu'o mi'e xorxes







posts: 2388


Way off topic, but should "turquoise" be nativized
to Lojban, what would be the base form used? My
immediate family offers all of /turkoiz/,
/turkuaz/, and /turkuois/ (a triphthong that I do
not think can be further syllabified). And all
of these occur with the variants having /trk/
in place of /turk/. Others? Criteria for deciding?


posts: 2388


wrote:

>
> --- John Cowan wrote:
> > Your examples are compelling, but let's
> consider the example in CLL
> > chapter 19:
> >
> > la'o dy. Goethe .dy. me la'o ly. Homo
> sapiens .ly.
> > Goethe is a Homo sapiens
> >
> > Now that seems to use me la'o ... in the
> sense of "x1 is a Homo sapiens"
> > rather than "x1 is named 'Homo Sapiens'".
> Goethe was named 'Goethe',
> > not 'Homo sapiens'.
> >
> > This may reflect the former (vague) use of
> {me} rather than anything
> > else, but it needs to be taken into account
> in clarifying exactly how
> > {la'o} works.
>
> If we want to say what species Goethe belongs
> to, I think:
>
> la'o dy. Goethe .dy. danlu la'o ly. Homo
> sapiens .ly.
> Goethe is a (specimen of) Homo sapiens
>
> would be more appropriate. With {me} we are
> saying that Goethe is
> one of the referents of 'Homo sapiens', which
> might be more appropriate
> in a context such as an interplanetary
> convention where Goethe was
> the human representative, then when they ask
> for the position of
> Homo sapiens on some issue, Goethe, i.e. Homo
> sapiens, would
> offer their position. Then we could say {la'o
> dy. Goethe .dy.
> cusku li'o} or {la'o ly. Homo sapiens .ly.
> cusku li'o}.
>
This seems yet another move in the fluid {me},
from "an instance of/member of/case of..." to "a
representative (instance/member/case?) of." I
wish someone would either nail this Jell-o (tm)
to the tree or admit that it is no less vague
than it used to be and change the official line accordingly.


Jorge Llambías scripsit:

> If we want to say what species Goethe belongs to, I think:
>
> la'o dy. Goethe .dy. danlu la'o ly. Homo sapiens .ly.
> Goethe is a (specimen of) Homo sapiens
>
> would be more appropriate. With {me} we are saying that Goethe is
> one of the referents of 'Homo sapiens',

Well, so he is, interplanetary convention or not. "Behind the desk
there is a Homo sapiens, a Rattus norvegicus, and innumerable instances
of Staphylococcus aureus." What is clear from this sentence is that
"me la'o ly. Homo sapiens .ly." does *not* mean "is named 'Homo sapiens'".

--
John Cowan jcowan@reutershealth.com www.ccil.org/~cowan
Female celebrity stalker, on a hot morning in Cairo:
"Imagine, Colonel Lawrence, ninety-two already!"
El Auruns's reply: "Many happy returns of the day!"


posts: 1912


> As I have said elsewhere, the solution to this
> problem (if it really is one — we are talking
> about very occasional expressions that can always
> be expanded in the case of confusion and so may
> not be worth working over) is to make the
> borrowings something other than names, whose
> grammar makes for this line of difficulty.
> Better they should be brivla, though still marked
> as foreign (and thus free from any kind of
> phonological restrictions).

If they were borrowed as brivla, then they would indeed
form a sequence with type III and type IV fu'ivla. As
names, they don't really.

> They can still serve
> as names, of course, since brivla can go after
> {la}, but other wise they will have the general
> meaning of "x1 is a sociosemantic relatum of the
> quoted word in that word's native culture."

I have no idea what a sociosemantic relatum of a word
is, but if blue things are the sociosemantic relata
of the word "blue", then yes.

mu'o mi'e xorxes






posts: 1912


> Jorge Llambías scripsit:
>
> With {me} we are saying that Goethe is
> > one of the referents of 'Homo sapiens',
>
> Well, so he is, interplanetary convention or not. "Behind the desk
> there is a Homo sapiens, a Rattus norvegicus, and innumerable instances
> of Staphylococcus aureus." What is clear from this sentence is that
> "me la'o ly. Homo sapiens .ly." does *not* mean "is named 'Homo sapiens'".

To me, it does. More precisely: "is among those named 'Homo sapiens'"

mu'o mi'e xorxes







posts: 1912


> This seems yet another move in the fluid {me},
> from "an instance of/member of/case of..." to "a
> representative (instance/member/case?) of." I
> wish someone would either nail this Jell-o (tm)
> to the tree or admit that it is no less vague
> than it used to be and change the official line accordingly.

I don't know why I bother, but anyway:

me <sumti>: x1 is/are among the referents of "<sumti>".

mu'o mi'e xorxes







posts: 1912



>
> Way off topic, but should "turquoise" be nativized
> to Lojban, what would be the base form used?

Much more on topic than the rest of this thread. At
least this has to do with morphology.

> My
> immediate family offers all of /turkoiz/,
> /turkuaz/, and /turkuois/ (a triphthong that I do
> not think can be further syllabified). And all
> of these occur with the variants having /trk/
> in place of /turk/. Others? Criteria for deciding?

It would never even occur to me that this word should be
borrowed from English, I see absolutely no reason for
doing that. But if what you're asking is which forms
would be of acceptable fu'ivla-form, then {turkoize},
{turkuaze} and {turkuoise} would all be accepted by the
current PEG. But admissible vowel clusters are still
under discussion so one or all of them may end up not
being allowed.

mu'o mi'e xorxes






posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > As I have said elsewhere, the solution to
> this
> > problem (if it really is one — we are
> talking
> > about very occasional expressions that can
> always
> > be expanded in the case of confusion and so
> may
> > not be worth working over) is to make the
> > borrowings something other than names, whose
> > grammar makes for this line of difficulty.
> > Better they should be brivla, though still
> marked
> > as foreign (and thus free from any kind of
> > phonological restrictions).
>
> If they were borrowed as brivla, then they
> would indeed
> form a sequence with type III and type IV
> fu'ivla. As
> names, they don't really.
>
> > They can still serve
> > as names, of course, since brivla can go
> after
> > {la}, but other wise they will have the
> general
> > meaning of "x1 is a sociosemantic relatum of
> the
> > quoted word in that word's native culture."
>
> I have no idea what a sociosemantic relatum of
> a word
> is, but if blue things are the sociosemantic
> relata
> of the word "blue", then yes.
>
Inter alia and in all the meanings of "blue"
(color, emotion, smut, and Lord knows what else).

But the sociosemantic relata include not just the
denotations but also the designations (the color
blue and the property blueness and the concept of
blueness and maybe even the the propositional
function "is blue") and the connotative entries;
maleness for example and (in America nowadays)
political liberalism (in the American sense) and
hypo-oxemia and ... . Borrowing is still risky,
in short.


posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > This seems yet another move in the fluid
> {me},
> > from "an instance of/member of/case of..." to
> "a
> > representative (instance/member/case?) of." I
> > wish someone would either nail this Jell-o
> (tm)
> > to the tree or admit that it is no less vague
> > than it used to be and change the official
> line accordingly.
>
> I don't know why I bother, but anyway:
>
> me <sumti>: x1 is/are among the referents of
> "<sumti>".
>
And I don't know why you can't stick to what you
specify. There is nothing here to make the case
of Goethe as a political representative of the
species a more appropriate use of {me la'o ly
Homo sapiens ly} than the simple descriptive case.


Jorge Llambías scripsit:

> To me, it does. More precisely: "is among those named 'Homo sapiens'"

So you think that "Homo sapiens" names the members of the species, individually
considered?

--
Why are well-meaning Westerners so concerned that John Cowan
the opening of a Colonel Sanders in Beijing means jcowan@reutershealth.com
the end of Chinese culture? ... We have had http://www.reutershealth.com
Chinese restaurants in America for over a century, http://www.ccil.org/~cowan
and it hasn't made us Chinese. On the contrary,
we obliged the Chinese to invent chop suey. --Marshall Sahlins


posts: 2388


wrote:

>
> --- John E Clifford wrote:
>
> >
> > Way off topic, but should "turquoise" be
> nativized
> > to Lojban, what would be the base form used?
>
> Much more on topic than the rest of this
> thread. At
> least this has to do with morphology.

Yes, we do seem to have wandered a bit. Sorry
about that, but we got there legitimately by
trying 1) to find a way to deal with some
phonologically troublesome words and then 2)
working out some consequences of these suggestions.
This is none of that.

> > My
> > immediate family offers all of /turkoiz/,
> > /turkuaz/, and /turkuois/ (a triphthong that I
> do
> > not think can be further syllabified). And
> all
> > of these occur with the variants having
> /trk/
> > in place of /turk/. Others? Criteria for
> deciding?
>
> It would never even occur to me that this word
> should be
> borrowed from English, I see absolutely no
> reason for
> doing that. But if what you're asking is which
> forms
> would be of acceptable fu'ivla-form, then
> {turkoize},
> {turkuaze} and {turkuoise} would all be
> accepted by the
> current PEG. But admissible vowel clusters are
> still
> under discussion so one or all of them may end
> up not
> being allowed.
>
I can't imagine borrowing "turquoise" either --
until I imagine trying to talk about it without
planning a word in advance. But I really can't
imagine making a permanent Lojban word of it (or
hardly any other non-Lojban word). But it has
somehow become the example here and so I want to
pursue it. The issue is not what form would be
legal (now or under certain conditions) but the
prior one: what English form would we use as a
basis for constructing the Lojban form?


posts: 1912


> And I don't know why you can't stick to what you
> specify. There is nothing here to make the case
> of Goethe as a political representative of the
> species a more appropriate use of {me la'o ly
> Homo sapiens ly} than the simple descriptive case.

Goethe the rep. votes yes, "Homo sapiens voted yes"
makes sense to me.

Goethe votes for the Republicans, "Homo sapiens voted
for the Republicans" makes no sense to me.

But it doesn't matter. This has nothing to do with
morphology, and we've already discussed and voted
for the definition of {la'o}. If people want to
change it, they should ask to re-open that section.

mu'o mi'e xorxes






posts: 1912


> Jorge Llambías scripsit:
>
> > To me, it does. More precisely: "is among those named 'Homo sapiens'"
>
> So you think that "Homo sapiens" names the members of the species,
> individually
> considered?

I think it names the members when seen as one ("Mr Member").

I don't think I would say {la'o ly Homo sapiens ly danlu
la'o ly Homo sapiens ly}.

mu'o mi'e xorxes







Jorge Llambías scripsit:

> I have written an informal description of the PEG morphology
> algorithm (especially for Pierre and Nora that wanted to see
> something more than the bare formal grammar rules). You can
> read it starting from here:

Thanks for this.

> If some parts need more clarification, please tell me.

The term "short-lujvo" should be defined on the "Morphology: rafsi"
page, but isn't.

> (BTW, I have removed cmene-rafsi and cmavo-rafsi from the
> morphology, given the underwhelming reception they got.
> I still kept Pierre's fu'ivla rafsi and my general brivla
> rafsi however, because I think they are useful and blend
> much better with the rest. With the removal of cmavo-rafsi
> I now allow V'y and y'V clusters in cmavo forms, which are
> explicitly mentioned in CLL.)

Good. I'm not thrilled with the general brivla rafsi; however, I am
leaving the extended rafsi until a later pass.

> I'm quite satisfied with the permissible consonant cluster
> restrictions as implemented. I'm not yet very happy with
> medial clusters, and I still have my doubts about vowel
> clusters in general.

I'd like to propose the following restrictions on syllables. They apply
to all types of words. I am using the terms onset, nucleus, coda in
more standard ways than your informal description does.

A nucleus is a single V, or y, or ai/ei/oi/au, or a syllabic consonant.
An onset is an initial-cluster, or h, or glide i/u, or zero. A coda is
a single consonant or zero, except in the final syllable of a cmene,
where it may contain multiple consonants (rules as yet undefined).

An initial syllable may have a consonantal or glide or zero onset, but not
h. A non-initial syllable may have a consonantal or glide or h onset,
but not zero. A glide onset may be followed only by a single V or y.
A diphthong may have only a zero or single consonant onset.
A syllabic consonant may be preceded only by a single-C onset.

This would ban non-diphthong vowel pairs and CCIV (I=i/u), as well as CIV
when initial. Affected fu'ivla are: cipnrxuazine, jinmrniobi, kriofla,
malminiata, mianma, spatr-/stagrleoxari, mandioka, kulnr-/bangrkorea,
saskrkuarka.
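
For anyone who wants to try these restrictions on a few words, here is a rough Python sketch of the rules as literally stated above. It is only one reading of the proposal, with several assumptions: the word must already be split into syllables with commas; any run of consonants is accepted as an onset cluster (permissible initial pairs are not checked); and since the cmene-final coda rules are "as yet undefined", any consonant run is accepted there. The names SYLLABLE and syllable_problems are made up for the example.

```python
import re

CONSONANTS = "bcdfgjklmnprstvxz"

# onset:   consonant run, or h (written ' or h), or glide i/u before a vowel
# nucleus: diphthong ai/ei/oi/au, single vowel or y, or syllabic l/m/n/r
# coda:    consonant run (its length is checked separately below)
SYLLABLE = re.compile(
    rf"^(?P<onset>[{CONSONANTS}]*|['h]|[iu](?=[aeiouy]))"
    rf"(?P<nucleus>ai|ei|oi|au|[aeiouy]|[lmnr])"
    rf"(?P<coda>[{CONSONANTS}]*)$"
)


def syllable_problems(word, is_cmene=False):
    """List rule violations for a comma-syllabified word (empty list = allowed)."""
    problems = []
    syllables = word.split(",")
    for index, syl in enumerate(syllables):
        match = SYLLABLE.match(syl)
        if match is None:
            problems.append(f"{syl}: cannot be read as onset + nucleus + coda")
            continue
        onset, nucleus, coda = match.group("onset", "nucleus", "coda")
        first = index == 0
        last = index == len(syllables) - 1
        single_c = len(onset) == 1 and onset in CONSONANTS
        if first and onset in ("'", "h"):
            problems.append(f"{syl}: initial syllable with h onset")
        if not first and onset == "":
            problems.append(f"{syl}: non-initial syllable with zero onset")
        if onset in ("i", "u") and len(nucleus) != 1:
            problems.append(f"{syl}: glide onset before a diphthong")
        if onset in ("i", "u") and nucleus in "lmnr":
            problems.append(f"{syl}: glide onset before a syllabic consonant")
        if len(nucleus) == 2 and not (onset == "" or single_c):
            problems.append(f"{syl}: diphthong needs zero or single-C onset")
        if nucleus in "lmnr" and not single_c:
            problems.append(f"{syl}: syllabic consonant needs single-C onset")
        if len(coda) > 1 and not (is_cmene and last):
            problems.append(f"{syl}: coda longer than one consonant")
    return problems
```

With this reading, syllable_problems("mian,ma") and syllable_problems("kri,o,fla") both report problems, matching the list of affected fu'ivla, while a cmene syllabified as "nu,iork" passes because the glide counts as the onset of the second syllable.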

--
Mark Twain on Cecil Rhodes: John Cowan
"I admire him, I freely admit it, http://www.ccil.org/~cowan
and when his time comes I shall http://www.reutershealth.com
buy a piece of the rope for a keepsake." jcowan@reutershealth.com


posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > And I don't know why you can't stick to what
> you
> > specify. There is nothing here to make the
> case
> > of Goethe as a political representative of
> the
> > species a more appropriate use of {me la'o ly
> > Homo sapiens ly} than the simple descriptive
> case.
>
> Goethe the rep. votes yes, "Homo sapiens voted
> yes"
> makes sense to me.
>
> Goethe votes for the Republicans, "Homo sapiens
> voted
> for the Republicans" makes no sense to me.
>
> But it doesn't matter. This has nothing to do
> with
> morphology, and we've already discussed and
> voted
> for the definition of {la'o}. If people want to
> change it, they should ask to re-open that
> section.
>

I should have thought that, when they did, they
decided that it was the name of a species and
thus {me la'o ly Homo sapiens ly} would mean "is
a member of the species Hs", species being a
bunch of things as much as "the three kings" is.


posts: 1912


> Jorge Llambías scripsit:
> > If some parts need more clarification, please tell me.
>
> The term "short-lujvo" should be defined on the "Morphology: rafsi"
> page, but isn't.

Oops. Fixed.

A short-lujvo is a stressed-initial-rafsi followed by a short-final-rafsi.
For example {jbopli}, or {bastygau}. It is called short because its final
rafsi is short, a single syllable: CCV or CVV with diphthong.
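
To make the shape of the final rafsi concrete, here is a small Python sketch. It only checks the short-final-rafsi part (CCV, or CVV with one of the diphthongs ai/au/ei/oi) and simply treats everything before the last three letters as the initial part; it does not validate the stressed-initial-rafsi, and it does not check that a CC pair is a permissible initial. The names SHORT_FINAL and split_short_lujvo are hypothetical.

```python
import re
from typing import Optional, Tuple

C = "[bcdfgjklmnprstvxz]"

# short-final-rafsi: CCV, or CVV where VV is a diphthong (ai/au/ei/oi)
SHORT_FINAL = re.compile(rf"^(?:{C}{{2}}[aeiou]|{C}(?:ai|au|ei|oi))$")


def split_short_lujvo(word: str) -> Optional[Tuple[str, str]]:
    """Split off a one-syllable final rafsi, or return None if there is none.

    Assumption: the part before the final rafsi is returned unchecked.
    """
    head, tail = word[:-3], word[-3:]
    if head and SHORT_FINAL.match(tail):
        return head, tail
    return None
```

For the two examples given, split_short_lujvo("jbopli") gives ("jbo", "pli") and split_short_lujvo("bastygau") gives ("basty", "gau").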

> I'd like to propose the following restrictions on syllables. They apply
> to all types of words. I am using the terms onset, nucleus, coda in
> more standard ways than your informal description does.

I think the proposal is very elegant. It seems extremely restrictive
though.

> A nucleus is a single V, or y, or ai/ei/oi/au, or a syllabic consonant.
> An onset is an initial-cluster, or h, or glide i/u, or zero. A coda is
> a single consonant or zero, except in the final syllable of a cmene,
> where it may contain multiple consonants (rules as yet undefined).
>
> An initial syllable may have a consonantal or glide or zero onset, but not
> h. A non-initial syllable may have a consonantal or glide or h onset,
> but not zero. A glide onset may be followed only by a single V or y.

Disallowing diphthongs after glide seems unmotivated, but OK.

> A diphthong may have only a zero or single consonant onset.

No h or clusters? What about things like {mu'ei}, {glauka}?

> A syllabic consonant may be preceded only by a single-C onset.
>
> This would ban non-diphthong vowel pairs and CCIV (I=i/u), as well as CIV
> when initial. Affected fu'ivla are: cipnrxuazine, jinmrniobi, kriofla,
> malminiata, mianma, spatr-/stagrleoxari, mandioka, kulnr-/bangrkorea,
> saskrkuarka.

And a very long list of cmene, too.

How would New York end up with this very strict morphology? {nujork}?

mu'o mi'e xorxes







posts: 1912


> A nucleus is a single V, or y, or ai/ei/oi/au, or a syllabic consonant.
> An onset is an initial-cluster, or h, or glide i/u, or zero. A coda is
> a single consonant or zero, except in the final syllable of a cmene,
> where it may contain multiple consonants (rules as yet undefined).
>
> An initial syllable may have a consonantal or glide or zero onset, but not
> h. A non-initial syllable may have a consonantal or glide or h onset,
> but not zero. A glide onset may be followed only by a single V or y.
> A diphthong may have only a zero or single consonant onset.
> A syllabic consonant may be preceded only by a single-C onset.

Reading more carefully, I notice that {nu,iork} is not excluded,
and similarly {smacrkoba,iu} and {tropa,iolo}. Is that right?
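
As a rough check of this reading, here is a minimal Python sketch of the rules exactly as quoted above, applied to words that have already been split into syllables by hand. The `Syllable` tuple, the function names, and the hand-made splits are illustrative assumptions, not part of the PEG grammar, and the sketch does not verify that a consonant cluster is a permissible initial.

```python
from typing import List, NamedTuple

VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")
DIPHTHONGS = {"ai", "ei", "oi", "au"}
SYLLABIC = set("lmnr")
GLIDES = {"i", "u"}


class Syllable(NamedTuple):
    onset: str
    nucleus: str
    coda: str


def onset_kind(onset: str) -> str:
    """Classify an onset as zero, h, glide, single, cluster, or invalid."""
    if onset == "":
        return "zero"
    if onset in ("h", "'"):
        return "h"
    if onset in GLIDES:
        return "glide"
    if all(c in CONSONANTS for c in onset):
        return "single" if len(onset) == 1 else "cluster"
    return "invalid"  # e.g. consonant + glide, as in CIV


def nucleus_kind(nucleus: str) -> str:
    """Classify a nucleus as simple (V or y), diphthong, syllabic, or invalid."""
    if nucleus in VOWELS or nucleus == "y":
        return "simple"
    if nucleus in DIPHTHONGS:
        return "diphthong"
    if nucleus in SYLLABIC:
        return "syllabic"
    return "invalid"


def syllable_ok(syl: Syllable, initial: bool, cmene_final: bool) -> bool:
    o, n = onset_kind(syl.onset), nucleus_kind(syl.nucleus)
    if "invalid" in (o, n):
        return False
    if initial and o == "h":          # initial syllable: no h onset
        return False
    if not initial and o == "zero":   # non-initial syllable: no zero onset
        return False
    if o == "glide" and n != "simple":
        return False
    if n == "diphthong" and o not in ("zero", "single"):
        return False
    if n == "syllabic" and o != "single":
        return False
    if not all(c in CONSONANTS for c in syl.coda):
        return False
    # Multi-consonant codas only in the final syllable of a cmene.
    return len(syl.coda) <= 1 or cmene_final


def word_ok(syllables: List[Syllable], is_cmene: bool = False) -> bool:
    last = len(syllables) - 1
    return all(
        syllable_ok(s, i == 0, is_cmene and i == last)
        for i, s in enumerate(syllables)
    )


# Hand-made splits (assumptions): nu,iork / mi-an-ma / kri-o-fla
nu_iork = [Syllable("n", "u", ""), Syllable("i", "o", "rk")]
mianma = [Syllable("mi", "a", "n"), Syllable("m", "a", "")]
kriofla = [Syllable("kr", "i", ""), Syllable("", "o", ""), Syllable("fl", "a", "")]

print(word_ok(nu_iork, is_cmene=True))  # glide onset + single V passes
print(word_ok(mianma))                  # consonant + glide onset fails
print(word_ok(kriofla))                 # zero onset in a non-initial syllable fails
```

Under these assumptions {nu,iork} passes because the glide sits alone in the onset of a non-initial syllable, which none of the quoted rules forbids.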

mu'o mi'e xorxes






posts: 162

Jorge Llambías wrote:
> But it doesn't matter. This has nothing to do with
> morphology, and we've already discussed and voted
> for the definition of {la'o}. If people want to
> change it, they should ask to re-open that section.

I think this gets to the key point. I reject that any topic should be
considered "closed" if and when it comes up for debate in later
discussing some other topic, in which we might realize that something
might not have been considered in the original discussion (I don't in
fact know what was and wasn't considered, not having any real idea how
to find anything that specific other than by searching the several
thousand messages I've received in the last year on wiki updates and
discussions, which is an intimidating prospect.)

I also have no idea what rules our fearless jatna has set down for
reopening topics.

The issue seems to be both the meaning of la'o (and la) given that they
were intended as part of the hierarchy of type N borrowings, and in
addition, whether perhaps as Cowan has suggested, it is the meaning of
"me" that has changed, possibly making "me la'o" no longer usable for
the borrowing function.

It is a morphology question not in terms of the algorithm, per se,
because the larger questions of "borrowing" and "morphology" probably
don't fit into the little categories that cmavo space was divided into
for wiki discussion purposes.

Since he's said he isn't reading the thread, I explicitly copy Robin on
this message for his direction on how to proceed with this issue (type I
and II borrowing with la'o and la, and possibly me).




posts: 14214

On Thu, Feb 24, 2005 at 07:13:16PM -0500, Robert LeChevalier wrote:
> Jorge Llambías wrote:
> >But it doesn't matter. This has nothing to do with morphology,
> >and we've already discussed and voted for the definition of
> >{la'o}. If people want to change it, they should ask to re-open
> >that section.
>
> I think this gets to the key point. I reject that any topic
> should be considered "closed" if and when it comes up for debate
> in later discussing some other topic,
snip unnecessary verbosity

It would be nice if, at some point, you read the BPFK procedures, as
well as the checkpoint page.

But I suppose that since you can rely on my correcting you when you
make shit up like this, you don't need to.

In this case the thing you're missing (among others) is on
http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Checkpoints

Far future: Pre-Rump Mega-Vote

At some point, the BPFK needs to declare itself finished with
producing cmavo definitions and whatever else it ends up doing.
IOW, it needs to get to the point where the entire group can
look at the language and say, "OK, *that* is a well-specified
language". This is not to occur until after every section has
been in a completed checkpoint. It then devolves into a rump
committee for future unforeseen emergencies.

When this blessed time period seems to have arrived, the jatna
will call for the Pre-Rump Mega-Vote. This will be a
non-time-limited mass discussion in which every single section
of the language is open for debate, to continue until consensus
minus one is reached. The goal is to iron out any outstanding
conflicts between sections. Hopefully this won't take long. The
jatna reserves the right to place time limits on "No" votes
without reasons attached, i.e. if you claim that you simply
need more time to read up, and everyone else is done, the jatna
may give you a time limit of some kind.

-Robin


posts: 2388


wrote:

>
> --- John Cowan <jcowan@reutershealth.com>
> wrote:
> > Jorge Llambías scripsit:
> >
> > > To me, it does. More precisely: "is among
> those named 'Homo sapiens'"
> >
> > So you think that "Homo sapiens" names the
> members of the species,
> > individually
> > considered?
>
> I think it names the members when seen as one
> ("Mr Member").
>
> I don't think I would say {la'o ly Homo sapiens
> ly danlu
> la'o ly Homo sapiens ly}.
>
Sheesh! Even though this is not about {lo} I
suppose I should not be surprised that another
fluid concept under xorxes' say-so should turn
up here: if {me}, can Mr. be far behind? I do not
see at the moment how Mr. Hs helps matters (nor
how it is what {la'o ly Homo sapiens ly} names, nor
how it is the member viewed as one). I do agree
that the species is not an animal of that species
(though I seem to recall that Mr. Hs would be),
but surely {ro da poi me la'o ly Homo sapiens ly
danlu la'o ly Homo sapiens ly} is correct, and
is the case in question after all.


Jorge Llambías scripsit:

> I think the proposal is very elegant. It seems extremely restrictive
> though.

The intention is not to overextend the phonotactics of "core Lojban" (that is,
the Lojban of gismu, cmavo, and lujvo made from standard rafsi).

> Disallowing diphthongs after glide seems unmotivated, but OK.

On reflection, I'm okay with those.

> > A diphthong may have only a zero or single consonant onset.
>
> No h or clusters? What about things like {mu'ei}, {glauka}?

Those are okay too. (I was half confusing diphthongs with glides, that's all.)

> And a very long list of cmene, too.

Yes.

> Reading more carefully, I notice that {nu,iork} is not excluded,
> and similarly {smacrkoba,iu} and {tropa,iolo}. Is that right?

Indeed. And if you pronounce them "-kobai,u" and "-pai,olo", no one is
going to stop you. This de facto replaces the left-pairing rule with
a right-pairing one, as Loglan has also done (for different reasons).
Since pairing distinctions don't affect word identity, that should be
no problem.

--
Ambassador Trentino: I've said enough. I'm a man of few words.
Rufus T. Firefly: I'm a man of one word: scram!
        --Duck Soup
John Cowan <jcowan@reutershealth.com>


posts: 162

Robin Lee Powell wrote:
> On Thu, Feb 24, 2005 at 07:13:16PM -0500, Robert LeChevalier wrote:
>
>>Jorge Llambías wrote:
>>
>>>But it doesn't matter. This has nothing to do with morphology,
>>>and we've already discussed and voted for the definition of
>>>{la'o}. If people want to change it, they should ask to re-open
>>>that section.
>>
>>I think this gets to the key point. I reject that any topic
>>should be considered "closed" if and when it comes up for debate
>>in later discussing some other topic,
>
> snip unnecessary verbosity
>
> It would be nice if, at some point, you read the BPFK procedures, as
> well as the checkpoint page.

I've read them several times, though not recently. Finding anything on
the tiki when I am responding to email in realtime isn't practical.

> But I suppose that since you can rely on my correcting you when you
> make shit up like this, you don't need to.

I made nothing up, and indeed what you post does not address my issue.

> In this case the thing you're missing (among others) is on
> http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Checkpoints
>
> Far future: Pre-Rump Mega-Vote

That's the point. The issue has come up NOW, in a certain context that
will likely go away whenever discussion of fu'ivla in the context of the
morphology algorithm ends. In the far future, it is unlikely that
anyone will even remember that this discussion took place.

If you reread the first line above, Jorge said we should ask to reopen
la'o, implying that he also thought it was possible to do so. I see no
procedure prior to that "far future" event for reopening anything. It
sounds like you are saying that is indeed the case, which means that I
did understand correctly, even though I think it "should" be otherwise.

In addition, I'm not sure that there is even a byfy "checkpoint" section
on the chapter of the morphology that discusses the progression of Type
I-IV borrowings, which is what the issue really is. Originally none of
the morphology was part of the checkpoint system but was a separate
subcommittee under Nora. Then out of the blue you gave us 2 weeks
notice for a vote on the PEG morphology algorithm, which raises issues
about the morphology as a whole, and that in turn brought up the side
issue on la'o.

There is no checkpoint on the morphology rules themselves, only on the
PEG algorithm (or am I mistaken?). The morphology algorithm must follow
from the morphology rules, and it seems like the reverse is taking
place, such that the rules end up being whatever the PEG algorithm
determines them to be, which is ass-backwards. (This is really Nora's
issue, and she is preparing to post on it, but it seems fit to mention
it in context of the last point).

lojbab



posts: 1912


> If you reread the first line above, Jorge said we should ask to reopen
> la'o, implying that he also thought it was possible to do so.

It never hurts to ask.

I can't believe you would want to redefine la'o though, given that
it was given basically the same definition it has in the ma'oste.

> There is no checkpoint on the morphology rules themselves, only on the
> PEG algorithm (or am I mistaken?).

What is the difference between the algorithm and the rules? Shouldn't
they be the exact same thing written in different languages/formalisms?

> The morphology algorithm must follow
> from the morphology rules, and it seems like the reverse is taking
> place, such that the rules end up being whatever the PEG algorithm
> determines them to be, which is ass-backwards.

There is now a full informal description of the algorithm, or rules,
available, so no need to panic. See:
<http://www.lojban.org/tiki/tiki-index.php?page=Informal+description+of+the+PEG+morphology+algorithm>

mu'o mi'e xorxes






posts: 1912


> Jorge Llambías scripsit:
>
> > I think the proposal is very elegant. It seems extremely restrictive
> > though.
>
> The intention is not to overextend the phonotactics of "core Lojban" (that
> is,
> the Lojban of gismu, cmavo, and lujvo made from standard rafsi).

Yes, and after a more careful reading, it is not as restrictive
as it appeared to me at first. The important restriction is on
consonant+glide+vowel, which can be circumvented either by inserting
an apostrophe or another vowel. I guess I can live with that.

> > Disallowing diphthongs after glide seems unmotivated, but OK.
>
> On reflection, I'm okay with those.

Excellent. That gives us the four triphthongs:
iai, iau, iei, ioi, uai, uau, uei, uoi
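
The list above is every glide (i, u) followed by every diphthong (ai, au, ei, oi). A small Python sketch that enumerates the combinations, with the variable names chosen only for illustration:

```python
from itertools import product

GLIDES = ("i", "u")
DIPHTHONGS = ("ai", "au", "ei", "oi")

# Each triphthong is one glide onset followed by one diphthong nucleus.
triphthongs = [glide + diphthong for glide, diphthong in product(GLIDES, DIPHTHONGS)]

print(", ".join(triphthongs))
print(len(triphthongs))
```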

> > > A diphthong may have only a zero or single consonant onset.
> >
> > No h or clusters? What about things like {mu'ei}, {glauka}?
>
> Those are okay too. (I was half confusing diphthongs with glides, that's
> all.)

I will proceed to implement these restrictions then.
What do others think about this?

mu'o mi'e xorxes






posts: 23

Thanks, Jorge, for your description of the PEG algorithm. That should help
me understand a bit better what's going on there.

However, I'm still back at square one. What I expect from a morphology
algorithm is not merely that it breaks down a sequence uniquely. I could
do that by separating into "words" at each and every letter. That does not
make it correct. In order to determine if the break-down is correct, we
need to go back to the beginning. We need to consider what the speaker (of
a valid lojban utterance) actually meant to say. If he/she has said that
utterance in conformance with lojban valid grammar, word-formation and
pronunciation rules, then we should get back out of the algorithm a
breakdown of the words that is exactly what that person intended, and no
other. Always. If two different people, intending two different
utterances, can both produce the same speech stream, and both are in
accordance with all the rules (like, in English: "I scream" and "Ice
cream"), then something is broken - EVEN IF the morphology algorithm
consistently breaks it down the same way; that would just mean that the
morphology algorithm was not exhaustive.

To check that the algorithm will always break into what the utterer meant,
we need the word *formation* rules (like: where a pause is required, where
it's allowed, where it's not allowed; the rules on lujvo building; the
correct forms of cmavo, fu'ivla and names).

To prove that there is only one possible breakdown, we need commentary on
the reasoning for each step. And that commentary should make reference to
the word formation rules to rule out other breakdowns. For example (from
the old morphology algorithm, draft 4.0): "...You want to break it
up into words. First, break at all pauses (cannot pause in the middle of a
word)."

What I want first - before I will be able to evaluate that the algorithm
looks good - is that full statement of the rules of formation of words, and
of combination of words into utterances. "A gismu must be of the form
CVC/CV or CCVCV where CC is ...", "A pause is required: after a name;
before a name unless preceded by la, lai or la'i; before a vowel-initial
word; ...". Many of these have been defined before, but some are part of
the controversy we've been having now: the valid fu'ivla forms, the valid
consonant clusters in fu'ivla and names, the valid vowel combinations in
the expanded cmavo.

Let me give an example of something where, if we had missed a rule, the
algorithm would go on its merry way and break the utterance down
consistently, but it would not be unique. Let's suppose we didn't realize
that a pause was required between a stressed cmavo and a gismu, so we never
made that rule. Therefore, the programmer(s) never added a check for it. Now
let's take 2 speakers, both following all the rules that they would
have. One wants to say "re spabrucla" with emphasis on the "re"
(/rEspabrUcla/), and the other one wants to say "respa brucla"
(/rEspabrUcla/). We feed the first one into the PEG that was created
without checking for stressed-cmavo-before-brivla. Since we're checking
for a cmavo to break off first in PEG, it does, and we get "re spabrucla";
person #1 is happy. Then we feed the second person's utterance into
PEG. It gets the same thing - "re spabrucla", and person #2 is very
unhappy. (A cry goes up from the community - "we can't have this - there
must be only one breakdown". The byfy says: "we voted in the PEG
algorithm, so what it says goes". The people say "but then how DO you say
"respa brucla" - do we have to pause???". Etc.). The parenthetical is
just for grins. The REAL problem is that the discrepancy might not be noticed.
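
The collision in this example can be reproduced with a toy model of a pause-free speech stream. This is only a sketch under simplifying assumptions: stress is written by capitalising a vowel, each vowel letter is counted as one syllable (true for these words, not in general), and the function names and the "kind" labels are hypothetical.

```python
VOWELS = "aeiou"


def stress_vowel(word: str, which: int) -> str:
    """Capitalise the vowel at index `which` in the word's list of vowels."""
    positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    i = positions[which]
    return word[:i] + word[i].upper() + word[i + 1:]


def speech_stream(words):
    """Join (text, kind) pairs into one unbroken stream with stress marks.

    kind is 'brivla' (penultimate stress), 'stressed-cmavo' (emphasised
    final vowel), or 'cmavo' (unstressed).
    """
    parts = []
    for text, kind in words:
        if kind == "brivla":
            parts.append(stress_vowel(text, -2))
        elif kind == "stressed-cmavo":
            parts.append(stress_vowel(text, -1))
        else:
            parts.append(text)
    return "".join(parts)


speaker_1 = [("re", "stressed-cmavo"), ("spabrucla", "brivla")]
speaker_2 = [("respa", "brivla"), ("brucla", "brivla")]

print(speech_stream(speaker_1))
print(speech_stream(speaker_2))
print(speech_stream(speaker_1) == speech_stream(speaker_2))
```

Both intended utterances come out as the same stream, so no segmenter working on that stream alone can recover which one the speaker meant unless a pause rule keeps one of them from being produced.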

I understand that, in working on the PEG algorithm program, you come up
with a good idea of what might be allowed (and still have a unique
breakdown) and what might not. But "can be uniquely broken down" does not
mean "should be allowed". This is where I disagree with some of what
Pierre has suggested. If the rules are too complex, no one will get them
right. As an example, take "aslinku'i". From the point of view of "can",
it obviously "can" be a fu'ivla: the "a" cannot fall off because "slinku'i"
is not a valid gismu or fu'ivla. From the point of view of a speaker - or,
even worse, a listener - this is triple-think. First thought: "the 'a'
breaks off and the rest is a ... um ... fu'ivla (because it isn't a lujvo
because: sli+nk - no, not a rafsi initial; slink - no, not a
CCVCy)". Second thought: "wait a minute - it can't be a fu'ivla because it
fails slinku'i test". Third thought: "oh - if it can't be a fu'ivla, then
I guess the 'a' might not be able to fall off". If a listener had to go
through all of this, the speaker would already be on the next topic.

A comment on the PEG algorithm itself. (Umm... how to say this?) We don't
need an algorithm to help us break down lojban text; text has unspoken
rules that there are spaces between all words except for a string of only
cmavo, where it's optional. We need the algorithm to break down a speech
stream. Of course, the algorithm will be working on a representation of a
speech stream in text, but that's not the same thing. In particular,
why-all is there a mention of a "digit"? Please - speak the digit 2, and
then transcribe what you said; you got "re", didn't you. To add "digit"
to the rules (for checking cmene, before we ever check for cmavo)
PRE-SUPPOSES a previous break-out of a cmavo. Is there a pre-PEGGER that
recognizes "re" and changes it to "2"? If so, what rules does it
use? Other mentions that pertain to text are not as destructive, but it
gives me the feeling that you are not taking into consideration that there
is a speaker and a listener involved - that it is sounds that we are
resolving. This makes me a little doubtful of results when you start
discussing valid consonant clusters and such; if they are just a string of
letters to you, and not a string of sounds, you may miss why they were not
allowed in the first place.


mi'e noras noras@cox.net
Nora LeChevalier




Nora LeChevalier scripsit:

> If two different people, intending two different
> utterances, can both produce the same speech stream, and both are in
> accordance with all the rules (like, in English: "I scream" and "Ice
> cream"), then something is broken - EVEN IF the morphology algorithm
> consistently breaks it down the same way; that would just mean that the
> morphology algorithm was not exhaustive.

This concern is absolutely legitimate, and it's true that a PEG algorithm
isn't easily used for generation, only for parsing (hence its name).
The only reasonable approach is to work on thoroughly understanding the
algorithm and then convincing yourself it's correct.

To that end, I've been urging xorxes to simplify it, removing bells and whistles.

> To check that the algorithm will always break into what the utterer meant,
> we need the word *formation* rules (like: where a pause is required, where
> it's allowed, where it's not allowed; the rules on lujvo building; the
> correct forms of cmavo, fu'ivla and names).

So far I've been trying to develop the rules for legal syllables, which xorxes
has incorporated into the algorithm. This was an area that was woefully
underspecified until now.

One area that's quite open is what should be allowed finally in a cmene.
Currently it's C, CC, CCC, C/C, C/CC, or C/CCC.

> In particular, why-all is there a mention of a "digit"? Please -
> speak the digit 2, and then transcribe what you said; you got "re",
> didn't you. To add "digit" to the rules (for checking cmene, before
> we ever check for cmavo) PRE-SUPPOSES a previous break-out of a cmavo.
> Is there a pre-PEGGER that recognizes "re" and changes it to "2"?
> If so, what rules does it use?

In fact, "2" can also occur in the written form of names, as in the given
example "2005nan", which is just a way of writing "renonomunan".
However, "re'i" cannot be written "2'i". So "2" is equivalent to "re"
in some uses but not others.

--
John Cowan jcowan@reutershealth.com http://www.ccil.org/~cowan
Is it not written, "That which is written, is written"?


On Thu, 24 Feb 2005 18:38:52 -0800 (PST), Jorge Llambías
<jjllambias2000@yahoo.com.ar> wrote:
>
> Excellent. That gives us the four triphthongs:
> iai, iau, iei, ioi, uai, uau, uei, uoi

For certain large values of "four".

zo'o mu'o mi'e .filip.
--

Philip Newton <philip.newton@gmail.com>



posts: 1912


> To check that the algorithm will always break into what the utterer meant,
> we need the word *formation* rules (like: where a pause is required, where
> it's allowed, where it's not allowed; the rules on lujvo building; the
> correct forms of cmavo, fu'ivla and names).

OK, I'll try to do that, but I don't think much has changed in
that respect.

I took it as a given that we all know the formation rules, and
that what was missing was an algorithm to break an arbitrary
string into words. I will try to write down the formation rules
as you want, but they are the ones you are already familiar with.

> I understand that, in working on the PEG algorithm program, you come up
> with a good idea of what might be allowed (and still have a unique
> breakdown) and what might not. But "can be uniquely broken down" does not
> mean "should be allowed".

Agreed. That's why I commented profusely on the bits where the algorithm
is more permissive than the traditional rules.

I have removed now the possibility of having a finally stressed cmavo
not followed by a pause. That was unambiguous, but probably the
rules were more complicated than they're worth.

I still think it's useful to be able to not pause after Cy cmavo
in many cases, so I will try to sort out exactly which cases
don't require a pause.

> This is where I disagree with some of what
> Pierre has suggested. If the rules are too complex, no one will get them
> right. As an example, take "aslinku'i". From the point of view of "can",
> it obviously "can" be a fu'ivla: the "a" cannot fall off because "slinku'i"
> is not a valid gismu or fu'ivla. From the point of view of a speaker - or,
> even worse, a listener - this is triple-think. First thought: "the 'a'
> breaks off and the rest is a ... um ... fu'ivla (because it isn't a lujvo
> because: sli+nk - no, not a rafsi initial; slink - no, not a
> CCVCy)". Second thought: "wait a minute - it can't be a fu'ivla because it
> fails slinku'i test". Third thought: "oh - if it can't be a fu'ivla, then
> I guess the 'a' might not be able to fall off". If a listener had to go
> through all of this, the speaker would already be on the next topic.

That's a very good point. Type IV fu'ivla are not meant to be
made on the fly though, so you just learn the word aslinku'i without
doing any analysis.

The rule I have used for fu'ivla is basically "any string of syllables
without y, with penultimate stress, that can't break as
cmavo and lujvo even when preceded by a cmavo". That's the
traditional definition. If we are to exclude {aslinku'i}, we need
to add more rules. I'm neither for nor against at this point.

> In particular,
> why-all is there a mention of a "digit"? Please - speak the digit 2, and
> then transcribe what you said; you got "re", didn't you. To add "digit"
> to the rules (for checking cmene, before we ever check for cmavo)
> PRE-SUPPOSES a previous break-out of a cmavo. Is there a pre-PEGGER that
> recognizes "re" and changes it to "2"?

No, there's no pre-PEG.

I agree recognizing digits is just fancy stuff, but it doesn't cost
much, and it is a frequent convention in representing text, so...
Digits are allowed both in cmene (used sometimes as in {la 2005nan.})
and as cmavo (as in {li 2005}). They are not allowed in brivla. They are
dealt with as the corresponding syllable in both cases. They count
as _unstressed_ syllables, so {12broda} will break in the same
way as {parebroda}.

> Other mentions that pertain to text are not as destructive, but it
> gives me the feeling that you are not taking into consideration that there
> is a speaker and a listener involved - that it is sounds that we are
> resolving. This makes me a little doubtful of results when you start
> discussing valid consonant clusters and such; if they are just a string of
> letters to you, and not a string of sounds, you may miss why they were not
> allowed in the first place.

I don't think that's the case. I am very much taking into
account the sounds. That's why tctctctc bothered me as a
permissible initial for example, or why disallowing {mz}
while allowing {mj} makes so little sense to me. It's not
the letters, it's the sounds.

mu'o mi'e xorxes








posts: 1912


> So far I've been trying to develop the rules for legal syllables, which
> xorxes
> has incorporated into the algorithm. This was an area that was woefully
> underspecified until now.

I haven't fully incorporated them yet, but I'm in the process of doing it.

One thing about that: I'm planning to disallow a syllable that starts
with a glide following directly after a diphthong. You may or may not
have intended to include this restriction but I think it was not
made explicit. For example, {aiia} will not be allowed because it's
too similar to {aia}. {auia} also will be disallowed. What do you think?

In fact the only allowed vocalic strings will be of the form
{I?VIV...IVIVI?}, i.e. any number of alternating vowels and glides,
but no two vowels ever adjacent and no two glides ever adjacent.
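
The pattern {I?VIV...IVIVI?} can be read as a regular expression. The sketch below is one literal reading, assuming V is any of a/e/i/o/u, I is i or u acting as a glide, and the string contains no y or apostrophe; the names are hypothetical.

```python
import re

# Optional leading glide, a vowel, any number of glide+vowel pairs,
# and an optional trailing glide.
VOCALIC = re.compile(r"[iu]?[aeiou](?:[iu][aeiou])*[iu]?")


def is_allowed_vocalic_string(s: str) -> bool:
    """True if the whole string alternates vowels and glides as described."""
    return VOCALIC.fullmatch(s) is not None


for candidate in ("aia", "aiia", "auia", "iai", "ae"):
    print(candidate, is_allowed_vocalic_string(candidate))
```

Because i and u can fill either role, the matcher has to try both readings; {aiia} and {auia} fail under every reading, since the final a cannot be a glide.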

mu'o mi'e xorxes








Jorge Llambías scripsit:

> One thing about that: I'm planning to disallow a syllable that starts
> with a glide following directly after a diphthong. You may or may not
> have intended to include this restriction but I think it was not
> made explicit. For example, {aiia} will not be allowed because it's
> too similar to {aia}. {auia} also will be disallowed. What do you think?

Agreed.

> In fact the only allowed vocalic strings will be of the form
> {I?VIV...IVIVI?}, i.e. any number of alternating vowels and glides,
> but no two vowels ever adjacent and no two glides ever adjacent.

Sounds right to me.

--
On the Semantic Web, it's too hard to prove you're not a dog. --Bill de hOra
John Cowan  jcowan@reutershealth.com  http://www.ccil.org/~cowan


posts: 23

At 05:12 AM 2/25/05 -0800, jorge wrote:

>--- Nora LeChevalier wrote:
snip
> > This is where I disagree with some of what
> > Pierre has suggested. If the rules are too complex, no one will get them
> > right. As an example, take "aslinku'i". From the point of view of "can",
> > it obviously "can" be a fu'ivla: the "a" cannot fall off because
> "slinku'i"
> > is not a valid gismu or fu'ivla. From the point of view of a speaker -
> or,
> > even worse, a listener - this is triple-think. First thought: "the 'a'
> > breaks off and the rest is a ... um ... fu'ivla (because it isn't a lujvo
> > because: sli+nk - no, not a rafsi initial; slink - no, not a
> > CCVCy)". Second thought: "wait a minute - it can't be a fu'ivla because it
> > fails slinku'i test". Third thought: "oh - if it can't be a fu'ivla, then
> > I guess the 'a' might not be able to fall off". If a listener had to go
> > through all of this, the speaker would already be on the next topic.
>
>That's a very good point. Type IV fu'ivla are not meant to be
>made on the fly though, so you just learn the word aslinku'i without
>doing any analysis.
>
>The rule I have used for fu'ivla is basically "any string of syllables
>without y, with penultimate stress, that can't break as
>cmavo and lujvo even when preceded by a cmavo". That's the
>traditional definition. If we are to exclude {aslinku'i}, we need
>to add more rules. I'm neither for nor against at this point.

snip
We need a formal definition of fu'ivla. You are using essentially the
definition from CLL, which is slightly flawed:
"... must not be gismu or lujvo, or any combination of cmavo,
gismu, and lujvo; furthermore, a fu'ivla with a CV cmavo joined to the
front of it must not have the form of a lujvo."

Note that the definition as given does NOT exclude "catci'ile" from being a
fu'ivla. You don't (nor does CLL) mention that a fu'ivla can't break up
into a cmavo plus a *fu'ivla*. If you don't have careful definitions of
what limits the word-forms, then your programming may miss something. I
can't comment further on anything you may be missing because the
"basically" implies you've left a lot out (like needing a consonant
cluster, having penultimate stress, etc.)


--
mi'e noras noras@cox.net
Nora LeChevalier




posts: 1912


> At 05:12 AM 2/25/05 -0800, jorge wrote:
> >The rule I have used for fu'ivla is basically "any string of syllables
> >without y, with penultimate stress, that can't break as
> >cmavo and lujvo even when preceded by a cmavo". That's the
> >traditional definition. If we are to exclude {aslinku'i}, we need
> >to add more rules. I'm neither for nor against at this point.
>
> snip
> We need a formal definition of fu'ivla. You are using essentially the
> definition from CLL, which is slightly flawed:
> "... must not be gismu or lujvo, or any combination of cmavo,
> gismu, and lujvo; furthermore, a fu'ivla with a CV cmavo joined to the
> front of it must not have the form of a lujvo."
>
> Note that the definition as given does NOT exclude "catci'ile" from being a
> fu'ivla. You don't (nor does CLL) mention that a fu'ivla can't break up
> into a cmavo plus a *fu'ivla*.

You're right, I didn't mention it here, but I did take it into
account in the algorithm. Here is the strict definition
(taken from
<http://www.lojban.org/tiki/tiki-index.php?page=Morphology%3A+fu%27ivla>)

A string begins with a fu'ivla if it does not begin with a cmavo
or a rafsi-string or a slinku'i, it begins with a permissible initial,
and consists of any number of unstressed-syllables (possibly none)
followed by one stressed-syllable and then a final-syllable

A rafsi-string consists of any number of y-less-rafsi (possibly none)
followed by a gismu, a CVV-final-rafsi, or a stressed-y-less-rafsi
and a short-final-rafsi

A slinku'i consists of a consonant followed by a rafsi-string.

> If you don't have careful definitions of
> what limits the word-forms, then your programming may miss something.

I think I have used the careful definitions we all know.

> can't comment further on anything you may be missing because the
> "basically" implies you've left a lot out (like needing a consonant
> cluster, having penultimate stress, etc.)

I did mention the penultimate stress. The consonant cluster
is not necessary for the definition, it is a consequence of
not being decomposable in cmavo plus something else.

mu'o mi'e xorxes






posts: 14214

On Thu, Feb 24, 2005 at 11:23:16PM -0500, Nora LeChevalier wrote:
> Always. If two different people, intending two different
> utterances, can both produce the same speech stream, and both are
> in accordance with all the rules (like, in English: "I scream" and
> "Ice cream"), then something is broken - EVEN IF the morphology
> algorithm consistently breaks it down the same way; that would
> just mean that the morphology algorithm was not exhaustive.

I'm sorry, that just doesn't make any sense to me.

If the morphology algorithm always breaks the phrase down one way,
and the morphology algorithm is the standard for correctness, then
any other way to break it down is merely wrong.

I must be missing something, because what you just said sounds to me
like, "Well, if you say "mi klama zarci" and I hear "mikla mazarci",
then the morphology algorithm is broken!". I just don't get what
you're trying to say.

-Robin


posts: 14214

On Thu, Feb 24, 2005 at 06:29:33PM -0800, Jorge Llambías wrote:
>
> --- Robert LeChevalier wrote:
> > If you reread the first line above, Jorge said we should ask to
> > reopen la'o, implying that he also thought it was possible to do
> > so.
>
> It never hurts to ask.

The short answer to this issue:

We've re-opened sections due to observed conflicts more than once in
the past, and the procedures specifically deal with this issue (and
have since late 2003).

The long answer will have to wait; I am so mad at Bob and Nora that
I'm literally shaking as I type.

-Robin


posts: 1912


This is John's beautiful definition of syllables (with a couple
of small adjustments as approved by him):

- A nucleus is a single V, or y, or ai/ei/oi/au, or a syllabic consonant.
- An onset is an initial-cluster, or a single consonant, or h,
or glide i/u, or zero.
- A coda is a single consonant or zero.

- An initial syllable may have a consonantal or glide or zero onset,
but not h.
- A non-initial syllable may have a consonantal or glide or h onset,
but not zero.
- A syllabic consonant may be preceded only by a single-C onset, and
followed only by a zero coda.
- A syllable with ai/ei/oi/au nucleus and zero coda cannot be followed
by a syllable with glide onset.
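
The rules above can be sketched as a small checker. This is only an illustration, not the PEG itself: it takes a word already split into syllables (h written as ' or h), and the names (parse_syllable, is_valid_word) are invented for the example. The initial-pair table is assumed to match the CLL list, longer onsets are accepted if every adjacent pair is in that table, and "single-C onset" is read as "exactly one consonant".

```python
import re

CONS = "bcdfgjklmnprstvxz"
# Assumption: permissible initial pairs as listed in CLL.
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr jb jd jg jm jv "
    "kl kr ml mr pl pr sf sk sl sm sn sp sr st tc tr ts vl vr xl xr "
    "zb zd zg zm zv".split()
)
FALLING = {"ai", "ei", "oi", "au"}

# onset: consonant(s), h, glide i/u before a vowel, or zero
# nucleus: ai/ei/oi/au, a single vowel or y, or a syllabic consonant
# coda: a single consonant or zero
SYLLABLE = re.compile(
    rf"(?P<onset>[{CONS}]+|'|[iu](?=[aeiouy])|)"
    rf"(?P<nucleus>ai|ei|oi|au|[aeiouy]|[lmnr])"
    rf"(?P<coda>[{CONS}]?)"
)


def parse_syllable(text):
    """Return (onset, nucleus, coda), or None if the syllable is ill-formed."""
    match = SYLLABLE.fullmatch(text.replace("h", "'"))
    if match is None:
        return None
    onset, nucleus, coda = match.group("onset", "nucleus", "coda")
    # A multi-consonant onset must be built from permissible initial pairs.
    if len(onset) > 1 and any(
        onset[i:i + 2] not in INITIAL_PAIRS for i in range(len(onset) - 1)
    ):
        return None
    # A syllabic consonant needs a single-C onset and a zero coda.
    if nucleus in "lmnr" and (len(onset) != 1 or onset not in CONS or coda):
        return None
    return onset, nucleus, coda


def is_valid_word(syllables):
    """Apply the word-level rules to a list of syllable strings."""
    previous = None
    for index, text in enumerate(syllables):
        syllable = parse_syllable(text)
        if syllable is None:
            return False
        onset, _, _ = syllable
        if index == 0 and onset == "'":
            return False  # an initial syllable may not have an h onset
        if index > 0 and onset == "":
            return False  # a non-initial syllable may not have a zero onset
        if previous is not None:
            _, prev_nucleus, prev_coda = previous
            # ai/ei/oi/au with zero coda cannot precede a glide onset
            if prev_nucleus in FALLING and prev_coda == "" and onset in ("i", "u"):
                return False
        previous = syllable
    return True
```

Under these assumptions is_valid_word(["ie"]) is accepted without any glottal stop, while a syllable such as "cion" (consonant plus glide onset) or "tark" (two-consonant coda) is rejected outright.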

Now, given that, it is clear that a glide onset is in many respects
like a consonant. This suggests that it should not be necessary to
require a glottal stop in front of it when in initial position. If
we don't require such glottal stops, then we have that {ieieie} for
example will parse as {ie ie ie}. This has the advantage that
cmavo-forms won't admit long strings of vowels unless separated by h,
without the need to forbid them just because (the longest string
allowed will be the initial triphthongs, otherwise only single
vowels or ai/ei/oi/au between h's).

It also has the advantage of making the not so natural
glottal+glide cluster (at least less natural than
glottal+vowel) not obligatory.

Should I implement it like that?

mu'o mi'e xorxes







Robin Lee Powell scripsit:

> If the morphology algorithm always breaks the phrase down one way,
> and the morphology algorithm is the standard for correctness, then
> any other way to break it down is merely wrong.

That's fine when we have accepted the morphology algorithm, but right
now we are trying to see if it's correct.

In particular, that means that for each sequence of valid words that
we create and glue together (with stress and pause marked explicitly),
the morphological parser always recovers the original sequence and not
some alternative. The PEG grammar cannot be mechanically checked to
see whether this happens or not: only close reasoning supplemented with
exhaustive testing can do that.

What we need now is a big test bed of sequences of valid Lojban words
with stress and pause properly marked.
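
A round-trip check of this kind might be sketched as follows. The harness takes any segmenter as a parameter; the two stub segmenters are hypothetical stand-ins for a real morphological parser and exist only to show what a failure looks like.

```python
def find_roundtrip_failures(segment, sentences):
    """Glue each word sequence into one stream, re-segment it, and
    return (expected, actual) pairs wherever the result differs."""
    failures = []
    for words in sentences:
        stream = "".join(words)      # no spaces; stress stays marked by capitals
        recovered = segment(stream)
        if recovered != list(words):
            failures.append((list(words), recovered))
    return failures


# Hypothetical stubs, not real parsers: each just looks up a canned answer.
def stub_correct(stream):
    return {"miklAmazArci": ["mi", "klAma", "zArci"]}.get(stream)


def stub_broken(stream):
    return {"miklAmazArci": ["mikla", "mazarci"]}.get(stream)


corpus = [["mi", "klAma", "zArci"]]
print(find_roundtrip_failures(stub_correct, corpus))
print(find_roundtrip_failures(stub_broken, corpus))
```

An empty failure list over a large corpus is evidence, not proof, which is why the close reasoning is still needed alongside it.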

> I must be missing something, because what you just said sounds to me
> like, "Well, if you say "mi klama zarci" and I hear "mikla mazarci",
> then the morphology algorithm is broken!". I just don't get what
> you're trying to say.

No, it means that if we feed the morphological parser "miklAmazArci"
and it parses it as "mikla mazarci", it's broken.

--
John Cowan www.reutershealth.com www.ccil.org/~cowan jcowan@reutershealth.com
Arise, you prisoners of Windows / Arise, you slaves of Redmond, Wash,
The day and hour soon are coming / When all the IT folks say "Gosh!"
It isn't from a clever lawsuit / That Windowsland will finally fall,
But thousands writing open source code / Like mice who nibble through a wall.
--The Linux-nationale by Greg Baker


posts: 162

Robin Lee Powell wrote:
> The long answer will have to wait; I am so mad at Bob and Nora that
> I'm literally shaking as I type.

If every time I decide to participate I get you that angry, perhaps it
is better if I shut up and go away for another 6 months. At the moment
your health and good nature are more important than my opinions about
the language, which I won't (and can't) change just because they upset
you. But if I *had* been actively participating the last several
months, it seems likely you would have blown a gasket, and I don't want
that to happen.

lojbab



posts: 14214

I want to note that I've posted this all before, several months ago.

On Fri, Feb 25, 2005 at 03:06:05PM -0500, John Cowan wrote:
> What we need now is a big test bed of sequences of valid Lojban
> words with stress and pause properly marked.

My morphological testing suite is 206,677 words long, or 39,368
lines.

You thought, perhaps, that there wasn't one?

Of those, the first, umm, 5K lines or so all have stress indicated;
the rest is my usual test corpus.

The test suite:

http://teddyb.org/~rlpowell/hobbies/lojban/grammar/morph_test_sentences.txt

The output is currently generated by comparing camxes and valfendi
and printing out only those cases where they disagree.

Many lines were commented out; this is because xorxes believed that
those lines indicated known disagreements between camxes and
valfendi, and hence were "noise" in the output.

I am currently running a test with all of the commented lines
removed. The results are being sent to:

http://teddyb.org/~rlpowell/hobbies/lojban/grammar/rats/morph_out.txt

This testing method assumes that anything that valfendi and camxes
both agree on is OK.

Oh, and the output is normalized to what valfendi produces, which
is very idiosyncratic. brivla are in parens, cmene end in ., cmavo
are prepended with -, and non-Lojban has >..< around it.
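
For readers who want to picture the comparison, here is a rough sketch of the idea, not the actual test scripts. It assumes each parser returns a list of (class, word) pairs; the function names and the stub parsers are invented for the example, and only the four markers described above are modelled.

```python
# Markers as described above: (prefix, suffix) per word class.
MARKERS = {
    "brivla": ("(", ")"),
    "cmene": ("", "."),
    "cmavo": ("-", ""),
    "non-lojban": (">", "<"),
}


def normalize(tokens):
    """Render (class, word) pairs in the valfendi-style notation."""
    return " ".join(
        MARKERS[kind][0] + word + MARKERS[kind][1] for kind, word in tokens
    )


def disagreements(lines, parser_a, parser_b):
    """Return (line, output_a, output_b) only where the parsers differ."""
    differing = []
    for line in lines:
        out_a = normalize(parser_a(line))
        out_b = normalize(parser_b(line))
        if out_a != out_b:
            differing.append((line, out_a, out_b))
    return differing


# Hypothetical stub parsers that agree on one line and differ on another.
def stub_a(line):
    return {"mi klama": [("cmavo", "mi"), ("brivla", "klama")],
            "trio": [("non-lojban", "trio")]}[line]


def stub_b(line):
    return {"mi klama": [("cmavo", "mi"), ("brivla", "klama")],
            "trio": [("brivla", "trio")]}[line]


for row in disagreements(["mi klama", "trio"], stub_a, stub_b):
    print(row)
```

As noted above, this only surfaces cases where the two implementations differ; a line on which both are wrong in the same way passes silently.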

-Robin


posts: 14214

On Fri, Feb 25, 2005 at 04:28:12PM -0500, Bob LeChevalier wrote:
>Robin Lee Powell wrote:
>>The long answer will have to wait; I am so mad at Bob and Nora
>>that I'm literally shaking as I type.
>
>If every time I decide to participate I get you that angry,
>perhaps it is better if I shut up and go away for another 6
>months.

....

That would not be my preference, no.

>But if I *had* been actively participating the last several
>months, it seems likely you would have blown a gasket, and I don't
>want that to happen.

Everything that you said that upset me is a *direct* result of your
lack of participation. You've got it backwards: the less you
participate, the more your participation after an absence upsets me.

This actually goes *in* *general*: I tend to be upset if people are
going to wade in to something we've been working on for months and
start screaming bloody murder; I'm more inclined to listen if they
have done something for the project lately, and/or have concrete
solutions to concrete problems.

Longer response to other mail coming later this afternoon, I hope.

-Robin







posts: 14214

On Thu, Feb 24, 2005 at 08:56:56PM -0500, Robert LeChevalier wrote:
> Robin Lee Powell wrote:
> >On Thu, Feb 24, 2005 at 07:13:16PM -0500, Robert LeChevalier
> >wrote:
> >
> >>Jorge Llambías wrote:
> >>
> >>>But it doesn't matter. This has nothing to do with morphology,
> >>>and we've already discussed and voted for the definition of
> >>>{la'o}. If people want to change it, they should ask to re-open
> >>>that section.
> >>
> >>I think this gets to the key point. I reject that any topic
> >>should be considered "closed" if and when it comes up for debate
> >>in later discussing some other topic,
> >
> >snip unnecessary verbosity
> >
> >It would be nice if, at some point, you read the BPFK procedures,
> >as well as the checkpoint page.
>
> I've read them several times, though not recently.

I ask that you read them again, as the last change to the BPFK
Procedures page was Sun 14 of Mar, 2004 09:12 UTC. Specifically,
the last change to the section in question (see below) was Thu 23 of
Oct, 2003 23:22 UTC.

> Finding anything on the tiki when I am responding to email in
> realtime isn't practical.

It seems to be practical for everyone else. Is there anything I can
help you with that might make it better for you? If the issue is
things being very slow, you might want to consider getting a new
computer. I recall yours as being quite old, and that was two years
ago. Computers are extremely cheap these days.

> >In this case the thing you're missing (among others) is on
> >http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Checkpoints
> >
> > Far future: Pre-Rump Mega-Vote
>
> That's the point. The issue has come up NOW, in a certain context
> that will likely go away whenever discussion of fu'ivla in the
> context of the morphology algorithm ends.

Yup.

> In the far future, it is unlikely that anyone will even remember
> that this discussion took place.

It's possible, however you may want to note the comments at the top
of pages like
http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+Quotations
which are designed to handle exactly this sort of problem.

> If you reread the first line above, Jorge said we should ask to
> reopen la'o, implying that he also thought it was possible to do
> so. I see no procedure prior to that "far future" event for
> reopening anything.

Because you haven't read the procedures lately, let me direct you
to the relevant comment in:

http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Procedure

which is:

Please note that a particular section can be opened more than
once. In particular, a future checkpoint can re-open a section
if a problem with the previously approved proposal is
discovered.

> In addition, I'm not sure that there is even a byfy "checkpoint"
> section on the chapter of the morphology that discusses the
> progression of Type I-IV borrowings, which is what the issue
> really is.

BPFK checkpoints are, in general, against sets of cmavo. There are
special cases for morphology, gismu in general, and miscellaneous
issues.

> Originally none of the morphology was part of the checkpoint
> system but was a separate subcommittee under Nora.

Because she hadn't been able to work on that subcommittee for
something like three years, we moved that work into the checkpoint
system.

I got little response to my emails to her about this issue, and both
of you ignored my mail of last week asking for a response to that
fact.

> Then out of the blue

This has certainly not been out of the blue. We've been discussing
this extensively for months on wikidiscuss. xorxes and Pierre have
been working their *asses* off on this.

In the past three years, and most especially in the last six months
or so, work on this committee has proceeded this far without input,
positive or negative, from Nora, at least that I'm aware of. It's
probably too late to rein in the progress. While more input would
be appreciated, the hard work done so far should not be negated.

> you gave us 2 weeks notice for a vote

As I've said several times in wikidiscuss, I thought we were done
because the people who were actually doing the work had mostly
stopped arguing.

> on the PEG morphology algorithm, which raises issues about the
> morphology as a whole,

The intention is that the PEG algorithm *is* the morphology. I
thought this was quite clear.

We need a formalized morphology, in the same way we have a
formalized grammar. Everyone who's studied both agrees that PEG is
much better than YACC for that purpose.

If you have a problem with PEG, let's trot that out and discuss it.

> There is no checkpoint on the morphology rules themselves, only on
> the PEG algorithm (or am I mistaken).

The whole point here is that the PEG algorithm *IS* the morphology
rules, in exactly the same way that the YACC algorithm *is* the
grammar rules.

> The morphology algorithm must follow from the morphology rules,
> and it seems like the reverse is taking place, such that the rules
> end up being whatever the PEG algorithm determines them to be,
> which is ass-backwards.

You yourself have said, many many times, that the YACC is the most
authoritative source of the Lojban grammar. You never say that the
YACC must subordinate to "the grammar rules", whatever those might
be. How is this different?

-Robin


On Friday 25 February 2005 14:54, Jorge "Llambías" wrote:
> This is John's beautiful definition of syllables (with a couple
> of small adjustments as approved by him):
>
> - A nucleus is a single V, or y, or ai/ei/oi/au, or a syllabic consonant.
> - An onset is an initial-cluster, or a single consonant, or h,
> or glide i/u, or zero.
> - A coda is a single consonant or zero
>
> - An initial syllable may have a consonantal or glide or zero onset,
> but not h.
> - A non-initial syllable may have a consonantal or glide or h onset,
> but not zero.
> - A syllabic consonant may be preceded only by a single-C onset, and
> followed only by a zero coda.
> - A syllable with ai/ei/oi/au nucleus and zero coda cannot be followed
> by a syllable with glide onset.

Can a syllable have both a consonantal and a glide onset, such as {cionmau}?
What about doubly-closed syllables, such as {tarksako}?

phma
--
Ils pensent que j'ai un cancer du thé russe...
-Les Perles de la médecine


posts: 1912


> On Friday 25 February 2005 14:54, Jorge "Llambías" wrote:
> > This is John's beautiful definition of syllables (with a couple
> > of small adjustments as approved by him):
> >
> > - A nucleus is a single V, or y, or ai/ei/oi/au, or a syllabic consonant.
> > - An onset is an initial-cluster, or a single consonant, or h,
> > or glide i/u, or zero.
> > - A coda is a single consonant or zero
> >
> > - An initial syllable may have a consonantal or glide or zero onset,
> > but not h.
> > - A non-initial syllable may have a consonantal or glide or h onset,
> > but not zero.
> > - A syllabic consonant may be preceded only by a single-C onset, and
> > followed only by a zero coda.
> > - A syllable with ai/ei/oi/au nucleus and zero coda cannot be followed
> > by a syllable with glide onset.
>
> Can a syllable have both a consonantal and a glide onset, such as {cionmau}?

According to these rules, no. The glide onset, like the h onset, cannot
combine with anything else. It is an open question whether a glide onset
can follow a syllable with coda: {pric,ion,mau}?

> What about doubly-closed syllables, such as {tarksako}?

With these rules, it would be disallowed, along with {fasxolarkto}.
The idea is to remain very close to lujvo phonotactics, although
consonantal syllables remain the big exception, of course.

mu'o mi'e xorxes







On Wednesday 23 February 2005 10:49, Jorge "Llambías" wrote:
> This also causes doubts with words like {trio}, which would
> be a valid fu'ivla if it had two syllables {tri,o}.

The way I handle that is: remove the comma, decapitalize all vowels, insert
commas between non-diphthong vowels, and if it's invalid, so was the
original. So {trio} is invalid and {trae} is valid (same as {tra,e}).
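
A minimal sketch of that normalization step, for illustration only. The function name is invented, the diphthong set is assumed to be the CLL one without iy/uy, and y is not treated as a vowel here. Vowel runs are checked pairwise, so a run like "aia" gets no comma. The validity test itself is left to whatever morphology checker is applied afterwards.

```python
VOWELS = "aeiou"
# Assumption: the CLL diphthongs (falling and rising), leaving out iy/uy.
DIPHTHONGS = {
    "ai", "ei", "oi", "au",
    "ia", "ie", "ii", "io", "iu",
    "ua", "ue", "ui", "uo", "uu",
}


def normalize_commas(word):
    """Drop commas and stress capitals, then put a comma between any
    two adjacent vowels that do not form a diphthong."""
    bare = word.replace(",", "")
    bare = "".join(ch.lower() if ch.lower() in VOWELS else ch for ch in bare)
    pieces = []
    for index, ch in enumerate(bare):
        if (
            index > 0
            and ch in VOWELS
            and bare[index - 1] in VOWELS
            and bare[index - 1:index + 1] not in DIPHTHONGS
        ):
            pieces.append(",")
        pieces.append(ch)
    return "".join(pieces)


print(normalize_commas("tri,o"))
print(normalize_commas("trae"))
```

With these assumptions {tri,o} collapses to the one-syllable {trio}, while {trae} and {tra,e} both come out as {tra,e}.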

phma
--
My monthly periods happen once per year.
-Les Perles de la médecine


posts: 1912


> --- Pierre Abbat wrote:
> > Can a syllable have both a consonantal and a glide onset, such as
> {cionmau}?
>
> According to these rules, no. The glide onset, like the h onset, cannot
> combine with anything else. It is an open question whether a glide onset
> can follow a syllable with coda: {pric,ion,mau}?

I forgot to mention that the workaround in these cases would be
to add an h or another vowel: {ci,'on,mau} or {ci,ion,mau}.

mu'o mi'e xorxes







posts: 1912


> --- Jorge Llambías wrote:
> > --- Pierre Abbat wrote:
> > > Can a syllable have both a consonantal and a glide onset, such as
> > {cionmau}?
> >
> > According to these rules, no. The glide onset, like the h onset, cannot
> > combine with anything else. It is an open question whether a glide onset
> > can follow a syllable with coda: {pric,ion,mau}?
>
> I forgot to mention that the workaround in these cases would be
> to add an h or another vowel: {ci,'on,mau} or {ci,ion,mau}.

Although if glides are accepted as initials without a glottal stop,
that last one is actually {ci ionmau}.

mu'o mi'e xorxes






On Thursday 24 February 2005 13:08, John Cowan wrote:
> Well, so he is, interplanetary convention or not. "Behind the desk
> there is a Homo sapiens, a Rattus norvegicus, and innumerable instances
> of Staphylococcus aureus." What is clear from this sentence is that
> "me la'o ly. Homo sapiens .ly." does *not* mean "is named 'Homo sapiens'".

In Latin, a scientific name is a noun phrase and can be pluralized and
declined: "Ego sum Homo sapiens et habeo tres Rattos norvegicos." In Lojban
it names a species and goes in x2 of {danlu} and similar words, and it is not
declined, since Lojban lacks case endings and grammatical number. {mi me la'o
ly. Homo sapiens .ly} is loose usage, which is tolerable since names can be
polysemous.

phma
--
..i le babzba ba zbasu
lo jbazbabu lo babjba


posts: 149

Jorge Llambías scripsit:

> > Can a syllable have both a consonantal and a glide onset, such as {cionmau}?
>
> According to these rules, no. The glide onset, like the h onset, cannot
> combine with anything else. It is an open question whether a glide onset
> can follow a syllable with coda: {pric,ion,mau}?

I'm okay with the latter.

> > What about doubly-closed syllables, such as {tarksako}?
>
> With these rules, it would be disallowed, along with {fasxolarkto}.
> The idea is to remain very close to lujvo phonotactics, although
> consonantal syllables remain the big exception, of course.

Ah, I hadn't thought about cases like that. But I think they should
be forbidden, for the reason given.

--
John Cowan  cowan@ccil.org  http://www.ccil.org/~cowan
"As we all know, civil libertarians are not the friskiest group around -- comes
from forever being on the qui vive for the sound of jack-booted fascism coming
down the pike." --Molly Ivins


posts: 162

Robin Lee Powell wrote:
>>Finding anything on the tiki when I am responding to email in
>>realtime isn't practical.
>
> It seems to be practical for everyone else. Is there anything I can
> help you with that might make it better for you?

I doubt it. My only wish list has been that when Jorge (or another
shepherd) changes his proposal for the umpteenth time, the message would
include a diff listing in addition to the new page in the message text.
I get "permission denied you cannot view this page listing" when I
click on the diff link (it might be wanting me to login, but since I'm
clicking from email and not planning to post, I don't have any reason to
login).

> If the issue is
> things being very slow, you might want to consider getting a new
> computer. I recall yours as being quite old, and that was two years
> ago. Computers are extremely cheap these days.

So am I (cheap, that is). You'll recall that we have high expenses,
with higher ones coming up as a second kid approaches college.

I think it really is that I am a one-track mind sort of person, and have
never worked well in a windowed environment. When I read email, I read
email, and I generally don't look things up online unless they have a
URL in the email (I do look things up in a red book when discussing
that, and I use old MS-DOS List to call up a gismu or cmavo list once in
a while). When I web-surf looking stuff up, I get too involved and lose
track of what I was doing, and so the email never gets finished or ends
up more longwinded and disjoint than is usual even for me.

And as I've said before, I simply have never worked that well on the
tiki interface, and have never worked on any other online forum at all.
I simply cannot seem to make the adjustment to my style of working.
tiki participation seems to me like the sort of thing that would require
me to spend 2-3 times as much time as I currently spend, in order to be
proficient with the interface and familiar enough with the material
being posted by others that I can interact as an equal. I don't have
that much commitment these days, even when I have the time.

>>In the far future, it is unlikely that anyone will even remember
>>that this discussion took place.
>
> It's possible, however you may want to note the comments at the top
> of pages like
> http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+Quotations
> which are designed to handle exactly this sort of problem.

That's nice, and if I ever learn to use the tiki again, I'll try to
figure out how I'm supposed to get one of these comments in - I thought
that voted-on sections are locked and placed in an archive, and even
when not locked can only be edited by the shepherd or by you. I thought
that as a non-shepherd, I'm pretty much limited to voting and to the
discussions that I never can keep up with. I see nothing in the
procedures that tells me how those comments get added (and I just
rechecked).

And, unfortunately, by enabling me to read and respond by email, I now
almost never actually look at a wiki site directly, and will thus have
to relearn the interface (I don't even remember how to log in - and I
suspect that stuff I've disabled for security reasons like scripts or
activex are preventing me from seeing anything other than a blue box
labeled "login" in the upper right).

I don't learn new programs quickly and easily any more, I forget what I
don't use, and my inlaws have convinced me to be sufficiently
security-paranoid that I have to manually activate functions in order to
do anything but read web pages.

I understand that this is my problem and not yours, but it is a problem,
and I don't think it likely to be easily solved.

>>If you reread the first line above, Jorge said we should ask to
>>reopen la'o, implying that he also thought it was possible to do
>>so. I see no procedure prior to that "far future" event for
>>reopening anything.
>
> Because you haven't read the procedures lately, let me direct you
> to the relevant comment in:
>
> http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Procedure
>
> which is:
>
> Please note that a particular section can be opened more than
> once. In particular, a future checkpoint can re-open a section
> if a problem with the previously approved proposal is
> discovered.

I get "Sorry, "BPFK Procedure" has not been created". But I was smart
enough to look from there and find it should have been "BPFK Procedures".

So having looked and confirmed that the above is all that that page says
on the matter, my stupid question of the moment is "what is the
procedure for deciding that a previously approved proposal has a problem
which warrants reopening the page?"

>>In addition, I'm not sure that there is even a byfy "checkpoint"
>>section on the chapter of the morphology that discusses the
>>progression of Type I-IV borrowings, which is what the issue
>>really is.
>
> BPFK checkpoints are, in general, against sets of cmavo. There are
> special cases for morphology, gismu in general, and miscellaneous
> issues.
>
>>Originally none of the morphology was part of the checkpoint
>>system but was a separate subcommittee under Nora.
>
> Because she hadn't been able to work on that subcommittee for
> something like three years, we moved that work into the checkpoint
> system.
>
> I got little response to my emails to her about this issue, and both
> of you ignored my mail of last week asking for a response to that
> fact.

Please identify the email more specifically; I don't recall seeing any
specifically asking Nora or me anything. Nora normally reads email once
a week, and has been skipping all byfy traffic because you told her that
she doesn't need to read the discussions - only the proposals.

I knew that you had been working on the PEG Morphology algorithm last
fall, but my understanding was that this was just another project and
not part of the byfy - we ignored it just like we've ignored valfendi -
it was not official. My first awareness that the PEG morphology was to
be considered for byfy was your announcement on 2/11 giving us 2 weeks
before a vote. I told Nora, and she printed out the proposal and has
been trying to figure it out ever since. She responded to that message
on 2/14. Looking into my archives, I see that you mentioned it also in
the announcement of 2/8 at the end of something about Magic Words
being partially closed (whatever that means, since I wasn't even trying
to follow what Magic Words meant). My email subject directory has
"bpfk-announce Partial closure of Magic Words; Mor" so I did not even
realize it mentioned morphology. 99% of byfy traffic has been put into
the "I hope some year I have time to read some of this" bin, so I never
even looked at that message until now.

(I actually do this with all tiki traffic, since I haven't got the skill
with filtering to separate tiki stuff from byfy stuff. Most of my email
reading is still tossing out hundreds of spam messages, and then
skimming and binning whatever is left - you should know by now that I do
respond to your emails directly to me, and that if I don't respond, it
is probably because I somehow did not see it.)

>>Then out of the blue
>
> This has certainly not been out of the blue. We've been discussing
> this extensively for months on wikidiscuss.

So what? When did something being discussed on the general wiki become
a byfy proposal? So far as I know, only on 2/11 when you told us it had.

I've now searched all my Lojban related email for mention of Nora's name
or the word "morphology". The prior mention of Nora by you seems to
have been compliments of her Noralujv work on 1/6/05.

> xorxes and Pierre have been working their *asses* off on this.

I appreciate their effort. But since Nora and Pierre were kinda at
loggerheads specifically over the issue of designing an algorithm before
determining the specification for that algorithm (i.e. figuring out what
CLL said and how to resolve inconsistencies), it wasn't of byfy interest
until you made it so. The PEG algorithm is written like EBNF, only
with even more special cases. I've never been able to use the EBNF for
analysis myself, only the YACC grammar, in part because the flexibility
of the notation requires me to manually expand it before I can even see
what it allows. Thus the PEG spec is useless to me. I need the rules
that I as a Lojbanist need to follow when speaking (and for fu'ivla,
when word-making). Cowan I think has explained the difference far
better than I could in a recent post.

> In the past three years, and most especially in the last six months
> or so, work on this committee has proceeded this far without input,

There has been no "committee" that Nora knew of - just her. There was
no interest except from Pierre and he started off talking about doing
something totally orthogonal to what Nora wanted to do, based on
valfendi, so that never went beyond the first message a couple years
ago. Once byfy work started in earnest under your jatnaship, her very
limited Lojban time went towards vainly trying to keep up with some of
the other byfy work, so of course she wasn't going to be doing anything
on some other topic.

> positive or negative, from Nora, at least that I'm aware of. It's
> probably too late to rein in the progress. While more input would
> be appreciated, the hard work done so far should not be negated.

I don't think Nora is seeking that. She's been pretty clear as to what
she is seeking, and I think Cowan and Jorge both understand to some
extent, even if you don't.

>>you gave us 2 weeks notice for a vote
>
> As I've said several times in wikidiscuss, I thought we were done
> because the people who were actually doing the work had mostly
> stopped arguing.

It was never moved from the general wiki discussion into something that
the byfy should take notice of until 2/8 or 2/11.

>>on the PEG morphology algorithm, which raises issues about the
>>morphology as a whole,
>
> The intention is that the PEG algorithm *is* the morphology. I
> thought this was quite clear.

The current baseline description of the phonology and morphology is
Chapters 2 and 3 of CLL. I would expect that a replacement baseline for
what is in CLL would be written in the sort of language that CLL is
written in, not in something EBNFish.

> We need a formalized morphology, in the same way we have a
> formalized grammar.

I'll accept your ruling to that effect, but it was never voted thusly by
byfy. There is a formalized grammar, but there is also 600-odd pages of
CLL explaining that grammar, and the explanation is at least as
important a part of the grammar baseline as the YACC specification.

> Everyone who's studied both agrees that PEG is
> much better than YACC for that purpose.
>
> If you have a problem with PEG, let's trot that out and discuss it.

Since I have no idea what the differences and similarities are, it's a
bit late for that. As I said above, the PEG stuff on the morphology
looks more like EBNF to me than like YACC, and I can't even read it as
it is. I have to translate it manually into a YACC-like format with one
rule per line to even read it.

YACC was used for Loglan grammar specification because YACC
implementations verified that it was LALR-1, and provided an annotated
listing of a proposed grammar for any failures to be LALR-1. That was
the sole reason why it served as a specification tool. Prior to YACC,
JCB defined the language using a manually parsed corpus. There was a
formalization of that corpus, but the parsed corpus was the standard and
not the formalization, until he was convinced that the YACC was both
unambiguous and could produce the prespecified corpus parsing (this took
around 6 years, from 1976 to 1982 - it wasn't trivial).

I've never seen a similar tool to a YACC program for verifying the
unambiguity of an EBNF to some standard (it needn't be LALR-1).
Similarly, there have been non-YACC parsers written for Lojban, the one
I know best being Jeff Prothero's recursive descent parser. From what
little I understand technically about parsers, proving that any given
parser is the equivalent of the YACC grammar and associated parser is
non-trivial. That is why Cowan's official parser with its hand-coding
of the lexer rules, seemed to be broken a lot of the time at that lower
level, and I have the impression that jbofi'e had occasional differences
from Cowan's parser, though I've ignored jbofi'e and can't be sure.

(I have just now stopped writing and spent a half hour looking up PEG
grammar on the web, so I know that it is claimed to be unambiguous
because of prioritization rules; I don't know how this is proven, but
I'll take their word for it. That means that what remains is to prove
that whatever the PEG grammar produces happens to be the breakdown that
a human being thinks it should be.)
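
The prioritization in question is PEG's ordered choice: alternatives are tried left to right and the first one that succeeds is kept, so a given rule can yield at most one parse of a given input. A toy sketch of that behaviour follows; the helper names are invented for the example and it is not the morphology grammar.

```python
def literal(text):
    """Parser matching a fixed string; returns the new position or None."""
    def parse(source, pos):
        return pos + len(text) if source.startswith(text, pos) else None
    return parse


def choice(*alternatives):
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def parse(source, pos):
        for alternative in alternatives:
            end = alternative(source, pos)
            if end is not None:
                return end
        return None
    return parse


def sequence(*parts):
    """Match the parts one after another; fail if any part fails."""
    def parse(source, pos):
        for part in parts:
            pos = part(source, pos)
            if pos is None:
                return None
        return pos
    return parse


short_first = choice(literal("a"), literal("ab"))
long_first = choice(literal("ab"), literal("a"))

print(short_first("ab", 0))                       # takes "a" and stops there
print(long_first("ab", 0))                        # takes "ab"
print(sequence(short_first, literal("c"))("abc", 0))  # no retry with "ab"
```

The last line fails even though "ab" followed by "c" would match, because the choice has already committed to "a". So the result is always unique, but the order of alternatives decides which breakdown that is, and whether it is the one a human expects still has to be checked separately.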

>>There is no checkpoint on the morphology rules themselves, only on
>>the PEG algorithm (or am I mistaken).
>
> The whole point here is that the PEG algorithm *IS* the morphology
> rules, in exactly the same way that the YACC algorithm *is* the
> grammar rules.

I think I've addressed this above. We cannot know if they are the rules
(other than by fiat) without what JCB called 'verifying that the
"machine grammar" matches the "human grammar"'. The human morpher has
to be the standard UNTIL we approve a machine morpher that fits it.
Thereafter, we can use the machine morpher as the standard with some trust.

>>The morphology algorithm must follow from the morphology rules,
>>and it seems like the reverse is taking place, such that the rules
>>end up being whatever the PEG algorithm determines them to be,
>>which is ass-backwards.
>
> You yourself have said, many many times, that the YACC is the most
> authoritative source of the Lojban grammar. You never say that the
> YACC must subordinate to "the grammar rules", whatever those might
> be. How is this different?

The YACC rules were not invented out of whole cloth, but had to be made
to fit a language that already existed. Once they were made to fit, we
could baseline them and then expect to make later text descriptions
match the machine description. The morphology was developed in the
reverse order, and the text descriptions were developed and made into
the baseline. Formalizing those descriptions would be a good thing, I
agree. But in fact because of sloppiness in how we wrote up stuff on
fu'ivla and to a lesser extent on names and experimental cmavo, what is
emerging in discussion is that what Jorge and Pierre did was NOT
formalizing the existing CLL text, but creating a formalization which
was based on the CLL, but which incorporates an unknown number of
unspecified decisions resolving issues in that CLL text, and in some
cases making changes to what is stated in the CLL text.

Now you will recall that I have all along wanted to vote on each change
as a *change* to a known status quo (with my well-known prejudice
against any change not adequately justified as needing a change), and
not be voting on a package deal which I am assured by the shepherd
describes the status quo, but in fact has in every case incorporated
some number of unspecified changes which can only be determined by
careful analysis of the sort that neither Nora nor I do well under time
pressure. We thus spend all our time with each proposal trying to
figure out what undiscussed changes have been slipped in that are
reasons to vote "no" unless they are backed out because you require that
sort of specificity before we can vote "no".

I've accepted that you will continue this way. I have no desire to
replace you, since something is getting done at least. But I'm not in
the least happy with the lack of conservatism in the byfy decisions thus
far, so I will continue to protest. You are of course free to continue
to overrule my protests. But if my protests get you so steamed up that
you cannot effectively function, then I shall have to cease protesting,
but that also means ceasing any efforts to EVER participate in the byfy
work, because I've come to see that my only useful role right now is as
a dragging anchor fighting the strong tide of change that happens with
Jorge leading the technical work. I wish I saw another viable choice.

(I've toyed with the idea of going off and writing a few thousand words
on Lojban as I speak it as a possible counterweight to xorxes zei bangu
- probably my long procrastinated Arabian Nights translation - but that
would require me to entirely tune out byfy, and I'm sure that everything
I produce will end up garbage once byfy finishes changing the language
beyond my recognition, which is my fear as to what will happen if I do
anything other than what I am doing).

Oh well, sorry for being excessively wordy again. But I am what I am,
just as you are what you are. Hopefully we'll continue to survive each
other %)


And if this has gotten you too steamed up to reply, I suggest getting
Cowan's input since he gets you less steamed and he seems to understand
Lojbabese %)


And for all my protests, you still have my vote of confidence.

lojbab



On Friday 25 February 2005 21:53, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > Can a syllable have both a consonantal and a glide onset, such as
> > {cionmau}?
>
> According to these rules, no. The glide onset, like the h onset, cannot
> combine with anything else. It is an open question whether a glide onset
> can follow a syllable with coda: {pric,ion,mau}?
>
> > What about doubly-closed syllables, such as {tarksako}?
>
> With these rules, it would be disallowed, along with {fasxolarkto}.
> The idea is to remain very close to lujvo phonotactics, although
> consonantal syllables remain the big exception, of course.

I'd much rather have Jorge's rules than John's. I think fu'ivla and cmene
*should* have more permissive phonotactics, since they come from a wide
variety of languages, and have trouble imagining anyone who can pronounce
lujvo without difficulty but has trouble with {malminiata}, {mianma}, or
{mandioka} (all of which - though I can't attest to the Burmese - are taken
directly from natlangs).

phma
--
S Fa1>+/- !TM Ng- M K H T-- t? AT++ SY Te- SC- FO- D P !Tz E++ L
Am I Ha- hc-- FH+++ IP?


posts: 1912


> (I've toyed with the idea of going off and writing a few thousand words
> on Lojban as I speak it as a possible counterweight to xorxes zei bangu
> - probably my long procrastinated Arabian Nights translation

Yes, yes, yes, please do! If not a few thousand words at least some
paragraphs. Or else (or in addition), read (some part of) my translation
of Alice, or (some part of) Robin's "la nicte cadzu", and say how in
your opinion it differs from true Lojban. You might be surprised how
conservative our usage actually is.

mu'o mi'e xorxes






posts: 149

Pierre Abbat scripsit:

> I'd much rather have Jorge's rules than John's.

I think Jorge is adopting John's rules, but I know what you mean.

> I think fu'ivla and cmene
> *should* have more permissive phonotactics, since they come from a wide
> variety of languages,

So they do.

> and have trouble imagining anyone who can pronounce
> lujvo without difficulty but has trouble with {malminiata}, {mianma}, or
> {mandioka}

The question isn't whether you or I could pronounce them. The question
is whether they meet the design standards for the language, especially
unambiguity. Given that {mandiioka} is valid, I don't think {mandioka}
should be.

> (all of which - though I can't attest to the Burmese - are taken
> directly from natlangs).

So is "vprtskvni" (no syllabic consonants), but I don't want it in Lojban.
(I'm pretty sure that "Myanmar" is a mere transliteration, and the
pronunciation remains "b@ma" (of which "Burma" is a British-English-style
spelling), but not 100% sure.)

--
Si hoc legere scis, nimium eruditionis habes.


posts: 1912


> I'd much rather have Jorge's rules than John's. I think fu'ivla and cmene
> *should* have more permissive phonotactics, since they come from a wide
> variety of languages, and have trouble imagining anyone who can pronounce
> lujvo without difficulty but has trouble with {malminiata}, {mianma}, or
> {mandioka} (all of which - though I can't attest to the Burmese - are taken
> directly from natlangs).

I think ease of pronunciation is a red herring. The rules are not meant
to separate what is easy to pronounce from what is difficult, but what
fits within the Lojban phonotactic system and what doesn't. I know that
many of the existing restrictions were justified by a supposed ease of
pronunciation, but in reality, they only reflect ease of pronunciation
for a few people, mostly English speakers though with concessions to
some other languages too.

Fu'ivla and cmevla should not, in my opinion, be required to be
an identical calque of the source word. The borrowed words should be
adapted to Lojban's phonotactics, and not the other way around. There
is no doubt that there are millions of words in other languages that
contain consonant+glide+vowel. Spanish is full of them. That in itself
is not a reason to allow them in Lojban.

I am not particularly in favour of imposing that restriction. It can
be justified by pointing out that it does not appear in lujvo, but
there are other things that don't appear in lujvo and that we allow (such
as consonantal syllables, which are a much bigger deviation), so I
would not consider this particular extension unreasonable.

It would be nice to hear exactly what the argument against
consonant+glide+vowel syllables is. I've heard for example that
some people have difficulty in pronouncing {cia} or {jia}
distinctly enough from {ca} or {ja}.

If there's a vote, I would vote in favour of allowing C+I+V,
but it's not something that I would miss very much if it is not
allowed. So John and Nora (and probably Lojbab) are clearly against
allowing them, Pierre and I would be in favour of allowing them
(and usage probably too, for example someone has just used the
name {la nikaraguas} in the Spanish lojban list, though strangely
enough spelt {la nikarag,uas} so actually that one is approved
by John too, so that's not an example either way). For now
I'm implementing the restriction. Does anyone else have an opinion?

mu'o mi'e xorxes






posts: 1912


> Given that {mandiioka} is valid, I don't think {mandioka}
> should be.

I don't think {man,dio,ka} would be much confused with {man,di,io,ka}
since they have a different syllable count. A better contrasting
pair would be {man,di,io,ka} and {man,dii,io,ka}.

mu'o mi'e xorxes







posts: 149

Jorge Llambías scripsit:

> It would be nice to hear exactly what the argument against
> consonant+glide+vowel syllables is. I've heard for example that
> some people have difficulty in pronouncing {cia} or {jia}
> distinctly enough from {ca} or {ja}.

Historically, (t)c and (d)j in many languages derive from ti and si,
or else from ki and gi. Preventing this kind of drift is important in
Lojban, as we don't want words that are too close together. Similarly,
labials with and without u-glide are confusables in some languages,
notably Chinese, where written "bo po mo fo" are actually pronounced
"buo puo muo fuo" (Chinese lacks "v") and do not contrast with versions
without the glide.

> If there's a vote, I would vote in favour of allowing C+I+V,
> but it's not something that I would miss very much if it is not
> allowed. So John and Nora (and probably Lojbab) are clearly against
> allowing them, Pierre and I would be in favour of allowing them
> (and usage probably too, for example someone has just used the
> name {la nikaraguas} in the Spanish lojban list, though strangely
> enough spelt {la nikarag,uas} so actually that one is approved
> by John too, so that's not an example either way).

I object only to CIV *syllables*, so if the C is the coda of a
previous syllable, there is no problem. In practice this blocks
only initial CIV and XCIV, where X is a non-syllabic consonant.
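
The distinction between a CIV *syllable* and a C that is merely the coda of the previous syllable can be sketched as a small check. This is only an illustration under stated assumptions: syllable breaks are written explicitly with commas, as in the examples in this thread, and the function name and pattern are invented here rather than taken from the PEG grammar.

```python
import re

# Assumption: syllable breaks are given explicitly with commas.
# A "CIV syllable" is taken to be one or more consonants, then a glide (i or u), then a vowel.
CIV_ONSET = re.compile(r"^[bcdfgjklmnprstvxz]+[iu][aeiou]")

def civ_syllables(word):
    """Return the syllables of a comma-syllabified word that start with consonant + glide + vowel."""
    return [syl for syl in word.split(",") if CIV_ONSET.match(syl)]

for w in ["man,dio,ka", "man,di,io,ka", "ni,ka,ra,guas",
          "ni,ka,rag,uas", "nit,cion", "nit,ci,on"]:
    print(w, civ_syllables(w))
```

Under these assumptions {man,dio,ka}, {ni,ka,ra,guas} and {nit,cion} each contain a flagged syllable, while {man,di,io,ka}, {ni,ka,rag,uas} and {nit,ci,on} do not.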

--
A poetical purist named Cowan that's me: cowan@ccil.org
Once put the rest of us dowan. on xml-dev
"Your verse would be sweeter http://www.ccil.org/~cowan
If it only had metre http://www.reutershealth.com
And rhymes that didn't force me to frowan." overpacked line! --Michael Kay


posts: 14214

On Sat, Feb 26, 2005 at 04:28:12PM -0800, Jorge Llambías wrote:
>
> --- Bob LeChevalier wrote:
> > (I've toyed with the idea of going off and writing a few
> > thousand words on Lojban as I speak it as a possible
> > counterweight to xorxes zei bangu - probably my long
> > procrastinated Arabian Nights translation

I would like that very much.

> Yes, yes, yes, please do! If not a few thousand words at least
> some paragraphs. Or else (or in addition), read (some part of) my
> translation of Alice, or (some part of) Robin's "la nicte cadzu",
> and say how in your opinion it differs from true Lojban. You might
> be surprised how conservative our usage actually is.

I expect so, yes.

-Robin


posts: 1912


> Historically, (t)c and (d)j in many languages derive from ti and si,
> or else from ki and gi. Preventing this kind of drift is important in
> Lojban, as we don't want words that are too close together.

But drifts like that occur with all sounds. Not even the cardinal vowels
are safe, as English shows. The question is, are siV/tiV/diV especially
prone to such drifts, or is it just that it is a very noticeable drift
because it has happened in English and other familiar languages
relatively recently? And if those few are very prone to change, is that
reason enough to do away with all CIV's? (Maybe yes, just for
the sake of simplicity.)

> Similarly,
> labials with and without u-glide are confusables in some languages,
> notably Chinese, where written "bo po mo fo" are actually pronounced
> "buo puo muo fuo" (Chinese lacks "v") and do not contrast with versions
> without the glide.

OK. ("uo" is very rare in Spanish, the only word I can think of
that has it is "quorum", which even has an irregular spelling.)

> I object only to CIV *syllables*, so if the C is the coda of a
> previous syllable, there is no problem. In practice this blocks
> only initial CIV and XCIV, where X is a non-syllabic consonant.

You mean XCIV is blocked when C is a non-syllabic consonant, right?
At the moment consonantal syllables can't take codas.

If {lecio} is phonotactically acceptable, and it doesn't break
into cmavo or lujvo or anything else, does that mean it's a
valid fu'ivla, where -ci- is an impermissible initial cluster?

mu'o mi'e xorxes






On Fri, 25 Feb 2005 15:37:30 -0800, Robin Lee Powell
<rlpowell@digitalkingdom.org> wrote:
> This actually goes *in* *general*: I tend to be upset if people are
> going to wade in to something we've been working on for months and
> start screaming bloody murder; i'm more inclined to listen if they
> have done something for the project lately, and/or have concrete
> solutions to concrete problems.

.u'u ru'e .o'a nai ru'e .u'o cu'i .ii cu'i .io .i'o sai

mu'o mi'e .filip.
--

Philip Newton <philip.newton@gmail.com>



On Saturday 26 February 2005 23:27, John Cowan wrote:
> I object only to CIV *syllables*, so if the C is the coda of a
> previous syllable, there is no problem. In practice this blocks
> only initial CIV and XCIV, where X is a non-syllabic consonant.

A word with CIV or even XCIV syllables is too long for me to say without
pausing! ;)

phma
--
.i le babzba ba zbasu
lo jbazbabu lo babjba


On Saturday 26 February 2005 19:58, Jorge "Llambías" wrote:
> It would be nice to hear exactly what the argument against
> consonant+glide+vowel syllables is. I've heard for example that
> some people have difficulty in pronouncing {cia} or {jia}
> distinctly enough from {ca} or {ja}.

Anyone who wishes to avoid getting into deeper trouble with pandas can say
{ci,on,mau}.

phma
--
le xruki le ginxre xrixruba xu xrula cu xrani?


On Saturday 26 February 2005 19:35, John Cowan wrote:
> So is "vprtskvni" (no syllabic consonants), but I don't want it in Lojban.

That contains "vp" and "kv", and none of us is arguing that those should be
allowed.

phma
--
le xruki le ginxre xrixruba xu xrula cu xrani?


posts: 162

Jorge Llambías wrote:
> --- Pierre Abbat wrote:
>>I'd much rather have Jorge's rules than John's. I think fu'ivla and cmene
>>*should* have more permissive phonotactics, since they come from a wide
>>variety of languages, and have trouble imagining anyone who can pronounce
>>lujvo without difficulty but has trouble with {malminiata}, {mianma}, or
>>{mandioka} (all of which - though I can't attest to the Burmese - are taken
>>directly from natlangs).
>
> I think ease of pronunciation is a red herring. The rules are not meant
> to separate what is easy to pronounce from what is difficult, but what
> fits within the Lojban phonotactic system and what doesn't.

The phonotactic system was however designed around a standard of ease of
pronunciation (and listening comprehension).

>I know that
> many of the existing restrictions were justified by a supposed ease of
> pronunciation, but in reality, they only reflect ease of pronunciation
> for a few people, mostly English speakers though with concessions to
> some other languages too.

The original 4 of us included two people skilled in multiple languages,
and their sense of what was "difficult" was considered strongly.

> It would be nice to hear exactly what the argument against
> consonant+glide+vowel syllables is. I've heard for example that
> some people have difficulty in pronouncing {cia} or {jia}
> distinctly enough from {ca} or {ja}.

Not just difficulty of pronunciation, but history of linguistic drift.
Your own name is an excellent example. I believe that the ll in
Castillian is a palatalized l, which in English is "li". But I recall
that Argentinian Spanish does something different with ll, and some
other Spanish dialects do other things, all because of linguistic drift
since colonization of the Americas.

> If there's a vote, I would vote in favour of allowing C+I+V,
> but it's not something that I would miss very much if it is not
> allowed.

The problem is that if we allow it, it has to be allowed for all values
of C or the rules get complicated.

Russian, which has palatalized consonants that are often transliterated
in English with an "i" after the consonant, has some consonants that are
not palatalized. Spanish has only n and l palatalized, though perhaps
there are other consonants that are followed by diphthong "iV". We
wanted things simple enough so that there would be no such exceptions.

For Lojban:
Instead of using ia, use ii for stronger problems:
cii sii ci
jii zii ji
tii tci
are possible confusions
but there is also the plain difficulty of even pronouncing
lii and rii as single syllables

Similarly with u glide, make it a ui or uu diphthong and the
pronunciation problems stand out more.
buu I can say but if I say it a lot and fast it ends up as a long bu
(English boo). Similarly with vuu and fuu and muu and nuu and puu.
I can with practice force out lui as one syllable, but without care, it
always comes out as two syllables. luu and rui and ruu I simply cannot
say as a single syllable.

>So John and Nora (and probably Lojbab) are clearly against
> allowing them, Pierre and I would be in favour of allowing them
> (and usage probably too, for example someone has just used the
> name {la nikaraguas} in the Spanish lojban list, though strangely
> enough spelt {la nikarag,uas} so actually that one is approved
> by John too, so that's not an example either way).

I can pronounce nikaraguas with either syllabification, but I cannot
pronounce nikagaruas without separating the r and u into separate
syllables. The Lojban rules however would also have to allow
ruujiigiilias as well, and that's a little harder to say clearly.

lojbab




posts: 14214

On Sun, Feb 27, 2005 at 07:35:48PM +0100, Philip Newton wrote:
> On Fri, 25 Feb 2005 15:37:30 -0800, Robin Lee Powell
> <rlpowell@digitalkingdom.org> wrote:
> > This actually goes *in* *general*: I tend to be upset if people
> > are going to wade in to something we've been working on for
> > months and start screaming bloody murder; i'm more inclined to
> > listen if they have done something for the project lately,
> > and/or have concrete solutions to concrete problems.
>
> .u'u ru'e .o'a nai ru'e .u'o cu'i .ii cu'i .io .i'o sai

ki'e filip .i mi xenru lo nu mi pu kusru cusku do to'e mu'i lo nu do
na cmima

(Some of us have postulated "to'e <causal bai>" for "despite")

-Robin


posts: 2388



> Jorge Llambías wrote:
> > --- Pierre Abbat wrote:
> >>I'd much rather have Jorge's rules than
> John's. I think fu'ivla and cmene
> >>*should* have more permissive phonotactics,
> since they come from a wide
> >>variety of languages, and have trouble
> imagining anyone who can pronounce
> >>lujvo without difficulty but has trouble with
> {malminiata}, {mianma}, or
> >>{mandioka} (all of which - though I can't
> attest to the Burmese - are taken
> >>directly from natlangs).
> >
> > I think ease of pronunciation is a red
> herring. The rules are not meant
> > to separate what is easy to pronounce from
> what is difficult, but what
> > fits within the Lojban phonotactic system and
> what doesn't.
>
> The phonotactic system was however designed
> around a standard of ease of
> pronunciation (and listening comprehension).

I think the parenthesized bit needs to come out
and go to the front. Ease of pronunciation is
subjective and variable; what can be heard
clearly in a noisy environment is at least
considerably less so. (I think there are in fact a
mess of studies somewhere -- Ladefoged and
Fromkin, maybe -- about what combos can and
cannot be distinguished in various bad
environments.) Lojban has low redundancy for a
language and so anything that increases distance
between possible competing items or that gives
extra clues to items is greatly to be desired.
Part of the reason for the rigorous limits on
native predicates is to give a running clue to
what is happening, so deviating too far from that
norm deprives us of those clues and should be
avoided.

> >I know that
> > many of the existing restrictions were
> justified by a supposed ease of
> > pronunciation, but in reality, they only
> reflect ease of pronunciation
> > for a few people, mostly English speakers
> though with concessions to
> > some other languages too.
>
> The original 4 of us included two people
> skilled in multiple languages,
> and their sense of what was "difficult" was
> considered strongly.

And yet you came up with lists that others with
similar language experience find odd: "easy"
patterns are rejected, "hard" ones are featured.
Something a little more reliable should serve
as a base (as much as possible this late in the
game). I don't know of any objective standards
for ease or difficulty of pronunciation (though
some could probably be inferred from the muscle
activity involved in each case and, more
remotely, from the commonness of the sound), which
is why I suggest that you take your stand on
intelligibility under adverse conditions.

> > It would be nice to hear exactly what the
> argument against
> > consonant+glide+vowel syllables is. I've
> heard for example that
> > some people have difficulty in pronouncing
> {cia} or {jia}
> > distinctly enough from {ca} or {ja}.
>
> Not just difficulty of pronunciation, but
> history of linguistic drift.
> Your own name is an excellent example. I
> believe that the ll in
> Castillian is a palatalized l, which in English
> is "li". But I recall
> that Argentinian Spanish does something
> different with ll, and some
> other Spanish dialects do other things, all
> because of linguistic drift
> since colonization of the Americas.

Linguistic drift is pretty uninteresting; the
move from long l to y in Spanish took from before
Spanish (i.e., in Western Latin) to the 18th
century. Not much of a problem for Lojban yet
and one that can be resisted by various means.
More important is the way that pronunciation gets
slopped today under conditions of carelessness
(speed, indifference, ...) and there you want at
least that different forms slop in different ways
rather than slopping to some common place. We
have no data on sloppy Lojban (or damned little)
so we don't know directly what to guard against,
but we can make some educated guesses (and did, I
suppose) on the basis of the speakers' native
languages (English initially). Indeed, the
restrictions that have most attracted attention
seem to be justified mainly by the likelihood
that they will fall together with other forms --
as witness their behavior in English casual
speech.

> > If there's a vote, I would vote in favour of
> allowing C+I+V,
> > but it's not something that I would miss very
> much if it is not
> > allowed.
>
> The problem is that if we allow it, it has to
> be allowed for all values
> of C or the rules get complicated.
>
> Russian, which has palatalized consonants that
> are often transliterated
> in English with an "i" after the consonant, has
> some consonants that are
> not palatalized. Spanish has only n and l
> palatalized, though perhaps
> there are other consonants that are followed by
> diphthong "iV". We
> wanted things simple enough so that there would
> be no such exceptions.
>
> For Lojban:
> Instead of using ia, use ii for stronger
> problems:
> cii sii ci
> jii zii ji
> tii tci
> are possible confusions
> but there is also the plain difficulty of even
> pronouncing
> lii and rii as single syllables

What seems to happen with the last pair is that,
at worst, when these get pronounced at all,
people who are not familiar with them
syllabify the r or l, which ought not to screw up
any rules at all.

> Similarly with u glide, make it a ui or uu
> diphthong and the
> pronunciation problems stand out more.
> buu I can say but if I say it a lot and fast it
> ends up as a long bu
> (English boo). Similarly with vuu and fuu and
> muu and nuu and puu.
> I can with practice force out lui as one
> syllable, but without care, it
> always comes out as two syllables. luu and rui
> and ruu I simply cannot
> say as a single syllable.

This is starting to sound like rules against
mainly double vowels, ii and uu, and even those
are not necessarily based on possible confusions
(though I think the case can be made) but on
troubles wrapping one's mouth around them (I
assume that no one has trouble with Buena Vista
or bwana). And there is always
resyllabification, which is supposed not to
change things. Or is /ei/ different from /e,i/?
(The stuff about commas is not too clear in
places.)

> >So John and Nora (and probably Lojbab) are
> clearly against
> > allowing them, Pierre and I would be in
> favour of allowing them
> > (and usage probably too, for example someone
> has just used the
> > name {la nikaraguas} in the Spanish lojban
> list, though strangely
> > enough spelt {la nikarag,uas} so actually
> that one is approved
> > by John too, so that's not an example either
> way).
>
> I can pronounce nikaraguas with either
> syllabification, but I cannot
> pronounce nikagaruas without separating the r
> and u into separate
> syllables. The Lojban rules however would also
> have to allow
> ruujiigiilias as well, and that's a little
> harder to say clearly.
>
> lojbab
>
>
>
>
>



posts: 1912


> Jorge Llambías wrote:
> > It would be nice to hear exactly what the argument against
> > consonant+glide+vowel syllables is. I've heard for example that
> > some people have difficulty in pronouncing {cia} or {jia}
> > distinctly enough from {ca} or {ja}.
>
> Not just difficulty of pronunciation, but history of linguistic drift.
> Your own name is an excellent example. I believe that the ll in
> Castillian is a palatalized l, which in English is "li". But I recall
> that Argentinian Spanish does something different with ll, and some
> other Spanish dialects do other things, all because of linguistic drift
> since colonization of the Americas.

I'm not sure what your point is here. In Spanish palatalized l,
in those dialects that have it, contrasts with l plus diphthong.
For example "hallar" and "aliar" are different by exactly just
that feature. So this would actually be an argument in favour
of allowing CIV.

> > If there's a vote, I would vote in favour of allowing C+I+V,
> > but it's not something that I would miss very much if it is not
> > allowed.
>
> The problem is that if we allow it, it has to be allowed for all values
> of C or the rules get complicated.

The rules are already extremely complicated. Any way we do this the
rules will be complicated. I wouldn't exclude any consonants though.
What I would perhaps exclude is Cii and Cuu.

> Russian, which has palatalized consonants that are often transliterated
> in English with an "i" after the consonant, has some consonants that are
> not palatalized. Spanish has only n and l palatalized, though perhaps
> there are other consonants that are followed by diphthong "iV".

Non-palatalized n and l too can be followed by iV in Spanish:
"aliado", "nieto".

> For Lojban:
> Instead of using ia, use ii for stronger problems:
> cii sii ci
> jii zii ji
> tii tci
> are possible confusions
> but there is also the plain difficulty of even pronouncing
> lii and rii as single syllables

Yes, I would favour excluding Cii and Cuu.

If we forbid all CiV, what happens with rather entrenched names
like {nitcion}?

mu'o mi'e xorxes






Jorge Llambías scripsit:

> > Russian, which has palatalized consonants that are often transliterated
> > in English with an "i" after the consonant, has some consonants that are
> > not palatalized. Spanish has only n and l palatalized, though perhaps
> > there are other consonants that are followed by diphthong "iV".
>
> Non-palatalized n and l too can be followed by iV in Spanish:
> "aliado", "nieto".

In detail, Russian basically has a four-way contrast between /pa/, /p;a/,
/p;ja/, and /pja/ (where ; is a palatalization diacritic). The fourth
form is unstable and often pronounced like the third form. For a few
consonants, however, only palatalized or only non-palatalized forms exist.

--
John Cowan <jcowan@reutershealth.com> http://www.reutershealth.com
I amar prestar aen, han mathon ne nen, http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith. --Galadriel, LOTR:FOTR


posts: 162

Jorge Llambías wrote:
> If we forbid all CiV, what happens with rather entrenched names
> like {nitcion}?

I pronounce that name with three syllables "nitci,on" and always have.

lojbab




posts: 14214

On Mon, Feb 28, 2005 at 03:22:25PM -0500, Bob LeChevalier wrote:
> Jorge Llambías wrote:
> >If we forbid all CiV, what happens with rather entrenched names
> >like {nitcion}?
>
> I pronounce that name with three syllables "nitci,on" and always
> have.

nit,cion here.

-Robin


posts: 1912



> Jorge Llambías wrote:
> > If we forbid all CiV, what happens with rather entrenched names
> > like {nitcion}?
>
> I pronounce that name with three syllables "nitci,on" and always have.

So you would allow it as three syllables? (Nick pronounces it
with two syllables, BTW.)

In the case of names, number of syllables is irrelevant,
but in the case of fu'ivla it is relevant. If {CiV} is
allowed and it counts as two syllables, then {tci,o} is a
valid fu'ivla. If it is allowed, but counts as one syllable, it
is not a valid fu'ivla.

I think there are far too many names in use with CiV to
disallow it. I don't see any problem in allowing the
pronunciation as two syllables, as long as the "glide syllable"
does not count, just like y-syllables, consonantal syllables
and buffer-vowel syllables don't count, for penultimate
stress purposes. Would that be an acceptable compromise?
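
The proposed compromise can be sketched as a syllable count in which some syllables are skipped for penultimate stress. Everything below is an assumption made for illustration: syllables are comma-separated, a "glide syllable" is taken to be consonant(s) plus i or u immediately before a syllable that starts with a plain vowel, and buffer-vowel syllables are not modelled at all. The function names are hypothetical.

```python
import re

VOWELS = "aeiou"

def counts_for_stress(syllables, k):
    """Decide whether syllable k counts toward penultimate stress (illustrative rule only)."""
    syl = syllables[k]
    if "y" in syl:                                 # y-syllables do not count
        return False
    if not any(ch in VOWELS for ch in syl):        # consonantal syllables do not count
        return False
    nxt = syllables[k + 1] if k + 1 < len(syllables) else ""
    is_glide_syllable = (
        re.fullmatch(r"[bcdfgjklmnprstvxz]+[iu]", syl) is not None
        and re.match(r"[aeiou]", nxt) is not None
        and re.match(r"[iu][aeiou]", nxt) is None  # next syllable already has its own glide onset
    )
    return not is_glide_syllable

def counting_syllables(word):
    """Return the syllables of a comma-syllabified word that count for stress."""
    syllables = word.split(",")
    return [s for k, s in enumerate(syllables) if counts_for_stress(syllables, k)]

def stressed_syllable(word):
    """Return the penultimate counting syllable, or None if fewer than two count."""
    counted = counting_syllables(word)
    return counted[-2] if len(counted) >= 2 else None

print(counting_syllables("nit,ci,on"), stressed_syllable("nit,ci,on"))
print(counting_syllables("tci,o"), stressed_syllable("tci,o"))
print(counting_syllables("man,di,io,ka"), stressed_syllable("man,di,io,ka"))
```

With this rule {nit,ci,on} still counts as two syllables for stress, and {tci,o} has only one counting syllable, so it would not qualify as a fu'ivla even when pronounced as two.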

mu'o mi'e xorxes






posts: 2388



> Jorge Llambías wrote:
> > If we forbid all CiV, what happens with
> rather entrenched names
> > like {nitcion}?
>
> I pronounce that name with three syllables
> "nitci,on" and always have.
>

I am really rusty on all this, but this remark
seems to imply that it is possible in some cases
to have two successive syllables with nothing
between their vowels. Is this merely a variant
(not quite official but a concession to human
frailty) for some diphthongs in names or is it 1)
a possibility (indeed the only possibility) for
nondiphthongs lie /ea/ or/ao/. Is it only in
names or might fuhivla develop such have such
forms? CLL seems to say that /'/ has always to
go between two vowels in separate syllables but
then turns around and gives at least one example
of not: /meiin/ which then gets two different
possible (phonological) interpretations /mei,in/
(default) and /me,iin/ which requires a comma.
Is a puzzlement which, I assume, has been
resolved somewhere but I can't find it -- except
to see several cases in discussions which seem to
focus on defaults but not on the propriety of the
move to begin with. Help!

posts: 162

Jorge Llambías wrote:
> --- Bob LeChevalier wrote:
>
>
>>Jorge Llambías wrote:
>>
>>>If we forbid all CiV, what happens with rather entrenched names
>>>like {nitcion}?
>>
>>I pronounce that name with three syllables "nitci,on" and always have.
>
> So you would allow it as three syllables? (Nick pronounces it
> with two syllables, BTW.)

I don't see a problem with 3 syllable versions, either in pronunciation
or resolvability.

> In the case of names, number of syllables is irrelevant,
> but in the case of fu'ivla it is relevant. If {CiV} is
> allowed and it counts as two syllables, then {tci,o} is a
> valid fu'ivla. If it is allowed, but counts as one syllable, it
> is not a valid fu'ivla.
>
> I think there are far too many names in use with CiV to
> disallow it. I don't see any problem in allowing the
> pronunciation as two syllables, as long as the "glide syllable"
> does not count, just like y-syllables, consonantal syllables
> and buffer-vowel syllables don't count, for penultimate
> stress purposes. Would that be an acceptable compromise?

I certainly don't mind looser rules for names, since their resolvability
depends mostly on the final consonant, and is not much affected by
sloppy pronunciation.

I am far more nervous about fu'ivla, and especially Type IVs, because
we've KNOWN that the "slinku'i" test alone makes the things hard to make
properly. If fu'ivla were sure to be rare in usage, and Type IVs
especially were made so seldom that some academy-like function of a byfy
could review and approve them individually (which is what I probably
had in mind at one point), I might fret a bit less. But Pierre's
activities have convinced me that some will make unnecessary Type IVs,
and that my efforts to keep them the undesirable stepchildren of the
language may come to naught, and they need more restrictions than names
in any event, so overall whenever there is doubt, I prefer them as
restricted as possible.

lojbab


posts: 1912


> --- Bob LeChevalier wrote:
> > I pronounce that name with three syllables
> > "nitci,on" and always have.
>
> I am really rusty on all this, but this remark
> seems to imply that it is possible in some cases
> to have two successive syllables with nothing
> between their vowels. Is this merely a variant
> (not quite official but a concession to human
> frailty) for some diphthongs in names or is it 1)
> a possibility (indeed the only possibility) for
> nondiphthongs like /ea/ or /ao/.

There isn't a clear official rule yet. If they are
allowed then they would be pronounced as two syllables,
yes.

> Is it only in
> names or might fuhivla develop such
> forms?

I don't see much reason to have different phonotactics for
names, they are part of the same language after all. In any
case, in CLL there are fu'ivla such as {kulnrkore,a} too.

> CLL seems to say that /'/ has always to
> go between two vowels in separate syllables

In some parts of CLL (chapter 3 mainly), "all words" refers
to cmavo, gismu and lujvo only.

> but
> then turns around and gives at least one example
> of not: /meiin/ which then gets two different
> possible (phonological) interpretations /mei,in/
> (default) and /me,iin/ which requires a comma.
> Is a puzzlement which, I assume, has been
> resolved somewhere but I can't find it -- except
> to see several cases in discussions which seem to
> focus on defaults but not on the propriety of the
> move to begin with. Help!

Not resolved yet.

mu'o mi'e xorxes








posts: 1912


Let's consider coda-less syllables first, i.e. syllables that
don't end in a consonant. For each consonant, we have 10
such syllables:

ba, be, bi, bo, bu, bai, bau, bei, boi, by
ca, ce, ci, co, cu, cai, cau, cei, coi, cy
da, de, di, do, du, dai, dau, dei, doi, dy
etc.

In addition to those 170 syllables, there are also 20
syllables with special consonants, {.} and {'}:

.a, .e, .i, .o, .u, .ai, .au, .ei, .oi, .y
'a, 'e, 'i, 'o, 'u, 'ai, 'au, 'ei, 'oi, 'y

Those are special because {.} can only appear at the beginning
of a word, while {'} can never appear at the beginning of a word.

Two other special cases are the semi-consonants, or glides,
{i} and {u}:

ia, ie, ii, io, iu, iai, iau, iei, ioi, iy
ua, ue, ui, uo, uu, uai, uau, uei, uoi, uy

We don't know yet the exact rules for these. They can certainly
appear at the beginning of a word (e.g. {ui}), and probably
also in other places: {tropa,io,lo}, {smacrkoba,iu}. I don't
think it is necessary to require a pause in front of them
when they appear at the beginning of a word, they can be like
any other consonant.

For each initial permissible cluster we also have 10 open
syllables:

bla, ble, bli, blo, blu, blai, blau, blei, bloi, bly
tra, tre, tri, tro, tru, trai, trau, trei, troi, try
ckla, ckle, ckli, cklo, cklu, cklai, cklau, cklei, ckloi, ckly
etc.

What is not clear is whether consonant+glide or
permissible-initial-cluster+glide can be considered
a valid initial cluster or not. Many of these have
certainly been used, even though their status has not
been clear:

mia, mie, mii, mio, miu, miai, miau, miei, mioi, miy
xua, xue, xui, xuo, xuu, xuai, xuau, xuei, xuoi, xuy
tcia, tcie, tcii, tcio, tciu, tciai, tciau, tciei, tcioi, tciy
jgrua, jgrue, jgrui, jgruo, jgruu, jgruai, jgruau, jgruei, jgruoi, jgruy
etc.

(apostrophe + glide is almost certainly not allowed:
'ia, 'ie, 'ii, etc. so I'm not including them here.)

In addition to all the syllables mentioned above
(17 + 2 + 2 + 64 + 34 + 128)*10 = 2470
we have for each of them 17 more possibilities by adding
a final coda consonant. That gives 2470*18 = 44460 syllables
with a vocalic nucleus. To that we can add the 16*4 = 64
consonantal syllables for a grand total of 44524 possible
syllables.

29160 of them are still under discussion.
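
The tally above can be re-checked mechanically. What follows is a minimal Python sketch, not part of the original post. The category labels are one reading of the six terms in the sum (in particular, 34 = 17 consonants x 2 glides and 128 = 64 clusters x 2 glides are assumptions), and the "under discussion" figure is reproduced on the assumption that it covers exactly the consonant+glide and cluster+glide onsets.

```python
# Re-check of the syllable tally. Labels are assumed readings of the
# six terms in (17 + 2 + 2 + 64 + 34 + 128); they are not from the post.
NUCLEI = 10  # a, e, i, o, u, ai, au, ei, oi, y

onsets = {
    "single consonant": 17,
    "special consonant ({.} and {'})": 2,
    "glide (i, u)": 2,
    "permissible initial cluster": 64,
    "consonant + glide": 34,   # assumed: 17 * 2
    "cluster + glide": 128,    # assumed: 64 * 2
}

open_syllables = sum(onsets.values()) * NUCLEI

# Each open syllable either stays open or takes one of 17 coda
# consonants, hence 18 variants per open syllable.
vocalic = open_syllables * 18

consonantal = 16 * 4
total = vocalic + consonantal

# Assumed reading of "still under discussion": the glide-cluster onsets.
disputed = (onsets["consonant + glide"] + onsets["cluster + glide"]) * NUCLEI * 18

print(open_syllables, vocalic, total, disputed)
```

Under these assumptions the script reproduces the four figures in the post: 2470, 44460, 44524 and 29160.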

mu'o mi'e xorxes









Jorge Llambías scripsit:

most of excellent analysis of syllables snipped

> I don't think it is necessary to require a pause in front of them
> when they appear at the beginning of a word, they can be like any
> other consonant.

I'm not happy with this change, primarily out of conservatism. It will
work in the morphology algorithm, but it changes the validity conditions
for fu'ivla because words that would otherwise not fall apart now will
do so.

> What is not clear is whether consonant+glide or
> permissible-initial-cluster+glide can be considered
> a valid initial cluster or not. Many of these have
> certainly been used, even though their status has not
> been clear:
>
> mia, mie, mii, mio, miu, miai, miau, miei, mioi, miy
> xua, xue, xui, xuo, xuu, xuai, xuau, xuei, xuoi, xuy

I'm okay with these as sequences but not as syllables; that is, if the
consonant can be interpreted as the coda of the previous syllable.

> tcia, tcie, tcii, tcio, tciu, tciai, tciau, tciei, tcioi, tciy

I don't think this is a particularly good example, because it's an
affricate. I would be OK with a rule that allowed iV and uV (but not
ii and uu) after tc, ts, dj, dz but not any other cluster.
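
As a rough illustration only, the restriction suggested here can be written as a small Python predicate. The function name and the three-argument shape are hypothetical, and the check covers cluster-plus-glide onsets only; single consonants followed by a glide are a separate question in this post.

```python
# Sketch of the proposed restriction: a glide (i or u) plus vowel is
# accepted after a cluster only if the cluster is one of the four
# affricates, and the sequences ii and uu are excluded.
AFFRICATES = {"tc", "ts", "dj", "dz"}


def glide_ok_after_cluster(cluster: str, glide: str, vowel: str) -> bool:
    """cluster: the consonant cluster; glide: 'i' or 'u';
    vowel: the first vowel letter following the glide."""
    if glide not in ("i", "u"):
        return False
    if vowel == glide:  # rules out ii and uu
        return False
    return cluster in AFFRICATES


print(glide_ok_after_cluster("tc", "i", "o"))   # the {tcio} case
print(glide_ok_after_cluster("jgr", "u", "a"))  # the {jgrua} case
```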

> jgrua, jgrue, jgrui, jgruo, jgruu, jgruai, jgruau, jgruei, jgruoi, jgruy

I hope no one wants these!

> (apostrophe + glide is almost certainly not allowed:
> 'ia, 'ie, 'ii, etc. so I'm not including them here.)

I agree.

> In addition to all the syllables mentioned above
> (17 + 2 + 2 + 64 + 34 + 128)*10 = 2470
> we have for each of them 17 more possibilities by adding
> a final coda consonant. That gives 2470*18 = 44460 syllables
> with a vocalic nucleus. To that we can add the 16*4 = 64
> consonantal syllables for a grand total of 44524 possible
> syllables.

Note that under this scheme "rl." is no longer ambiguous: it's definitely
a non-syllabic r followed by a syllabic l. Are there still words with
two consecutive r/l/m/n where it's not clear which is the syllabic one?

--
John Cowan jcowan@reutershealth.com www.reutershealth.com www.ccil.org/~cowan
Heckler: "Go on, Al, tell 'em all you know. It won't take long."
Al Smith: "I'll tell 'em all we *both* know. It won't take any longer."



posts: 1912


> Jorge Llambías scripsit:
>
> most of excellent analysis of syllables snipped
>
> > I don't think it is necessary to require a pause in front of them
> > when they appear at the beginning of a word, they can be like any
> > other consonant.
>
> I'm not happy with this change, primarily out of conservatism. It will
> work in the morphology algorithm, but it changes the validity conditions
> for fu'ivla because words that would otherwise not fall apart now will
> do so.

But they are extremely weird words, such as {leuiski}, which will
become {le uiski} instead of being taken as a single word. And most
of those can't be claimed by conservatives because they violate
the 5-letter rule anyway. If we take into account the 5-letter
rule, only things beginning with .VIVCC, such as {.euiski}, would
be affected. Treatment of syllables after the first consonant
cluster doesn't change.

> > What is not clear is whether consonant+glide or
> > permissible-initial-cluster+glide can be considered
> > a valid initial cluster or not. Many of these have
> > certainly been used, even though their status has not
> > been clear:
> >
> > mia, mie, mii, mio, miu, miai, miau, miei, mioi, miy
> > xua, xue, xui, xuo, xuu, xuai, xuau, xuei, xuoi, xuy
>
> I'm okay with these as sequences but not as syllables; that is, if the
> consonant can be interpreted as the coda of the previous syllable.

So you wouldn't accept the (in use) name {sanxiyn.} for example.

> > tcia, tcie, tcii, tcio, tciu, tciai, tciau, tciei, tcioi, tciy
>
> I don't think this is a particularly good example, because it's an
> affricate. I would be OK with a rule that allowed iV and uV (but not
> ii and uu) after tc, ts, dj, dz but not any other cluster.
>
> > jgrua, jgrue, jgrui, jgruo, jgruu, jgruai, jgruau, jgruei, jgruoi, jgruy
>
> I hope no one wants these!

Well, you would presumably allow {aj,gr,ua}, if not {a,jgrua}.

> > (apostrophe + glide is almost certainly not allowed:
> > 'ia, 'ie, 'ii, etc. so I'm not including them here.)
>
> I agree.
>
> > In addition to all the syllables mentioned above
> > (17 + 2 + 2 + 64 + 34 + 128)*10 = 2470
> > we have for each of them 17 more possibilities by adding
> > a final coda consonant. That gives 2470*18 = 44460 syllables
> > with a vocalic nucleus. To that we can add the 16*4 = 64
> > consonantal syllables for a grand total of 44524 possible
> > syllables.
>
> Note that under this scheme "rl." is no longer ambiguous: it's definitely
> a non-syllabic r followed by a syllabic l. Are there still words with
> two consecutive r/l/m/n where it's not clear which is the syllabic one?

The rule I use is: for any string of consonants, take as long
a permissible initial from the right as possible. Then as many
consonantal syllables as you can from the right. Then, if there's
nothing left, that's ok; if there is one consonant left, it has
to be a coda; if more than one consonant is left, the cluster
is impermissible.

I'm not especially happy with that rule though. It means I
would not allow {apkmro} for example, even though in principle
it could be {ap,km,ro}.
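
The rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual PEG implementation: `split_cluster` and `DEMO_INITIALS` are hypothetical names, the set of permissible initials is a small demo set rather than the real list, and a "consonantal syllable" is taken to be a consonant followed by a different, syllabic l, m, n or r (which is what the 16*4 count suggests).

```python
# Greedy right-to-left split of a consonant string, as described above.
SYLLABIC = set("lmnr")

# Demo set only: every single consonant plus a few two-letter initials.
DEMO_INITIALS = set("bcdfgjklmnprstvxz") | {"tr", "mr", "bl"}


def split_cluster(consonants, initials=DEMO_INITIALS):
    """Return (coda, consonantal_syllables, onset), or None if the
    string is impermissible under the rule."""
    # 1. Take the longest permissible initial from the right.
    onset = ""
    for k in range(len(consonants), 0, -1):
        if consonants[-k:] in initials:
            onset = consonants[-k:]
            break
    rest = consonants[:len(consonants) - len(onset)]

    # 2. Take as many consonantal syllables as possible from the right.
    syllables = []
    while len(rest) >= 2 and rest[-1] in SYLLABIC and rest[-2] != rest[-1]:
        syllables.insert(0, rest[-2:])
        rest = rest[:-2]

    # 3. Nothing left is fine, one consonant is a coda, more is an error.
    if len(rest) > 1:
        return None
    return rest, syllables, onset


print(split_cluster("ntr"))   # coda n, onset tr
print(split_cluster("pkmr"))  # the {apkmro} case
```

With this demo set, "pkmr" is rejected because the greedy first step claims "mr" as the onset and leaves "pk" behind; choosing the shorter onset "r" instead would leave "pkm", which does split into coda "p" plus consonantal syllable "km", matching the {ap,km,ro} reading.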

mu'o mi'e xorxes







posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > --- Bob LeChevalier wrote:
> > > I pronounce that name with three syllables
> > > "nitci,on" and always have.
> >
> > I am really rusty on all this, but this remark
> > seems to imply that it is possible in some cases
> > to have two successive syllables with nothing
> > between their vowels. Is this merely a variant
> > (not quite official but a concession to human
> > frailty) for some diphthongs in names or is it 1)
> > a possibility (indeed the only possibility) for
> > nondiphthongs like /ea/ or /ao/.
>
> There isn't a clear official rule yet. If they are
> allowed then they would be pronounced as two syllables,
> yes.
>
> > Is it only in
> > names or might fuhivla develop such
> > forms?
>
> I don't see much reason to have different phonotactics for
> names, they are part of the same language after all. In any
> case, in CLL there are fu'ivla such as {kulnrkore,a} too.
>
> > CLL seems to say that /'/ has always to
> > go between two vowels in separate syllables
>
> In some parts of CLL (chapter 3 mainly), "all words" refers
> to cmavo, gismu and lujvo only.
>
> > but
> > then turns around and gives at least one example
> > of not: /meiin/ which then gets two different
> > possible (phonological) interpretations /mei,in/
> > (default) and /me,iin/ which requires a comma.
> > Is a puzzlement which, I assume, has been
> > resolved somewhere but I can't find it -- except
> > to see several cases in discussions which seem to
> > focus on defaults but not on the propriety of the
> > move to begin with. Help!
>
> Not resolved yet.
>
I have an audio-visual memory (somewhat more
reliable than a merely factual one) of an early
Logfest (Tommy and Athelstan present) discussing
vowel-vowel contact across syllables. Keeping
them out seemed to be winning because 1) same
vowel pairs easily flowed into single vowels,
first "a little bit longer" and then simply the
single vowels, and 2) dissimilar pairs easily
developed (in performance or perception) either
a) a glide, b) a voiceless transition that was
perceived (and probably pronounced) as the
appropriate /h/ or c) (to avoid the first two) a
glottal stop. Since such things never occur in
rafsi, they do not occur in gismu or lujvo,
meaning they are needed only in cmene and
fuhivla, the most easily molded to Lojban
standards. Apparently the decision ultimately
went the other way and both the glide and the
/h/ versions of comma, but not the glottal stop
version, are allowed in CLL, apparently in
contrast to ordinary occurrences of /'/ and
glides in diphthongs (which can apparently also
occur in direct contact with another vowel).
This can presumably all be sorted out in the
written language so that a slicing algorithm will
work, but it seems to present some problems for a
lexer at a later point: what exactly are spoken
/eia/, /eha/, /ea/ and /eiia/?



posts: 14214

On Sat, Feb 26, 2005 at 07:03:02PM -0500, Bob LeChevalier wrote:
> Robin Lee Powell wrote:
> >>Finding anything on the tiki when I am responding to email in
> >>realtime isn't practical.
> >
> >It seems to be practical for everyone else. Is there anything I
> >can help you with that might make it better for you?
>
> I doubt it. My only wish list has been that when Jorge (or
> another shepherd) changes his proposal for the umpteenth time, the
> message would include a diff listing in addition to the new page
> in the message text.

Yeah, that's been on my to-do list for Some Time.

> I get "permission denied you cannot view this page listing" when I
> click on the diff link

Really? It shouldn't be doing that.

> (it might be wanting me to login, but since I'm clicking from
> email and not planning to post, I don't have any reason to login).

I'm sorry; I think I had seen that issue but hadn't thought of it
from a usability perspective. It's fixed now.

> >>In the far future, it is unlikely that anyone will even remember
> >>that this discussion took place.
> >
> >It's possible, however you may want to note the comments at the
> >top of pages like
> >http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+Quotations
> >which are designed to handle exactly this sort of problem.
>
> That's nice, and if I ever learn to use the tiki again, I'll try
> to figure out how I'm supposed to get one of these comments in - I
> thought that voted-on sections are locked and placed in an
> archive,

They are; that's not the archived version.

> and even when not locked can only be edited by the shepherd or by
> you.

No, we've never enforced that, for the simple reason that it's
impossible to hide the fact that you've edited something.

> I thought that as a non-shepherd, I'm pretty much limited to
> voting and to the discussions that I never can keep up with. I
> see nothing in the procedures that tells me how those comments get
> added (and I just rechecked).

That's because it's entirely ad hoc. If it's important to you, I'll
put something in the procedures.

A lot of the day-to-day procedural stuff has been me doing things
and no-one objecting. When I'm doing something outside of the
procedures, I try to say so.

> And, unfortunately, by enabling me to read and respond by email, I
> now almost never actually look at a wiki site directly, and will
> thus have to relearn the interface

I'm fine with you posting requests for changes here, a la "Hey,
jatna, there seems to be a problem with la'o, to wit blah blah blah.
Can you stick a note in the section?".

> >>If you reread the first line above, Jorge said we should ask to
> >>reopen la'o, implying that he also thought it was possible to do
> >>so. I see no procedure prior to that "far future" event for
> >>reopening anything.
> >
> >Because you haven't read the procedures lately, let me direct you
> >to the relevant comment in:
> >
> >http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Procedure
> >
> >which is:
> >
> > Please note that a particular section can be opened more than
> > once. In particular, a future checkpoint can re-open a
> > section if a problem with the previously approved proposal is
> > discovered.
>
> I get "Sorry, "BPFK Procedure" has not been created". But I was
> smart enough to look from there and find it should have been "BPFK
> Procedures".

Excellent. Go you. Sorry about the typo.

> So having looked and confirmed that the above is all that that
> page says on the matter, my stupid question of the moment is "what
> is the procedure for deciding that a previously approved proposal
> has a problem which warrants reopening the page?"

Negative consensus plus one. :-)

If more than one person thinks there's a problem, as far as I'm
concerned, there's a problem.

Again, this has largely been a matter of me doing things over
no-one's objections. If you would prefer to see a procedure written
up, I can do that.

> >>In addition, I'm not sure that there is even a byfy "checkpoint"
> >>section on the chapter of the morphology that discusses the
> >>progression of Type I-IV borrowings, which is what the issue
> >>really is.
> >
> >BPFK checkpoints are, in general, against sets of cmavo. There
> >are special cases for morphology, gismu in general, and
> >miscellaneous issues.
> >
> >>Originally none of the morphology was part of the checkpoint
> >>system but was a separate subcommittee under Nora.
> >
> >Because she hadn't been able to work on that subcommittee for
> >something like three years, we moved that work into the
> >checkpoint system.
> >
> >I got little response to my emails to her about this issue, and
> >both of you ignored my mail of last week asking for a response to
> >that fact.
>
> Please identify the email more specifically; I don't recall seeing
> any specifically asking Nora or me anything.

Subject: WikiDiscuss Re: PEG Morphology Algorithm
From: Robin Lee Powell <rlpowell@digitalkingdom.org>
To: wikidiscuss-list@lojban.org
Date: Wed, 16 Feb 2005 15:34:05 -0800

On Wed, Feb 16, 2005 at 09:33:22AM -0500, Bob LeChevalier wrote:
snip
> I would rather that Robin pull the topic off the
> time-limited-voting floor, and you and Nora work together to
> do the job right (in a way that satisfies her).

Again, past behaviour gives me no reason to believe Nora will do
the work. If she mails me privately and requests this, with
some sort of firm commitment about the time she can put in, I'd
be *happy* to do this. I'm sure xorxes would as well. (And if
he's not, I'll *make* him happy about it. :-)

-Robin

> Nora normally reads email once a week, and has been skipping all
> byfy traffic because you told her that she doesn't need to read
> the discussions - only the proposals.

I assume this must be hyperbole, because she's sent 4 mails to it in
the last week, and I can't believe she would be rude enough to
ignore a forum that she had posted to.

> I knew that you had been working on the PEG Morphology algorithm
> last fall,

That was the grammar, actually.

> but my understanding was that this was just another project and
> not part of the byfy - we ignored it just like we've ignored
> valfendi - it was not official.

Perhaps you didn't see me, jcowan, and xorxes all saying (on the
main list) that we intended the PEG grammar to replace the YACC
grammar?

> My first awareness that the PEG morphology was to be considered
> for byfy was your announcement on 2/11 giving us 2 weeks before a
> vote.

I can understand how that happened if you're not reading BPFK mail
in general. Since neither of you have participated in the rest of
the BPFK, for the most part, I didn't realize that you might want
special notification of the morphology issue. Sorry about that.

> Magic Words meant). My email subject directory has
> "bpfk-announce Partial closure of Magic Words; Mor" so I did not
> even realize it mentioned morphology.

Then you need to get a better mail program, as I've said to you many
times.

> (I actually do this with all tiki traffic, since I haven't got the
> skill with filtering to separate tiki stuff from byfy stuff. Most
> of my email reading is still tossing out hundreds of spam
> messages, and then skimming and binning whatever is left - you
> should know by now that I do respond to your emails directly to
> me, and that if I don't respond, it is probably because I somehow
> did not see it.)

All true. I'm pretty sure that with a bit of work on my part, we
could get your spam down to near zero, but that would require you
pulling your mail from my server, which you've consistently refused
to do in the past. Oh well.

> >>Then out of the blue
> >
> >This has certainly not been out of the blue. We've been
> >discussing this extensively for months on wikidiscuss.
>
> So what? When did something being discussed on the general wiki
> become a byfy proposal?

When the first edit or discuss post occurred to

http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+Formal+Morphology

(which appears to be wrong; sometimes Tiki loses history)

or the creation of

http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+PEG+Morphology+Algorithm

which appears to have been 17 Dec 2004.

> I've now searched all my Lojban related email for mention of
> Nora's name or the word "morphology". The prior mention of Nora
> by you seems to have been compliments of her Noralujv work on
> 1/6/05.

Then you're missing something.

> >In the past three years, and most especially in the last six
> >months or so, work on this committee has proceeded this far
> >without input,
>
> There has been no "committee" that Nora knew of

I'm sorry; BPFK commission, my bad.

http://groups.yahoo.com/group/lojban/message/19146

> - just her.

Then she was mistaken.

> There was no interest except from Pierre and he started off
> talking about doing something totally orthogonal to what Nora
> wanted to do, based on valfendi, so that never went beyond the
> first message a couple years ago.

Then someone should have informed the BPFK jatna.

> Once byfy work started in earnest under your jatnaship, her very
> limited Lojban time went towards vainly trying to keep up with
> some of the other byfy work, so of course she wasn't going to be
> doing anything on some other topic.

If you say so. All I know is that the last mail I sent to her on
this:

Subject: Morphology?
To: phma@phma.hn.org, Nora LeChevalier <noras>
Date: Thu, 4 Nov 2004 14:33:16 -0800

Just out of curiosity, how's the morphology thing going?

-Robin

received no response and that, in general, I have not been able to
rely on her replying to any mail I have sent.

> >>you gave us 2 weeks notice for a vote
> >
> >As I've said several times in wikidiscuss, I thought we were done
> >because the people who were actually doing the work had mostly
> >stopped arguing.
>
> It was never moved from the general wiki discussion into something
> that the byfy should take notice of until 2/8 or 2/11.

Was there something confusing about "BPFK Section" in the subject
line?

> >>on the PEG morphology algorithm, which raises issues about the
> >>morphology as a whole,
> >
> >The intention is that the PEG algorithm *is* the morphology. I
> >thought this was quite clear.
>
> The current baseline description of the phonology and morphology
> is Chapters 2 and 3 of CLL.

And we've found, what, two or three major internal contradictions so
far?

> I would expect that a replacement baseline for what is in CLL
> would be written in the sort of language that CLL is written in,
> not in something EBNFish.

Then you and I have an incontrovertible difference of opinion.

If the results of the BPFK do not include a formalized morphology,
I will go find something else to do with my time.

I've had at least three mails from people saying that they would be
more involved with Lojban if the morphology was fixed, by the way,
and I'm reasonably certain that they were all looking for something
formalized.

> >We need a formalized morphology, in the same way we have a
> >formalized grammar.
>
> I'll accept your ruling to that effect, but it was never voted
> thusly by byfy.

You can call a vote at any time.

> There is a formalized grammar, but there is also 600-odd pages of
> CLL explaining that grammar, and the explanation is at least as
> important a part of the grammar baseline as the YACC
> specification.

Grammars have semantic meaning. Morphologies don't. Or at least
not in the same way.

> I've never seen a similar tool to a YACC program for verifying the
> unambiguity of an EBNF to some standard (it needn't be LALR-1).

They exist.

> Similarly, there have been non-YACC parsers written for Lojban, the one
> I know best being Jeff Prothero's recursive descent parser. From what
> little I understand technically about parsers, proving that any given
> parser is the equivalent of the YACC grammar and associated parser is
> non-trivial.

Quite amazingly non-trivial, yes. Undecidable in general, in fact.

> (I have just now stopped writing and spent a half hour looking up
> PEG grammar on the web,

Good for you!

> so I know that it is claimed to be unambiguous because of
> prioritization rules; I don't know how this is proven, but I'll
> take their word for it.

There is, definitionally, only one way to read it; therefore it
can't be ambiguous.
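
To make that concrete, here is a toy Python sketch of PEG-style ordered choice. It is an illustration, not the actual morphology grammar, and `literal` and `choice` are hypothetical helper names. Because alternatives are tried strictly in the order written and the first success wins, any given input has at most one parse.

```python
# Toy PEG combinators: a parser takes (text, pos) and returns the new
# position on success or None on failure.
def literal(s):
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def choice(*alternatives):
    """Ordered choice (the PEG '/' operator): first success wins."""
    def parse(text, pos):
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end  # later alternatives are never consulted
        return None
    return parse


a_then_ai = choice(literal("a"), literal("ai"))
ai_then_a = choice(literal("ai"), literal("a"))

print(a_then_ai("ai", 0))  # consumes only "a"
print(ai_then_a("ai", 0))  # consumes "ai"
```

The two rules accept the same alternatives but give different, fully determined results, which is why the order of alternatives in a PEG is part of the definition rather than a source of ambiguity.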

> That means that what remains is to prove that whatever the PEG
> grammar produces happens to be the breakdown that a human being
> thinks it should be.)

Yes. I have (as I've posted on the main list) ~40 thousand lines of
test cases trying to demonstrate this.

> >>There is no checkpoint on the morphology rules themselves, only
> >>on the PEG algorithm (or am I mistaken).
> >
> >The whole point here is that the PEG algorithm *IS* the
> >morphology rules, in exactly the same way that the YACC algorithm
> >*is* the grammar rules.
>
> I think I've addressed this above. We cannot know if they are the
> rules (other than by fiat) without what JCB called 'verifying that
> the "machine grammar" matches the "human grammar"'. The human
> morpher has to be the standard UNTIL we approve a machine morpher
> that fits it. Thereafter, we can use the machine morpher as the
> standard with some trust.

Of course. That is exactly the process that we're trying to go
through, with the added complication that the human language
morphology is known to be broken.

> Formalizing those descriptions would be a good thing, I agree.

Yaay!

> But in fact because of sloppiness in how we wrote up stuff on
> fu'ivla and to a lesser extent on names and experimental cmavo,
> what is emerging in discussion is that what Jorge and Pierre did
> was NOT formalizing the existing CLL text, but creating a
> formalization which was based on the CLL, but which incorporates
> an unknown number of unspecified decisions resolving issues in
> that CLL text, and in some cases making changes to what is stated
> in the CLL text.

The CLL text is internally contradictory. Besides, voting on what
is and is not acceptable tweaking is what the rest of us are here
for. xorxes has proposed that each deviation from what the average
Lojban user would expect be a separate vote, and I agree that that's
how the morphology vote should be done.

> Now you will recall that I have all along wanted to vote on each
> change as a *change* to a known status quo

See above. Also note that the known status quo is internally
contradictory.

> I've accepted that you will continue this way. I have no desire
> to replace you, since something is getting done at least. But I'm
> not in the least happy with the lack of conservatism in the byfy
> decisions thus far, so I will continue to protest.

You've said you and Nora can't do the work, and you're going to
continue to protest. I call that "whining". If you wish to
continue to whine, that's fine, but please do not expect continued
attention to it from me. In fact, I'd go so far as to say that you
should not expect me to read mails from you on BPFK procedural
issues; I don't have the energy left. If you think something does
need my specific attention, change the subject and send it
privately.

If, on the other hand, you wish to do something other than whine,
you're going to have to find someone to do the conservative work
that you feel needs to be done.

I want you to think very, very hard about something, though:

At a minimum, Broca, xorxes, treed, Jay and I all speak Lojban
much, much better than you do. jcowan probably as well. I
apologize if that seems insulting, but if you need any form of
convincing just visit IRC some time. We use this language for
actual conversation basically every single day.

You need to sit down and seriously think about the fact that we all
seem to come to consensus very, very fast, with you and Nora as the
sole dissenters. If the people who *actually* *use* the language
all agree that something desperately needs fixing, perhaps you
should seriously consider the possibility that we're *right*.

Again, I apologize if the above seems insulting; it's not my
intention.

Furthermore, you wouldn't know it because you haven't been reading
the stuff, but jcowan, Broca and I have blocked xorxes tinkering
many times now.

> You are of course free to continue to overrule my protests.

And you are free to call a vote if you want something changed.

> But if my protests get you so steamed up that you cannot
> effectively function,

My anger is my problem. I choose to ignore your whining, as I said
above, but the *LAST* thing I want is for you to participate *less*.

> (I've toyed with the idea of going off and writing a few thousand
> words on Lojban as I speak it as a possible counterweight to
> xorxes zei bangu - probably my long procrastinated Arabian Nights
> translation - but that would require me to entirely tune out byfy,
> and I'm sure that everything I produce will end up garbage once
> byfy finishes changing the language beyond my recognition, which
> is my fear as to what will happen if I do anything other than what
> I am doing).

I can imagine very few things that would make me happier at this
stage of the game.

As xorxes said, you might also want to read some of la .alis.

http://www.lojban.org/texts/translations/alice/alice.html

or la nicte cadzu

http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/palm/la%20nicte%20cadzu/index.html

(which, by the way, is about to break 20K words; watch out xorxes!)
and you might be pleasantly surprised.

> Oh well, sorry for being excessively wordy again.

I care about your wordiness to exactly the extent that I fear that
it takes away from your time.

> And if this has gotten you too steamed up to reply, I suggest
> getting Cowan's input since he gets you less steamed and he seems
> to understand Lojbabese %^)

He was one of the four people I asked to edit the mail you replied
to. :-)

> And for all my protests, you still have my vote of confidence.

Thank you.

-Robin




posts: 1912


> I have an audio-visual memory (somewhat more
> reliable than a merely factual one) of an early
> Logfest (Tommy and Athelstan present) discussing
> vowel-vowel contact across syllables. Keeping
> them out seemed to be winning because 1) same
> vowel pairs easily flowed into single vowels but
> "a little bit longer" and the single vowels
> simply, 2) dissimilar pairs easily developed (in
> performance or perception) either a) a glide, b)
> a voiceless transition that was perceived (and
> probably pronounced) as the appropriate /h/ or c)
> (to avoid the first two) a glottal stop.

None of those things seem to happen in Spanish with
different vowel-vowel pairs. For example "fea" is /fea/
with no hint of an intervening i glide as some English
speakers would tend to introduce, nor any voiceless
transition or glottal stop. In fact, the opposite effect
occurs in some dialects, where intervocalic b/d/g tends
to disappear, so that "lado" ends up as /lao/.

So I wonder if this tendency of vowel pairs to develop
something between them is a universal tendency or just
a tendency for English and maybe some other languages.

Doubled vowels "ee" and "oo" also occur in Spanish
("leer", "coordinar"). In rapid speech they may tend
to be pronounced as a single vowel, but they can be
made distinct in careful pronunciation.

> Since
> such things never occur in rafsi, they do not
> occur in gismu or lujvo, meaning they are needed
> only in cmene and fuhivla, the most easily molded
> to Lojban standards. Apparently the decision
> ultimately went the other way and both the glide
> and the /h/ but not the glottal stop version of
> comma is allowed in CLL, apparently in contrast
> to ordinary occurrences of /'/ and glides in
> diphthongs (which can apparently also occur in
> direct contact with another vowel). This can
> presumably all be sorted out in the written
> language so that a slicing algorithm will work,
> but it seems to present some problems for a lexer
> at a later point: what exactly are spoken /eia/,
> /eha/, /ea/ and /eiia/?

Not sure what the question is.

{e'a}, {ea}, {eia} and {eiia} are all different. For me,
only the last two could be confused, but I know that
will vary a lot from person to person. Currently the PEG
should allow {e'a} and {eia} and disallow {ea} and
{eiia}, because it does not admit nucleus-nucleus (e-a)
nor diphthong-glide (ei-ia), but nucleus-glide (e-ia) is
accepted. Nothing is definite yet, though.
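
A minimal sketch of the vowel-contact rules just described, not the PEG itself. It assumes a syllable is an optional glide (i/u) plus a nucleus, that the diphthongs are ai, au, ei and oi, and that an apostrophe always licenses the next syllable. The names `vowel_run_ok`, `_nuclei` and `_parse` are made up for illustration.

```python
DIPHTHONGS = ("ai", "au", "ei", "oi")   # assumed diphthong inventory
VOWELS = "aeiou"
GLIDES = "iu"


def _nuclei(s, i):
    """Yield (end_index, is_diphthong) for every nucleus that can start at i."""
    if s[i:i + 2] in DIPHTHONGS:
        yield i + 2, True
    if i < len(s) and s[i] in VOWELS:
        yield i + 1, False


def _parse(s, i, prev_diphthong, first):
    """Backtracking search for a syllable split that breaks none of the rules."""
    if i == len(s):
        return not first            # an empty string is not a vowel run
    after_h = False
    if not first and s[i] == "'":
        after_h = True              # {e'a}: the apostrophe separates the nuclei
        i += 1
    # Option A: the syllable starts with a glide (nucleus-glide contact).
    # Not allowed directly after a diphthong (diphthong-glide contact).
    if i + 1 < len(s) and s[i] in GLIDES and s[i + 1] in VOWELS:
        if first or after_h or not prev_diphthong:
            for end, diph in _nuclei(s, i + 1):
                if _parse(s, end, diph, False):
                    return True
    # Option B: bare nucleus. Only allowed word-initially or after an
    # apostrophe, otherwise it would be nucleus-nucleus contact.
    if first or after_h:
        for end, diph in _nuclei(s, i):
            if _parse(s, end, diph, False):
                return True
    return False


def vowel_run_ok(s):
    """True if the vowel string s has at least one admissible syllable split."""
    return _parse(s, 0, False, True)
```

Under these assumptions `vowel_run_ok` accepts "e'a" and "eia" and rejects "ea" and "eiia", matching the four cases above.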

mu'o mi'e xorxes







On Tue, 1 Mar 2005 13:49:22 -0800, Robin Lee Powell
<rlpowell@digitalkingdom.org> wrote:
> On Sat, Feb 26, 2005 at 07:03:02PM -0500, Bob LeChevalier wrote:
> > I get "permission denied you cannot view this page listing" when I
> > click on the diff link
>
> Really? It shouldn't be doing that.

I get this, too. Then I log in and tick the box marked 'remember me'.

Next time I click a diff link (on a different day, after having closed
my browser/restarted my computer), I get "permission denied" again.

mu'o mi'e .filip.
--

Philip Newton <philip.newton@gmail.com>




posts: 162

Robin Lee Powell wrote:
> Subject: WikiDiscuss Re: PEG Morphology Algorithm
> From: Robin Lee Powell <rlpowell@digitalkingdom.org>
> To: wikidiscuss-list@lojban.org

So it wasn't specifically to Nora or me as I think you claimed. I just
looked and indeed I had not read the message. I read a bunch of
messages on the wiki threads from the 14th to the 16th, and then didn't
read again until around the 18th and then only spottily.

Stuff with the "WikiDiscuss" marker that I don't read immediately gets
put into my "read someday" byfy file which has thousands of
messages in it since last March, and "someday" rarely comes around.
I've tried in fits and starts to keep up with posts, but even in the
morphology thread it exceeds what time I can spend.

Even worse has been when I separated stuff mentioning the byfy in the
subject line from the rest. As noted below, using that, the bulk of the
PEG Morphology discussion has been non-byfy. Once I realized the line
between byfy and non-byfy on the wiki is so poor, I remerged the files,
but that gives some 3600 unread over the last year. Rather intimidating.

> Date: Wed, 16 Feb 2005 15:34:05 -0800
>
> On Wed, Feb 16, 2005 at 09:33:22AM -0500, Bob LeChevalier wrote:
> snip
> > I would rather that Robin pull the topic off the
> > time-limited-voting floor, and you and Nora work together to
> > do the job right (in a way that satisfies her).
>
> Again, past behaviour gives me no reason to believe Nora will do
> the work. If she mails me privately and requests this, with
> some sort of firm commitment about the time she can put in, I'd
> be *happy* to do this. I'm sure xorxes would as well. (And if
> he's not, I'll *make* him happy about it. :-)
>
> -Robin
>
>
>>Nora normally reads email once a week, and has been skipping all
>>byfy traffic because you told her that she doesn't need to read
>>the discussions - only the proposals.
>
> I assume this must be hyperbole, because she's sent 4 mails to it in
> the last week, and I can't believe she would be rude enough to
> ignore a forum that she had posted to.

Alas, you'll need to believe it, because she doesn't have nearly enough
time to read the traffic even on that one thread more often. If not
having time is rude, then I guess she is rude.

I just checked her Eudora logs since the 16th. She spent 2 hours
reading mail on the 15th, 3 hours reading mail on the 24th. There were
sessions less than 10 minutes (long enough time to download and maybe
skim a message or two, but she usually does it in background while I try
to explain what I've seen) on the 17th, 21st, 3 times on the 25th, the
26th, and 28th.

>>I knew that you had been working on the PEG Morphology algorithm
>>last fall,
>
> That was the grammar, actually.

I thought for most of the last few months that you had decided to add a
morpher as a front end to your parser which is what I think jbofi'e
does. I REALLY have been tuned out!

>>but my understanding was that this was just another project and
>>not part of the byfy - we ignored it just like we've ignored
>>valfendi - it was not official.
>
> Perhaps you didn't see me, jcowan, and xorxes all saying (on the
> main list) that we intended the PEG grammar to replace the YACC
> grammar?

I just did a search on "PEG" or "morphology" and found only 2 messages
by you mentioning PEG grammar last summer, and none by Cowan. I also
skimmed list messages by Cowan, and don't see him mentioning parsers
since last April's discussion of parser bugs. I do believe that I did
see some posting somewhere where you indicated that intent, and I know
that Cowan considers the existing parser to be broken, and approved of
your making a new parser. But I did not even know until I read this
message, and based on my "PEG grammar" WWW lookup, that your PEG parser
was using a different grammar from the official one, and that the issues
that arose weren't coming from your recoding the lexer.

>>My first awareness that the PEG morphology was to be considered
>>for byfy was your announcement on 2/11 giving us 2 weeks before a
>>vote.
>
> I can understand how that happened if you're not reading BPFK mail
> in general. Since neither of you have participated in the rest of
> the BPFK, for the most part, I didn't realize that you might want
> special notification of the morphology issue. Sorry about that.

I try to read your announcements, though I haven't been consistent. And
reading only once a week most of the time, I may miss some amidst the
other byfy traffic that I am skipping.

>>Magic Words meant). My email subject directory has
>>"bpfk-announce Partial closure of Magic Words; Mor" so I did not
>>even realize it mentioned morphology.
>
> Then you need to get a better mail program, as I've said to you many
> times.

I did. Subject lines in my summary window will still have a size limit.
I use pretty large print to avoid eyestrain. 1/3 of the column is
taken up by WikiDiscuss leaving

>>(I actually do this with all tiki traffic, since I haven't got the
>>skill with filtering to separate tiki stuff from byfy stuff. Most
>>of my email reading is still tossing out hundreds of spam
>>messages, and then skimming and binning whatever is left - you
>>should know by now that I do respond to your emails directly to
>>me, and that if I don't respond, it is probably because I somehow
>>did not see it.)
>
> All true. I'm pretty sure that with a bit of work on my part, we
> could get your spam down to near zero, but that would require you
> pulling your mail from my server, which you've consistently refused
> to do in the past.

I didn't "refuse" to do anything. There was a reason why I switched off
(can't remember now) and I wasn't aware of any particular desire on your
part to have me switch back.

>>So what? When did something being discussed on the general wiki
>>become a byfy proposal?
>
> When the first edit or discuss post occurred to
>
> http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+Formal+Morphology
>
> (which appears to be wrong; sometimes Tiki loses history)
>
> or the creation of
>
> http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+PEG+Morphology+Algorithm
>
> which appears to have been 17 Dec 2004.

I see an email with the Formal morphology page, but there was no
followup after it was created. It has as its complete text "((PEG
Morphology Algorithm))" in that page, so if I had read it, I might have
been forewarned. But ALL of the posts on "PEG Morphology Algorithm" from
then until 2/14 seem to not have any "BYFY" indication in the subject
line like all the rest, so pardon me for assuming that it was a non-BYFY
discussion. Only the changes since 2/14 seem to have been posted as
BYFY changes - the discussion thread is STILL a non-byfy thread, so far
as I can tell.

>>I've now searched all my Lojban related email for mention of
>>Nora's name or the word "morphology". The prior mention of Nora
>>by you seems to have been compliments of her Noralujv work on
>>1/6/05.

> Then you're missing something.

I wasn't searching the byfy traffic file, but the mail sent to me
personally, which is what you said (me and Nora). Except for
announcements, byfy discussion doesn't usually get noticed, since you
told us we didn't have to read it. And note that if you merely post to
the byfy discussion and cc: us it will be filtered based on the subject
line as a byfy discussion list message. We won't see it unless we are
already following the thread for some other reason (and then the odds
are mixed).

>>>In the past three years, and most especially in the last six
>>>months or so, work on this committee has proceeded this far
>>>without input,
>>
>>There has been no "committee" that Nora knew of
>
> I'm sorry; BPFK commission, my bad.
>
> http://groups.yahoo.com/group/lojban/message/19146

"Nora will not begin work on the
commission until she has discharged her duties as secretary of the
LLG" - something that wasn't done until we had completed QB turnover
about a year ago.

>>There was no interest except from Pierre and he started off
>>talking about doing something totally orthogonal to what Nora
>>wanted to do, based on valfendi, so that never went beyond the
>>first message a couple years ago.
>
> Then someone should have informed the BPFK jatna.

Perhaps. But since she has felt her priority since you took over and
started pushing things along at a pace we can't keep up with, to try to
keep up with all the other sections, she wasn't likely to start
something new.

>>Once byfy work started in earnest under your jatnaship, her very
>>limited Lojban time went towards vainly trying to keep up with
>>some of the other byfy work, so of course she wasn't going to be
>>doing anything on some other topic.
>
>
> If you say so. All I know is that the last mail I sent to her on
> this:
>
> Subject: Morphology?
> To: phma@phma.hn.org, Nora LeChevalier <noras>
> Date: Thu, 4 Nov 2004 14:33:16 -0800
>
> Just out of curiosity, how's the morphology thing going?
>
> -Robin
>
> received no response and that, in general, I have not been able to
> rely on her replying to any mail I have sent.

If you want an answer from her, and don't cc me, it probably won't get
seen. Back in November, I doubt that she read any mail at all.

>>>>you gave us 2 weeks notice for a vote
>>>
>>>As I've said several times in wikidiscuss, I thought we were done
>>>because the people who were actually doing the work had mostly
>>>stopped arguing.
>>
>>It was never moved from the general wiki discussion into something
>>that the byfy should take notice of until 2/8 or 2/11.
>
> Was there something confusing about "BPFK Section" in the subject
> line?

Look at the PEG Morphology Algorithm discussion thread, at least as it
appears in email - no mention of BPFK.

>>>>on the PEG morphology algorithm, which raises issues about the
>>>>morphology as a whole,
>>>
>>>The intention is that the PEG algorithm *is* the morphology. I
>>>thought this was quite clear.
>>
>>The current baseline description of the phonology and morphology
>>is Chapters 2 and 3 of CLL.
>
> And we've found, what, two or three major internal contradictions so
> far?

I dunno. I don't recall much response to Nora's posting of the relevant
sections. Since the first step in a byfy effort should be to clarify
the issues that warrant any change to the baseline, I would have wished
there to be a byfy page that identified these. But I seldom get what I
wish for, and if there was such a page, I never saw it amidst the
thousands of other messages that have flown by.

>>I would expect that a replacement baseline for what is in CLL
>>would be written in the sort of language that CLL is written in,
>>not in something EBNFish.
>
> Then you and I have an incontrovertible difference of opinion.

CLL is the baseline. A change to the baseline is a change to CLL.

> If the results of the BPFK does not include a formalized morphology,
> I will go find something else to do with my time.

I have no opposition to a formalized morphology. The question is
whether that morphology should involve anything that changes CLL except
to clarify or to resolve contradictions. A formalized morphology would
certainly clarify things in a way.

>>I'll accept your ruling to that effect, but it was never voted
>>thusly by byfy.
>
> You can call a vote at any time.

I accept your ruling, I said.

>>Similarly, there have been non-YACC parsers written for Lojban, the one
>>I know best being Jeff Prothero's recursive descent parser. From what
>>little I understand technically about parsers, proving that any given
>>parser is the equivalent of the YACC grammar and associated parser is
>>non-trivial.
>
> Quite amazingly non-trivial, yes. Undecidable in general, in fact.

I probably was told that at some point %^)


>>That means that what remains is to prove that whatever the PEG
>>grammar produces happens to be the breakdown that a human being
>>thinks it should be.)
>
> Yes. I have (as I've posted on the main list) ~40 thousand lines of
> test cases trying to demonstrate this.

The main list? I assume you mean "Lojban List". I recall your mentioning
something on this wikidiscuss list. As I noted above, I don't see ANY
posts about morphology on Lojban List by you. But my searching could be
flaky.

>>I think I've addressed this above. We cannot know if they are the
>>rules (other than by fiat) without what JCB called 'verifying that
>>the "machine grammar" matches the "human grammar"'. The human
>>morpher has to be the standard UNTIL we approve a machine morpher
>>that fits it. Thereafter, we can use the machine morpher as the
>>standard with some trust.
>
> Of course. That is exactly the process that we're trying to go
> through, with the added complication that the human language
> morphology is known to be broken.

Other than the la/doi in names issue, I don't recall seeing any issues
raised until recently. Certainly not in the main lojban list.

>>Now you will recall that I have all along wanted to vote on each
>>change as a *change* to a known status quo
>
> See above. Also note that the known status quo is internally
> contradictory.

I understand. Resolving a contradiction is a change I will usually
support (assuming the resolution makes sense). Likewise, clarifications
that make sense I have no problems supporting.

>>I've accepted that you will continue this way. I have no desire
>>to replace you, since something is getting done at least. But I'm
>>not in the least happy with the lack of conservatism in the byfy
>>decisions thus far, so I will continue to protest.
>
> You've said you and Nora can't do the work, and you're going to
> continue to protest. I call that "whining".

Probably a good description. Can't see anything better that I can do at
this point besides register a futile protest, or abandon any pretense of
participation.

> If, on the other hand, you wish to do something other than whine,
> you're going to have to find someone to do the conservative work
> that you feel needs to be done.
>
> I want you to think very, very hard about something, though:
>
> At a minimum, Broca, xorxes, treed, Jay and I all speak Lojban
> much, much better than you do. jcowan probably as well. I
> apologize if that seems insulting, but if you need any form of
> convincing just visit IRC some time. We use this language for
> actual conversation basically every single day.
>
> You need to sit down and seriously think about the fact that we all
> seem to come to consensus very, very fast, with you and Nora as the
> sole dissenters. If the people who *actually* *use* the language
> all agree that something desperately needs fixing, perhaps you
> should seriously consider the possibility that we're *right*.

I'm sure it is possible %^)


> Again, I apologize if the above seems insulting; it's not my
> intention.

If I had that thin a skin, I never would have survived so many years as
LLG President. I am sure it is possible for you to make me feel
insulted, but certainly not for failing to follow through on
commitments, even if it is from lack of time.

>>But if my protests get you so steamed up that you cannot
>>effectively function,
>
> My anger is my problem. I choose to ignore your whining, as I said
> above, but the *LAST* thing I want is for you to participate *less*.

I have the choice between byfy and any other lojban activity. byfy
absorbs more time than I have and far more than Nora has, so little else
gets done here Lojbanically and that will likely continue while we
are trying to participate in byfy.

>>(I've toyed with the idea of going off and writing a few thousand
>>words on Lojban as I speak it as a possible counterweight to
>>xorxes zei bangu - probably my long procrastinated Arabian Nights
>>translation - but that would require me to entirely tune out byfy,
>>and I'm sure that everything I produce will end up garbage once
>>byfy finishes changing the language beyond my recognition, which
>>is my fear as to what will happen if I do anything other than what
>>I am doing).
>
> I can imagine very few things that would make me happier at this
> stage of the game.

But if I do that, I will not be participating in byfy even to the
limited extent I do now. I simply can't handle more than a few Lojban
messages a day, and I get dozens and no way to prioritize them other
than perhaps to read Cowan's and ignore the rest.

But I will consider it. It would likely take my ceasing to do any byfy
work at all, but perhaps if I am only seen as obstructionist that would
be better than the status quo.

> As xorxes said, you might also want to read some of la .alis.
>
> http://www.lojban.org/texts/translations/alice/alice.html
>
> or la nicte cadzu
>
> http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/palm/la%20nicte%20cadzu/index.html
>
> (which, by the way, is about to break 20K words; watch out xorxes!)
> and you might be pleasantly surprised.

If I am going to write/translate OR participate in byfy, I likely won't
have time to read *anything*. I have never been able to think in
Lojban, so reading is for me first of all a word for word translation,
and lookup if I don't know the word (which is seldom for gismu
especially given context, and uncommon for cmavo when I am active, but
I've lost my quasi-mastery of rafsi). Translating will likely be even
harder, especially at first, since my standards for translation are very
high.

I'll probably never be a fast reader of Lojban (or probably any other
language except English - I keep hoping I'll eventually be able to read
Russian that isn't textbook lessons with understanding, but that's
slightly behind my Lojban skill).

>>Oh well, sorry for being excessively wordy again.
>
> I care about your wordiness to exactly the extent that I fear that
> it takes away from your time.

Editing takes more time. I type as I think and vice versa. If that is
wordy, then it reflects my thinking. If I don't type, it is probably
because I am not thinking about the subject.

lojbab




posts: 14214

On Wed, Mar 02, 2005 at 06:54:02AM +0100, Philip Newton wrote:
> On Tue, 1 Mar 2005 13:49:22 -0800, Robin Lee Powell
> <rlpowell@digitalkingdom.org> wrote:
> > On Sat, Feb 26, 2005 at 07:03:02PM -0500, Bob LeChevalier wrote:
> > > I get "permission denied you cannot view this page listing"
> > > when I click on the diff link
> >
> > Really? It shouldn't be doing that.
>
> I get this, too. Then I log in and tick the box marked 'remember
> me'.
>
> Next time I click a diff link (on a different day, after having
> closed my browser/restarted my computer), I get "permission
> denied" again.

This should be fixed now. Let me know if it's not.

Also, I've put diffs into the mails themselves.

-Robin



On Tuesday 01 March 2005 09:36, John Cowan wrote:
> > jgrua, jgrue, jgrui, jgruo, jgruu, jgruai, jgruau, jgruei, jgruoi, jgruy
>
> I hope no one wants these!

We have {jglandi} already. {jgruau} is of course two syllables.

phma
--
Ils pensent que j'ai un cancer du thé russe...
-Les Perles de la médecine



No matter what syllables we disallow in fu'ivla, we are going to have pairs of
words which differ by very little, and slight phonetic changes resulting in
different word boundaries. For instance:
{noltroni'u}, {noltruni'u}. Both words are found in Alice. The first means the
Duchess and the second the Queen. I thought one was a typo for a while.
{fasxolarkto}, {faskolarkto}. The second breaks up.
{lemristugreblispamlo}, {lemristugreblispanlo}. The first breaks up. Both are
lujvo, with completely different meanings.
The only way we can avoid such minimal pairs is to add check digits, which are
alien to human speech processing. So I think we should go back to xorxes'
proposal allowing syllables such as {ctruonk}, which seems to fit existing
fu'ivla fairly well, and not worry about the minimal pairs.

phma
--
.i toljundi do .ibabo mi'afra tu'a do
.ibabo damba do .ibabo do jinga
.icu'u la ma'atman.



posts: 149

Pierre Abbat scripsit:

> We have {jglandi} already.

That is CCCVC-CV, which is unusual but no problem.

> {jgruau} is of course two syllables.

Not according to the current proposal, by which it is the monstrously
heavyweight syllable CCCIVI. And I speak as an anglophone here;
English has even more awful syllables like CCCVCCC ("strengths").
(English has to be among the world's most coda-heavy languages.)

--
John Cowan cowan@ccil.org www.reutershealth.com www.ccil.org/~cowan
No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friends or of thine own were: any man's death diminishes me,
because I am involved in mankind, and therefore never send to know for
whom the bell tolls; it tolls for thee. --John Donne



posts: 1912


> No matter what syllables we disallow in fu'ivla, we are going to have
> pairs of words which differ by very little, and slight phonetic changes
> resulting in different word boundaries.

Yes.

> For instance:
> {noltroni'u}, {noltruni'u}. Both words are found in Alice. The first
> means the Duchess and the second the Queen. I thought one was a typo
> for a while.

We still need to figure out how {noltrini'u}, {noltreni'u} and
{noltrani'u} fit in the hierarchy. :-)

> {fasxolarkto}, {faskolarkto}. The second breaks up.
> {lemristugreblispamlo}, {lemristugreblispanlo}. The first breaks up. Both are
> lujvo, with completely different meanings.

I think this last one gives an excellent argument for going
to a rule similar to what pc proposes: "An unstressed cmavo will
*always* fall off unless followed by a non-permissible initial
cluster." This greatly simplifies the tosmabru test, because
there is no need to check for rafsi strings, and it eliminates
completely the slinku'i nonsense. With this rule, in both cases
{le} will be a cmavo. The second one would be le + fu'ivla.
To get the lujvo you would need: {lemyristugreblispanlo} (which is
already allowed by PEG, though the form without the 'y' is also
allowed.)
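
A rough sketch of the cluster check at the heart of that rule, under loud assumptions: the leading CV is taken to be unstressed, only the two letters right after it are inspected, and the set of permissible initials is the usual 48-pair list. The name `followed_by_permissible_initial` is invented for illustration and is not part of the PEG.

```python
# Assumed list of the 48 permissible initial consonant pairs.
PERMISSIBLE_INITIALS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv".split()
)
CONSONANTS = set("bcdfgjklmnprstvxz")
VOWELS = set("aeiou")


def followed_by_permissible_initial(word):
    """True if word starts with CV and the next two letters form a
    permissible initial pair, i.e. the case where the proposed rule
    says the leading (unstressed) cmavo falls off.

    Anything else returns False and is left to the rest of the
    morphology; stress is not modelled here.
    """
    if len(word) < 4:
        return False
    if word[0] not in CONSONANTS or word[1] not in VOWELS:
        return False
    return word[2:4] in PERMISSIBLE_INITIALS
```

With this check both {lemristugreblispamlo} and {lemristugreblispanlo} lose their {le}, because "mr" is a permissible initial. {lemyristugreblispanlo} does not trigger it, since "my" is not a consonant pair, and neither does a word like {lerfu}, where "rf" is not a permissible initial.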

> The only way we can avoid such minimal pairs is to add check digits,
> which are alien to human speech processing. So I think we should go
> back to xorxes' proposal allowing syllables such as {ctruonk}, which
> seems to fit existing fu'ivla fairly well, and not worry about the
> minimal pairs.

We have two sets of issues to deal with here:

(1) What syllables are allowed?
(1.1) Do we allow consonant(s)+glide in the onset?
(1.2) Do we allow syllabic+consonant in the coda?

(2) What syllable-syllable restrictions are there?
(2.1) Do we allow a syllable with zero coda to be followed by
one with zero onset?
(2.2) Do we allow a syllable that ends in a diphthong to be followed
by one that starts with a glide?

My current inclination is to answer "Yes" to 1.1, "Maybe" to 1.2
and "No" to 2.1 and 2.2

mu'o mi'e xorxes









posts: 149

Pierre Abbat scripsit:

> No matter what syllables we disallow in fu'ivla, we are going to have
> pairs of words which differ by very little, and slight phonetic changes
> resulting in different word boundaries. For instance: {noltroni'u},
> {noltruni'u}.

Of course. There are even worse cases involving nasals and
non-coarticulated stops: renblo (railway ferry) will tend to be heard
and even pronounced as remblo (boat for carrying people?).

> {lemristugreblispamlo}, {lemristugreblispanlo}. The first breaks
> up. Both are lujvo, with completely different meanings.

Wow. le mri-stu-gre-bli-spa-mlo vs. lem-ris-tug-reb-lis-panlo.
Cool example. There are English examples like this; as a child,
I consistently misanalyzed the written word "straphangers" as
"stra-phan-gers", finding it totally mystifying. In fact, it's just
"strap-hang-ers" (those who hang by a strap, i.e. ride the subway).
This involved a shift in pronunciation, but there are examples (which
aren't coming to mind) which do not. I wouldn't be surprised if there
are French examples too.

> The only way we can avoid such minimal pairs is to add check digits,

Well, we could have employed fewer and more robust rafsi, but it's way
too late now.

> So I think we should go back to xorxes' proposal allowing syllables
> such as {ctruonk}, which seems to fit existing fu'ivla fairly well,
> and not worry about the minimal pairs.

CCIVCC is massively too heavyweight to fit the phonotactics of the
Lojban core. It's also too heavyweight for reasonable natlangs (i.e. not
English or Georgian).

--
John Cowan cowan@ccil.org http://www.ccil.org/~cowan http://www.reutershealth.com
"That you can cover for the plentiful and often gaping errors,
misconstruals, and disinformation in your posts through sheer
volume — that is another misconception." --Mike to Peter



posts: 149

Jorge Llambías scripsit:

> I think this last one gives an excellent argument for going
> to a rule similar to what pc proposes: "An unstressed cmavo will
> *always* fall off unless followed by a non-permissible initial
> cluster."

So, e.g., lemlo'i now has to be lemylo'i so as not to be read as le mlo'i?
I can live with that on the grounds of simplicity, but I don't much like
having to lengthen simple lujvo so as to promote short fu'ivla.
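
To make the lemlo'i case concrete, here is a minimal Python sketch of the proposed rule. It is only an illustration under simplifying assumptions: it ignores stress entirely (the leading CV is assumed to be unstressed), it looks only at the two letters after the CV, and the helper name `leading_cv_falls_off` is made up for this example. The table is the standard list of 48 permissible initial pairs from CLL.

```python
# The 48 permissible initial consonant pairs of Lojban (CLL, chapter 3).
INITIAL_PAIRS = frozenset("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

VOWELS = "aeiou"
CONSONANTS = "bcdfgjklmnprstvxz"


def leading_cv_falls_off(word: str) -> bool:
    """Hypothetical helper: True if `word` starts with CV followed by a
    permissible initial pair, so that under the proposed rule the CV
    would be read as a separate cmavo. Stress is not modelled."""
    if len(word) < 4:
        return False
    consonant, vowel, pair = word[0], word[1], word[2:4]
    return (
        consonant in CONSONANTS
        and vowel in VOWELS
        and pair in INITIAL_PAIRS
    )


for candidate in ("lemlo'i", "lemylo'i", "noltroni'u"):
    print(candidate, leading_cv_falls_off(candidate))
```

Under these assumptions {lemlo'i} is flagged because "ml" is a permissible initial pair, while {lemylo'i} and {noltroni'u} are not flagged, since "my" and "lt" are not in the table.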

> We have two sets of issues to deal with here:
> (1) What syllables are allowed?
> (1.1) Do we allow consonant(s)+glide in the onset?

I should like to factor this one:

(1.1.1) Do we allow CI in the onset?
(1.1.2) Do we allow affricate CCI in the onset?
(1.1.3) Do we allow non-affricate CCI in the onset?
(1.1.4) Do we allow CCCI in the onset?

> (1.2) Do we allow syllabic+consonant in the coda?

This seems ill-formed to me; a syllabic consonant is per definitionem the
nucleus of its own syllable, not part of the coda of some other syllable.
I think it should be reformulated as: Do we allow a syllabic consonant
to have a coda (not counting a syllabic consonant at the end of a name)?

> (2) What syllable-syllable restrictions are there?
> (2.1) Do we allow a syllable with zero coda to be followed by
> one with zero onset?
> (2.2) Do we allow a syllable that ends in a diphthong to be followed
> by one that starts with a glide?
>
> My current inclination is to answer "Yes" to 1.1, "Maybe" to 1.2
> and "No" to 2.1 and 2.2

Do you mean to answer "Yes" to all of 1.1.x? I answer No to
all of them, but increasingly more fervently as the number goes up.

I agree with the answer "Maybe" to 1.2 as reformulated, and "No" to 2.x.

--
Is not a patron, my Lord Chesterfield,          John Cowan
one who looks with unconcern on a man           http://www.ccil.org/~cowan
struggling for life in the water, and when      http://www.reutershealth.com
he has reached ground encumbers him with help?  cowan@ccil.org
        --Samuel Johnson



posts: 149

John Cowan scripsit:

> I think it should be reformulated as: Do we allow a syllabic consonant
> to have a coda (not counting a syllabic consonant at the end of a name)?

And, come to think of it, "Do we allow a syllabic consonant not to have
an onset (initially? non-initially?)" Currently, e.g. rtilma is not
a valid fu'ivla: should it be?

So it's:

1.2.1: Do we allow a syllabic consonant to have a coda?
1.2.2.1: Do we allow a syllabic consonant not to have an onset initially?
1.2.2.2: Do we allow a syllabic consonant not to have an onset non-initially?

I say: maybe, maybe, no.

--
"The Unicode Standard does not encode        John Cowan
idiosyncratic, personal, novel, or private   http://www.ccil.org/~cowan
use characters, nor does it encode logos     http://www.reutershealth.com
or graphics."                                cowan@ccil.org



posts: 1912


> Jorge Llambías scripsit:
>
> > I think this last one gives an excellent argument for going
> > to a rule similar to what pc proposes: "An unstressed cmavo will
> > *always* fall off unless followed by a non-permissible initial
> > cluster."
>
> So, e.g., lemlo'i now has to be lemylo'i so as not to be read as le mlo'i?

That would be the result, yes.

> I can live with that on the grounds of simplicity, but I don't much like
> having to lengthen simple lujvo so as to promote short fu'ivla.

The goal would be to simplify the rules, yes. A side effect would be
that some more fu'ivla (those of slinku'i form) would be allowed.
But promoting such fu'ivla is not really the motivation for this.

> > We have two sets of issues to deal with here:
> > (1) What syllables are allowed?
> > (1.1) Do we allow consonant(s)+glide in the onset?
>
> I should like to factor this one:
>
> (1.1.1) Do we allow CI in the onset?
> (1.1.2) Do we allow affricate CCI in the onset?
> (1.1.3) Do we allow non-affricate CCI in the onset?
> (1.1.4) Do we allow CCCI in the onset?

Good idea. I'd say:

1.1.1 Yes
1.1.2 Maybe
1.1.3 Maybe
1.1.4 No

1.1.1 would allow {pier}, {san,xiyn} and {nit,cion}.

> > (1.2) Do we allow syllabic+consonant in the coda?
>
> This seems ill-formed to me; a syllabic consonant is per definitionem the
> nucleus of its own syllable, not part of the coda of some other syllable.

I should have said "Do we allow RC as the coda, where R is one of
{l, m, n, r}?" As in {mark}, {elv}, {sing}, {imp}.
I didn't mean to say that the R would be pronounced as a separate
syllable.

> I think it should be reformulated as: Do we allow a syllabic consonant
> to have a coda (not counting a syllabic consonant at the end of a name)?
....
> And, come to think of it, "Do we allow a syllabic consonant not to have
> an onset (initially? non-initially?) Currently, e.g. rtilma is not
> a valid fu'ivla: should it be?

({mrtilma} is also not valid though, because consonantal syllables
are not allowed initially, with or without onset.)

> So it's:
>
> 1.2.1: Do we allow a syllabic consonant to have a coda?
> 1.2.2.1: Do we allow a syllabic consonant not to have an onset initially?
> 1.2.2.2: Do we allow a syllabic consonant not to have an onset non-initially?
>
> I say: maybe, maybe, no.

I don't think syllabic consonants can be allowed initially in fuhivla,
otherwise some type-III's will break down, so the initial distinction
is not really relevant.


> > (2) What syllable-syllable restrictions are there?
> > (2.1) Do we allow a syllable with zero coda to be followed by
> > one with zero onset?
> > (2.2) Do we allow a syllable that ends in a diphthong to be followed
> > by one that starts with a glide?
> >
> > My current inclination is to answer "Yes" to 1.1, "Maybe" to 1.2
> > and "No" to 2.1 and 2.2
>
> Do you mean to answer "Yes" to all of 1.1.x? I answer No to
> all of them, but increasingly more fervently as the number goes up.
>
> I agree with the answer "Maybe" to 1.2 as reformulated, and "No" to 2.x.

To summarize:

1- Intra-syllabic issues:
1.1 Onsets
1.1.1 CI ?
1.1.2 TSI ? (where TS is one of {ts, tc, dj, dz})
1.1.3 CCI ?
1.1.4 CCCI ?
1.1.5 Zero onset for consonantal syllables?
1.2 Codas
1.2.1 RC ? (where R is one of {l, m, n, r})
1.2.2 C for consonantal syllables?

2- Inter-syllabic issues:
2.1 zero coda + zero onset ?
2.2 final diphthong + initial glide?

xorxes: yes, maybe, maybe, no, no, maybe, no, no, no
djan: no, No, NO, NO!, no, ?, maybe, no, no
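
For anyone who wants to line the two answer rows up against the question numbers, a small Python sketch follows. The question labels and answer strings are copied from the summary above; the function names and the normalisation (ignoring case and "!") are assumptions made only for this illustration.

```python
# Question numbers in the order used by the summary above.
QUESTIONS = ["1.1.1", "1.1.2", "1.1.3", "1.1.4", "1.1.5",
             "1.2.1", "1.2.2", "2.1", "2.2"]

# Answer rows exactly as posted.
ANSWERS = {
    "xorxes": "yes, maybe, maybe, no, no, maybe, no, no, no",
    "djan": "no, No, NO, NO!, no, ?, maybe, no, no",
}


def normalize(answer: str) -> str:
    """Drop emphasis so that 'NO!' and 'no' compare equal."""
    return answer.strip().rstrip("!").lower()


def tabulate(answers: dict) -> dict:
    """Map each person to a {question: normalized answer} dict."""
    table = {}
    for name, row in answers.items():
        values = [normalize(a) for a in row.split(",")]
        if len(values) != len(QUESTIONS):
            raise ValueError(f"{name}: expected {len(QUESTIONS)} answers")
        table[name] = dict(zip(QUESTIONS, values))
    return table


def agreed(table: dict) -> list:
    """Questions on which everyone gave the same normalized answer."""
    return [q for q in QUESTIONS
            if len({row[q] for row in table.values()}) == 1]


print(agreed(tabulate(ANSWERS)))
```

With the rows as posted, the two sets of answers coincide only on 1.1.4, 1.1.5, 2.1 and 2.2, all of them "no".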

mu'o mi'e xorxes





__
Celebrate Yahoo!'s 10th Birthday!
Yahoo! Netrospective: 100 Moments of the Web
http://birthday.yahoo.com/netrospective/



posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > I have an audio-visual memory (somewhat more reliable than a merely
> > factual one) of an early Logfest (Tommy and Athelstan present)
> > discussing vowel-vowel contact across syllables. Keeping them out
> > seemed to be winning because 1) same vowel pairs easily flowed into
> > single vowels but "a little bit longer" and the single vowels
> > simply, 2) dissimilar pairs easily developed (in performance or
> > perception) either a) a glide, b) a voiceless transition that was
> > perceived (and probably pronounced) as the appropriate /h/ or c)
> > (to avoid the first two) a glottal stop.
>
> None of those things seem to happen in Spanish with different
> vowel-vowel pairs. For example "fea" is /fea/ with no hint of an
> intervening i glide as some English speakers would tend to introduce,
> nor any voiceless transition or glottal stop. In fact, the opposite
> effect occurs in some dialects, where intervocalic b/d/g tends to
> disappear, so that "lado" ends up as /lao/.

I know that Ladefoged and Fromkin did some
studies on V-V transitions (I ran some of the
tapes and was a subject as well) sometime in the
60's. I don't know whether or when and where
these were published. The subjects were native
speakers of at least three Philippine languages
as well as Japanese, several varieties of
English, and, I think, most major European
languages (I'm pretty sure about Hungarian and
Spanish). I don't know all the results in
detail, but I do know that
accent played a significant role, that
transitions and glottal stops were less likely
when there was an accent (in Lojban this is only
stress) difference between the two vowels. This
probably varied also with which vowels are
involved as well as on native habits (American
English tends to make diphthongs wherever
possible, it's said, for example). As for the
Spanish cases, those do seem all to involve
accent differences and I, with English ears, tend
to hear a bridge (h or a diphthong) in the cases
of missing consonants (there is probably some
sound spectrograph data on that to show whether
this is expectation prejudice or physical reality).

> So I wonder if this tendency of vowel pairs to develop something
> between them is a universal tendency or just a tendency for English
> and maybe some other languages.

L&F's data might shed considerable light, since
at least four major language families were
represented (pretty much regardless of your views
on Nostratic).

> Doubled vowels "ee" and "oo" also occur in Spanish ("leer",
> "coordinar"). In rapid speech they may tend to be pronounced as a
> single vowel, but they can be made distinct in careful pronunciation.

I'm assuming — not remembering in the latter
case — that these are le+-er and co-o+rd-in-ar.

> > Since such things never occur in rafsi, they do not occur in gismu
> > or lujvo, meaning they are needed only in cmene and fuhivla, the
> > most easily molded to Lojban standards. Apparently the decision
> > ultimately went the other way and both the glide and the /h/ but
> > not the glottal stop version of comma is allowed in CLL, apparently
> > in contrast to ordinary occurrences of /'/ and glides in diphthongs
> > (which can apparently also occur in direct contact with another
> > vowel). This can presumably all be sorted out in the written
> > language so that a slicing algorithm will work, but it seems to
> > present some problems for a lexer at a later point: what exactly
> > are spoken /eia/, /eha/, /ea/ and /eiia/?
>
> Not sure what the question is.
>
> {e'a}, {ea}, {eia} and {eiia} are all different. For me, only the
> last two could be confused, but I know that will vary a lot from
> person to person. Currently the PEG should allow {e'a} and {eia} and
> disallow {ea} and {eiia}, because it does not admit nucleus-nucleus
> (e-a) nor diphthong-glide (ei-ia), but nucleus-glide (e-ia) is
> accepted. Nothing is definite yet, though.
>
The point is that these are all reasonable
pronunciations of more than one thing in
intention: /ea/, a V-V, might appear in the
speech stream as any of the first three, while
either of the other two is also a legitimate
Lojban syllable contact, apparently. /eia/ in
either syllabification might be /eiia/. Now, as
it turns out, in this case (maybe I should find a
better case, but that requires getting the rules
worked out, which is what I am trying to do), the
ambiguity is not there, since some forms -- the
ones without transition or with too much
transition -- are disallowed. So, I guess the
question now is, is there a pair of vowels that
allows all of V-V, V'V, VIV (and maybe VV, if one
is /i/ or /u/)? Perhaps it is enough to know that
there is no vowel pair that allows both V-V and
one of the others. /ui/ seems to be a case, since we
have {nu,iork} but also {ui} (but this is
resolved by another restriction) and presumably
/nuhiork/ somehow (probably another restriction
here, but not one that saves the point). In any
case, I still think that the argument against
simple V-V is fairly solid and I have not seen or
heard an answering reason for allowing them other
than that we have in the past rather unthinkingly
(I think) and now have a mass of cases in the
language.



posts: 2388

Just to toss my oar in, too.
xorxes:
cowan:
xorxes:
>
> > > We have two sets of issues to deal with here:
> > > (1) What syllables are allowed?
> > > (1.1) Do we allow consonant(s)+glide in the onset?
> >
> > I should like to factor this one:
> >
> > (1.1.1) Do we allow CI in the onset?
> > (1.1.2) Do we allow affricate CCI in the onset?
> > (1.1.3) Do we allow non-affricate CCI in the onset?
> > (1.1.4) Do we allow CCCI in the onset?
>

pc:
1.1.1; It depends on the C; I find the
palatalization problem significant enough to
want to steer clear of the cases (though I do not
think it extends to /k,g,x/, only the dentals and
already palatals). I don't really see the
problems with /u/ in the same way.
1.1.2. Already covered by 1: allow /u/ but not
/i/.
1.1.3 Yes, with the 1.1.1 restrictions
1.1.4 Hmmm! I'd say no, since that makes, for
my English mouth, an initial CCCC, which I can't
really do.

>
> > > (1.2) Do we allow syllabic+consonant in the coda?
>
> I should have said "Do we allow RC as the coda, where R is one of
> {l, m, n, r}?" As in {mark}, {elv}, {sing}, {imp}.

Yes - at least in names (and if there, why
stop?). I'd even allow RCC (having had such a
name)

> > So it's:
> >
> > 1.2.1: Do we allow a syllabic consonant to have a coda?

Yes

> > 1.2.2.1: Do we allow a syllabic consonant not to have an onset
> > initially?

Yes -- already on the books: {rl} is not /ryl/.

> > 1.2.2.2: Do we allow a syllabic consonant not to have an onset
> > non-initially?

As distinct from coming after consonantal C or CC
coda from the previous syllable? That is the
basic case, so yes.

> > > (2) What syllable-syllable restrictions are there?
> > > (2.1) Do we allow a syllable with zero coda to be followed by
> > > one with zero onset?

NO (see various problems -- but what about the
existing cases?)

> > > (2.2) Do we allow a syllable that ends in a diphthong to be
> > > followed by one that starts with a glide?

Certainly not VI-IV when the glides are the
same, even if /'/ intervenes





John E Clifford scripsit:

> 1.1.1; It depends on the C; I find the
> palatalization problem significant enough to
> want to steer clear of the cases (though I do not
> think it extends to /k,g,x/, only the dentals and
> already palatals).

Note that palatals in English and the Romance languages descend from ki gi
as well as si ti, so they're not immune to difficulties either.

> I don't really see the
> problems with /u/ in the same way.

The worst case there is hearing the difference between labials (pa ba fa
va ma) and labialized labials (pua bua fua vua mua). Chinese speakers
can't handle these at all, and other languages have problems with them too.

I'd like to avoid getting snarled up in deciding exactly which consonants
can and cannot be followed by a glide, though.

> Yes - at least in names (and if there, why
> stop?). I'd even allow RCC (having had such a
> name)

What is allowed at the end of a name is not yet worked out: it can presumably
be more permissive than non-final codas.

> > > 1.2.2.1: Do we allow a syllabic consonant not
> > > to have an onset initially?
>
> Yes — already on the books: {rl} is not /ryl/.

But it's always been ambiguous whether the r or the l is the syllabic
nucleus. By the current proposal, it's unambiguously the l: rllll.

> > > 1.2.2.2: Do we allow a syllabic consonant not
> > > to have an onset non-initially?
>
> As distinct from coming after consonantal C or CC
> coda from the previous syllable? that is the
> basic case, so yes.

Basic case, how?

> Certainly not VI-IV when the glides are the
> same, even if /'/ intervenes

Currently 'IV is not permitted at all, because 'i quickly becomes C,
the German ich-sound, and 'u quickly becomes W, the sound of wh in
Southern American English and Scottish English. Most people can't
reliably distinguish these from S (Lojban c) and w respectively.

--
"Do I contradict myself?                 John Cowan
Very well then, I contradict myself.     jcowan@reutershealth.com
I am large, I contain multitudes.        http://www.ccil.org/~cowan
        --Walt Whitman, Leaves of Grass  http://www.reutershealth.com



posts: 2388



> John E Clifford scripsit:
>
> > 1.1.1; It depends on the C; I find the
> > palatalization problem significant enough to
> > want to steer clear of the cases (though I do not
> > think it extends to /k,g,x/, only the dentals and
> > already palatals).
>
> Note that palatals in English and the Romance languages descend from
> ki gi as well as si ti, so they're not immune to difficulties either.

That's why I mentioned them. While over a few
centuries the velars before /i/ do slide, I don't
see the same thing happening on sloppy
pronunciation at a single time. Of course, so
many cases have slid already that there may not
be much of a sample left to test; at the drop of
a hat all I could come up with was /kiut/.

> > I don't really see the
> > problems with /u/ in the same way.
>
> The worst case there is hearing the difference between labials (pa ba
> fa va ma) and labialized labials (pua bua fua vua mua). Chinese
> speakers can't handle these at all, and other languages have problems
> with them too.

Well, dialects vary on this and I think it may
arise from particular history, where (unless I've
got this backward) /ua/ fell to /o/ in certain
contexts (including a tone that Mandarin no
longer has? Vague memory). There is clearly a
sonic difference and English seems to have the
lot, albeit not in native words. I suppose the
same principle that keeps /iV/ away from the
middle rows of consonants might keep /uV/ away
from the labials. But I would still allow them in
other places.

> I'd like to avoid getting snarled up in deciding exactly which
> consonants can and cannot be followed by a glide, though.
>
> > Yes - at least in names (and if there, why
> > stop?). I'd even allow RCC (having had such a
> > name)
>
> What is allowed at the end of a name is not yet worked out: it can
> presumably be more permissive than non-final codas.

OK, though we were talking about syllables here
-- meaning only those in brivla (and the very
special cases of cmavo)?

> > > > 1.2.2.1: Do we allow a syllabic consonant not
> > > > to have an onset initially?
> >
> > Yes -- already on the books: {rl} is not /ryl/.
>
> But it's always been ambiguous whether the r or the l is the syllabic
> nucleus. By the current proposal, it's unambiguously the l: rllll.


> > > > 1.2.2.2: Do we allow a syllabic consonant not
> > > > to have an onset non-initially?
> >
> > As distinct from coming after consonantal C or CC
> > coda from the previous syllable? that is the
> > basic case, so yes.
>
> Basic case, how?

Damn, I'll never get all the rules right: there
are no syllabic consonants in central Lojban, so
they make no case either way. I stick with my
answer though (my name again).

> > Certainly not VI-IV when the glides are the
> > same, even if /'/ intervenes
>
> Currently 'IV is not permitted at all, because 'i quickly becomes C,
> the German ich-sound, and 'u quickly becomes W, the sound of wh in
> Southern American English and Scottish English. Most people can't
> reliably distinguish these from S (Lojban c) and w respectively.

I would have expected /'iV/ to go to /xV/ if
anything. Too few English cases to test out,
except for /hiu/ which we can get by other means.
The /huV/ cases sound like /wh/ because they
essentially are /wh/ and I know that people do
drop that to /w/ in speech. They oughtn't, but I
suppose we have to figure on sloppiness (though
the /V'uV/ cases would be ones where the
sloppiness, if it occurred, would cause the least
trouble, since /V-uV/ is presumably forbidden).





posts: 1912


pc:
> The /huV/ cases sound like /wh/ because they
> essentially are /wh/ and I know that people do
> drop that to /w/ in speech. They oughtn't, but I
> suppose we have to figure on sloppiness (though
> the /V'uV/ cases would be ones where the
> sloppiness, if it occurred would cause the least
> trouble, since /V-uV/ is presumably forbidden).

With my current rules, V,uV is allowed, V'uV is not.

Any IVIVIV...IVIVI type of thing is allowed, syllabified
as IV,IV,IV,..,IV,IVI.
The initial and/or final I may also be missing.

What is disallowed is any two V or any two I coming into contact.

{'} can come only after V or VI, and must be followed by V.

(I'm now wondering whether {'} shouldn't be forbidden after a
diphthong.)
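
The rules above can be sketched in Python as a check on already-classified patterns. This is a rough illustration, not the PEG itself: the input is assumed to be a string over the three symbols V (vowel nucleus), I (glide i/u) and ' (the apostrophe), so deciding whether a given i or u is a glide is left out. The function names, the requirement of at least one V, and the choice to start a new syllable at ' are assumptions made for this example.

```python
def is_allowed(pattern: str) -> bool:
    """Check a V/I/' pattern against the stated contact rules."""
    # Assumption: only the three symbols, and at least one real vowel.
    if "V" not in pattern or any(ch not in "VI'" for ch in pattern):
        return False
    for i, ch in enumerate(pattern):
        prev = pattern[i - 1] if i > 0 else ""
        nxt = pattern[i + 1] if i + 1 < len(pattern) else ""
        # No two V and no two I may come into contact.
        if ch in "VI" and ch == prev:
            return False
        # ' may come only after V or VI, and must be followed by V.
        if ch == "'":
            after_v = prev == "V"
            after_vi = prev == "I" and i >= 2 and pattern[i - 2] == "V"
            if not (after_v or after_vi) or nxt != "V":
                return False
    return True


def syllabify(pattern: str) -> str:
    """Insert commas so that each glide followed by V opens a syllable.
    Breaking before ' as well is an assumption of this sketch."""
    out = []
    for i, ch in enumerate(pattern):
        opens = ch == "'" or (ch == "I" and pattern[i + 1:i + 2] == "V")
        if i > 0 and opens:
            out.append(",")
        out.append(ch)
    return "".join(out)


for p in ("V'V", "VIV", "VV", "VIIV", "V'IV", "IVIVIVI"):
    print(p, is_allowed(p), syllabify(p) if is_allowed(p) else "-")
```

Read against the earlier examples, V'V corresponds to {e'a}, VIV to {e,ia}, VV to {ea} and VIIV to {ei,ia}; the sketch accepts the first two and rejects the last two, and it rejects V'IV in line with "V'uV is not" allowed.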

mu'o mi'e xorxes








posts: 1912


If we allow CI cluster as onset (with a single consonant) the
only fu'ivla in jbovlaste that fails because of this would be
{kriofla}. Unfortunately it can't be just changed to {kri'ofla}
because it's a slinku'i (unless we do away with the slinku'i
rule). It could be changed to {kriiofla}.

Affected names in jbovlaste would be {ckiipyris} and {djiotis}.

mu'o mi'e xorxes









On Wednesday 02 March 2005 08:30, Jorge "Llambías" wrote:
> --- Pierre Abbat wrote:
> > For instance:
> > {noltroni'u}, {noltruni'u}. Both words are found in Alice. The first
> > means the
> > Duchess and the second the Queen. I thought one was a typo for a while.
>
> We still need to figure out how {noltrini'u}, {noltreni'u} and
> {noltrani'u} fit in the hierarchy. :-)

I think Valerie Antoine or Lorelle Young is the noltreni'u and Helen of Troy
is the noltrini'u. Who the noltrani'u is, I don't know.

phma
--
le xruki le ginxre xrixruba xu xrula cu xrani?



On Wednesday 02 March 2005 09:26, John Cowan wrote:
> And, come to think of it, "Do we allow a syllabic consonant not to have
> an onset (initially? non-initially?) Currently, e.g. rtilma is not
> a valid fu'ivla: should it be?

{rtilma} begins with a non-initial pair, so it cannot be a brivla. Only a
cmene can begin with a non-initial pair.

> So it's:
>
> 1.2.1: Do we allow a syllabic consonant to have a coda?
> 1.2.2.1: Do we allow a syllabic consonant not to have an onset initially?
> 1.2.2.2: Do we allow a syllabic consonant not to have an onset
> non-initially?
>
> I say: maybe, maybe, no.

By 1.2.1 do you mean e.g. {tar,kmp,te}? What are some words that are allowed
only if 1.2.2.1 and 1.2.2.2 are allowed?

phma
--
A man found gold and left a rope; but he who found
No gold he left did tie the rope around.



On Wednesday 02 March 2005 09:19, John Cowan wrote:
> Jorge Llambías scripsit:
> > I think this last one gives an excellent argument for going
> > to a rule similar to what pc proposes: "An unstressed cmavo will
> > *always* fall off unless followed by a non-permissible initial
> > cluster."
>
> So, e.g., lemlo'i now has to be lemylo'i so as not to be read as le mlo'i?
> I can live with that on the grounds of simplicity, but I don't much like
> having to lengthen simple lujvo so as to promote short fu'ivla.

I am against this rule. We should not invalidate existing lujvo wordforms,
since lujvo have higher priority than fu'ivla. I am against tinkering with
the slinku'i test in general, and while I think that allowing r-hyphens where
they aren't needed is a good idea, I need to study its effect on the slinku'i
test and the word breaking algorithm in general, which I won't have time to
do for at least another month, because I'm moving.

phma
--
Maintenant, j'ai besoin d'une loupe pour trouver mes lunettes!
-Les Perles de la médecine



posts: 14214

On Wed, Mar 02, 2005 at 12:58:28AM -0500, Bob LeChevalier wrote:
> Robin Lee Powell wrote:
> > Subject: WikiDiscuss Re: PEG Morphology Algorithm From:
> > Robin Lee Powell <rlpowell@digitalkingdom.org> To:
> > wikidiscuss-list@lojban.org
>
> So it wasn't specifically to Nora or me as I think you claimed.

I claimed nothing of the sort. It was a reply to a mail you had
sent, therefore I assumed you had read it.

Once again, if your mail program does not make it obvious which
mails are replies to mails you have sent, it's a piece of crap, and
you need to get another one.

> >>Nora normally reads email once a week, and has been skipping all
> >>byfy traffic because you told her that she doesn't need to read
> >>the discussions - only the proposals.
> >
> >I assume this must be hyperbole, because she's sent 4 mails to it
> >in the last week, and I can't believe she would be rude enough to
> >ignore a forum that she had posted to.
>
> Alas, you'll need to believe it, because she doesn't have nearly
> enough time to read the traffic even on that one thread more
> often. If not having time is rude, then I guess she is rude.

When people take the time to reply to her, not reading their replies
is rude.

See above WRT not knowing which mails are replies to what you wrote.

> >>I knew that you had been working on the PEG Morphology algorithm
> >>last fall,
> >
> >That was the grammar, actually.
>
> I thought for most of the last few months that you had decided to
> add a morpher as a front end to your parser which is what I think
> jbofi'e does.

"last few months", to me, is ~December to now. Last fall ends at
about the first of October. So we were talking about different time
periods.

Yes, I have (or, rather, xorxes has) added a morphology front end to
my parser. My parser includes a pure PEG Lojban grammar, which I
fully intend as a replacement to the YACC grammar, which is wrong,
broken, and incomplete.

> >>but my understanding was that this was just another project and
> >>not part of the byfy - we ignored it just like we've ignored
> >>valfendi - it was not official.
> >
> >Perhaps you didn't see me, jcowan, and xorxes all saying (on the
> >main list) that we intended the PEG grammar to replace the YACC
> >grammar?
>
> I just did a search on "PEG" or "morphology" and found only 2
> messages by you mentioning PEG grammar last summer, and none by
> Cowan.

IT WAS A REPLY TO YOU, DAMMIT!

If you're not going to read e-mails in reply to *your* e-mails, you
need to put that in your .sig or something. That is the height of
rudeness.

(part of) The e-mail in question:

Subject: lojban Re: Official parser and "lo ni'a zu crino"
From: jcowan@reutershealth.com
To: Bob LeChevalier <lojbab@lojban.org>, rlpowell@digitalkingdom.org
Cc: lojban@yahoogroups.com
Date: Fri, 9 Apr 2004 01:10:24 -0400

Bob LeChevalier scripsit:

> I realized that, but wanted to point out what investigation showed if it
> helps in any attempts Cowan may be making to identify and fix bugs in the
> official parser (which unfortunately even with bugs has to remain a
> standard unless you can prove that your alternate parser has the same
> grammar and that it is unambiguous to the same or higher degree than the
> YACC grammar)

I am not going to do any further actual work on the official parser;
using Yacc is just too bug-prone and unreliable.

Further, I am going to support the substitution of the PEG grammar as
official; not yet, but when it's further debugged and I've convinced
myself that it's equivalent.

> But I did not even know until I read this message, and based on my
> "PEG grammar" WWW lookup, that your PEG parser was using a
> different grammar from the official one,

Rather a lot, yes. It's based on the EBNF, but it's PEG, so it's
something rather different.

> and that the issues that arose weren't coming from your recoding
> the lexer.

My original version had the lexer and parser all in one grammar
file, because PEG lets you do that. Your perception that I was
re-coding the lexer only probably came about because I was only
having problems with things like SA, which I guess you saw as
lexical issues.

Later (perhaps around November?), xorxes convinced me to logically
separate them (which I was loath to do, for complicated reasons).

Later, xorxes took the logically separated lexer and made it not
completely suck, and that became the PEG morphology about which much
has been sung.

> >>Magic Words meant). My email subject directory has
> >>"bpfk-announce Partial closure of Magic Words; Mor" so I did
> >>not even realize it mentioned morphology.
> >
> >Then you need to get a better mail program, as I've said to you
> >many times.
>
> I did. Subject lines in my summary window will still have a size
> limit. I use pretty large print to avoid eyestrain. 1/3 of the
> column is taken up by WikiDiscuss leaving

I don't know if that sentence was supposed to end there, but it's
funny as hell. :-)

Perhaps you could teach it to crop WikiDiscuss ?

> >>(I actually do this with all tiki traffic, since I haven't got
> >>the skill with filtering to separate tiki stuff from byfy stuff.
> >>Most of my email reading is still tossing out hundreds of spam
> >>messages, and then skimming and binning whatever is left - you
> >>should know by now that I do respond to your emails directly to
> >>me, and that if I don't respond, it is probably because I
> >>somehow did not see it.)
> >
> >All true. I'm pretty sure that with a bit of work on my part, we
> >could get your spam down to near zero, but that would require you
> >pulling your mail from my server, which you've consistently
> >refused to do in the past.
>
> I didn't "refuse" to do anything. There was a reason why I
> switched off (can't remember now) and I wasn't aware of any
> particular desire on your part to have me switch back.

I would like to do Absolutely Anything I reasonably can to help you
have more free time. Just let me know.

> >>So what? When did something being discussed on the general wiki
> >>become a byfy proposal?
> >
> >When the first edit or discuss post occurred to
> >
> >http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+Formal+Morphology
> >
> >(which appears to be wrong; sometimes Tiki loses history)
> >
> >or the creation of
> >
> >http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+PEG+Morphology+Algorithm
> >
> >which appears to have been 17 Dec 2004.
>
> I see an email with the Formal morphology page, but there was no
> followup after it was created.

That is true.

What happened, apparently, is that we were having informal
morphology discussions, and then the BPFK sections were created, but
discussion was never moved, for which I apologize. Inertia, I
didn't notice, my bad.

> >>>In the past three years, and most especially in the last six
> >>>months or so, work on this committee has proceeded this far
> >>>without input,
> >>
> >>There has been no "committee" that Nora knew of
> >
> >I'm sorry; BPFK commission, my bad.
> >
> >http://groups.yahoo.com/group/lojban/message/19146
>
> "Nora will not begin work on the commission until she has
> discharged her duties as secretary of the LLG" - something that
> wasn't done until we had completed QB turnover about a year ago.

And my mail was after that, and I never saw a response.

Furthermore, I was under the impression (perhaps erroneously) that
Pierre had also tried to communicate with her and failed.

snip my mail to Nora
> If you want an answer from her, and don't cc me, it probably won't
> get seen. Back in November, I doubt that she read any mail at
> all.

Umm, yeah.

So after all this back and forth about what I did or did not send to
Nora, you tell me that it doesn't matter, because unless I send it
to you she won't read it.

You're asking me to keep track of Nora's e-mail issues. That's not
my job. I'll assume Nora is not accessible by e-mail, ever, until I
hear otherwise.

> >>>>you gave us 2 weeks notice for a vote
> >>>
> >>>As I've said several times in wikidiscuss, I thought we were
> >>>done because the people who were actually doing the work had
> >>>mostly stopped arguing.
> >>
> >>It was never moved from the general wiki discussion into
> >>something that the byfy should take notice of until 2/8 or 2/11.
> >
> >Was there something confusing about "BPFK Section" in the subject
> >line?
>
> Look at the PEG Morphology Algorithm discussion thread, at least
> as it appears in email - no mention of BPFK.

Yes, you are *absolutely* correct. My bad.

> >>>>on the PEG morphology algorithm, which raises issues about the
> >>>>morphology as a whole,
> >>>
> >>>The intention is that the PEG algorithm *is* the morphology. I
> >>>thought this was quite clear.
> >>
> >>The current baseline description of the phonology and morphology
> >>is Chapters 2 and 3 of CLL.
> >
> >And we've found, what, two or three major internal contradictions
> >so far?
>
> I dunno. I don't recall much response to Nora's posting of the
> relevant sections.

Umm. *I* saw quite a lot, but as we apparently can't rely on her
*reading* any of the replies, I don't know where that leaves us.


> Since the first step in a byfy effort should be to clarify the
> issues that warrant any change to the baseline, I would have
> wished there to be a byfy page that identified these.

http://www.lojban.org/tiki/tiki-index.php?page=Controversial+points+in+the+morphology

> >>I would expect that a replacement baseline for what is in CLL
> >>would be written in the sort of language that CLL is written in,
> >>not in something EBNFish.
> >
> >Then you and I have an incontrovertible difference of opinion.
>
> CLL is the baseline. A change to the baseline is a change to CLL.

And the YACC is in the CLL. Your point?

> >>That means that what remains is to prove that whatever the PEG
> >>grammar produces happens to be the breakdown that a human being
> >>thinks it should be.)
> >
> >Yes. I have (as I've posted on the main list) ~40 thousand lines
> >of test cases trying to demonstrate this.
>
> The main list? I assume you mean "Lojban List"

Correct.

> I recall your mentioning something on this wikidiscuss list. As I
> noted above, I don't see ANY posts about morphology on Lojban List
> by you. But my searching could be flaky.

Once again, at least one mail where I mentioned this was in direct
reply to you.

It's starting to seem like I simply need to assume that you and Nora
don't read e-mail, and that anything important requires a phone
call.

The mail:

Subject: lojban Re: Official parser and "lo ni'a zu crino"
From: Robin Lee Powell <rlpowell@digitalkingdom.org>
To: lojban-list@lojban.org
Date: Fri, 9 Apr 2004 13:15:38 -0700

On Thu, Apr 08, 2004 at 08:45:14PM -0400, Bob LeChevalier wrote:
> the official parser
> (which unfortunately even with bugs has to remain a standard unless
> you can prove that your alternate parser has the same grammar

That's impossible. For one thing, I'm fairly certain that proving
equivalence of two CFGs is equivalent to the halting problem. For
another, the current 'grammar' isn't formalized at all in many major
respects (the pre-processing and elidable terminators), so there's
nothing to write a proof against.

What I *am* doing is tens of thousands of lines of test cases intended
to *demonstrate* the equivalence since, as I said, proof is impossible.

> and that it is unambiguous to the same or higher degree than the YACC
> grammar)

Higher. *Much* higher. CFGs are ambiguous by nature, PEGs are
unambiguous by nature. The proofs for the latter are available online,
but as there is, definitionally, only one possible reading for a PEG
against a given string a proof shouldn't even be needed.
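
As a rough illustration of why a PEG yields only one reading for a given string, here is a minimal sketch of ordered choice in Python. The helper names (lit, choice, seq) are invented for this example and are not taken from any parser discussed in this thread.

```python
def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def choice(*alts):
    """PEG ordered choice: commit to the first alternative that matches."""
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse

def seq(*parts):
    """Match each part in order; fail as soon as one part fails."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse

# rule <- ("a" / "ab") "c"
rule = seq(choice(lit("a"), lit("ab")), lit("c"))
# swapped <- ("ab" / "a") "c"
swapped = seq(choice(lit("ab"), lit("a")), lit("c"))
```

With these definitions, rule("abc", 0) fails because the choice commits to "a" and never retries "ab", while swapped("abc", 0) succeeds. Either way there is never more than one parse to pick from, which is the property being claimed above.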

> (Note BTW that Nora's program is highly sensitive to any little
> grammar changes, and Nora's program needs the output that the official
> parser puts out with -t in order to work; I don't know if you are
> planning a similar output format, but I hereby request it).

I'm sorry, which program are you referring to?

I'm certainly planning to output a parse tree of some kind. I can
easily make something *like* the -t output, but as I don't understand
how -t works I'm unwilling to guarantee a perfect match at this time.

-Robin

Note, btw, that my parser does, as promised, output rather extensive
parse trees, and they are quite pretty.

Example:

text
sentence
|- CMAVO
| KOhA: mi
|- BRIVLA
gismu: klama

> >Of course. That is exactly the process that we're trying to go
> >through, with the added complication that the human language
> >morphology is known to be broken.
>
> Other than the la/doi in names issue, I don't recall seeing any
> issues raised until recently. Certainly not in the main lojban
> list.

No, they were raised in wikidiscuss. The one that I'm most bothered
by is that chapter 3 of the CLL says that Lojban cannot have
clusters more than 3 consonants long, but chapter 4 gives at least
one 4-length example word (cidjrpitsa).
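
The contradiction is easy to check mechanically. The following is only a sketch: it treats every non-vowel letter as a consonant (so a syllabic r counts like any other consonant, which is an assumption), and the function name is made up for this example.

```python
VOWELS = set("aeiouy")

def longest_consonant_run(word):
    """Length of the longest run of consecutive consonant letters."""
    longest = current = 0
    for ch in word:
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            # Vowels, apostrophes and commas all break a cluster.
            current = 0
    return longest
```

For cidjrpitsa the longest run is "djrp", four letters, which is what conflicts with the three-consonant limit.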

> >>(I've toyed with the idea of going off and writing a few
> >>thousand words on Lojban as I speak it as a possible
> >>counterweight to xorxes zei bangu - probably my long
> >>procrastinated Arabian Nights translation - but that would
> >>require me to entirely tune out byfy, and I'm sure that
> >>everything I produce will end up garbage once byfy finishes
> >>changing the language beyond my recognition, which is my fear as
> >>to what will happen if I do anything other than what I am
> >>doing).
> >
> >I can imagine very few things that would make me happier at this
> >stage of the game.
>
> But if I do that, I will not be participating in byfy even to the
> limited extent I do now. I simply can't handle more than a few
> Lojban messages a day, and I get dozens and no way to prioritize
> them other than perhaps to read Cowan's and ignore the rest.

That actually strikes me as an *excellent* prioritization method.
In fact, it's the one I'm using myself WRT the morphology thread
(which I am deliberately staying out of).

-Robin



posts: 162

Jorge Llambías wrote:
> --- John Cowan wrote:
>
>>Jorge Llambías scripsit:
>>
>>
>>>I think this last one gives an excellent argument for going
>>>to a rule similar to what pc proposes: "An unstressed cmavo will
>>>*always* fall off unless followed by a non-permissible initial
>>>cluster."
>>
>>So, e.g., lemlo'i now has to be lemylo'i so as not to be read as le mlo'i?
>
>
> That would be the result, yes.

Nora poses:
What happens to "saskyprenu"?
No fix we can see to keep sa from falling off.

lojbab





posts: 2388



> Jorge Llambías wrote:
> > --- John Cowan wrote:
> >
> >>Jorge Llambías scripsit:
> >>
> >>
> >>>I think this last one gives an excellent
> argument for going
> >>>to a rule similar to what pc proposes: "An
> unstressed cmavo will
> >>>*always* fall off unless followed by a
> non-permissible initial
> >>>cluster."
> >>
> >>So, e.g., lemlo'i now has to be lemylo'i so
> as not to be read as le mlo'i?
> >
> >
> > That would be the result, yes.
>
> Nora poses:
> What happens to "saskyprenu"?
> No fix we can see to keep sa from falling off.

Nice one! The rule I wrote explicitly does not
put a hyphen in in that case, because CVCCy is an
initial test case, indeed the first or second
one. But, of course, /sky/ might be an initial
in a fuhivla under some rules (and I frankly have
lost track of what the rules -- if any --
actually are), in which case this would be a
slinkuhi-ish sort. I suppose the thing to do --
against general problems -- would be not to allow
CCy initially but I don't think that is in
anybody's rule yet (but I am likely wrong on
that). Or we could take the exclusion off the
rule about first but non-initial CCs.



Bob LeChevalier scripsit:

> Nora poses:
> What happens to "saskyprenu"?
> No fix we can see to keep sa from falling off.

If sa fell off, you'd have skyprenu, but that's not a fu'ivla, because
fu'ivla can't contain "y" anywhere.

--
"You're a brave man! Go and break through the John Cowan
lines, and remember while you're out there jcowan@reutershealth.com
risking life and limb through shot and shell, www.ccil.org/~cowan
we'll be in here thinking what a sucker you are!" www.reutershealth.com
--Rufus T. Firefly



Pierre Abbat scripsit:

> > 1.2.1: Do we allow a syllabic consonant to have a coda?
> > 1.2.2.1: Do we allow a syllabic consonant not to have an onset initially?
> > 1.2.2.2: Do we allow a syllabic consonant not to have an onset
> > non-initially?
> >
> > I say: maybe, maybe, no.
>
> By 1.2.1 do you mean e.g. {tar,kmp,te}?

Yes.

> What are some words that are allowed only if 1.2.2.1 and 1.2.2.2
> are allowed?

1.2.2.1: "rtilmor.", "lfragas."

1.2.2.2: "pikng.", "stoplk."

--
John Cowan jcowan@reutershealth.com www.ccil.org/~cowan www.reutershealth.com
And now here I was, in a country where a right to say how the country should
be governed was restricted to six persons in each thousand of its population.
For the nine hundred and ninety-four to express dissatisfaction with the
regnant system and propose to change it, would have made the whole six
shudder as one man, it would have been so disloyal, so dishonorable, such
putrid black treason. --Mark Twain's Connecticut Yankee



John E Clifford scripsit:

> That's why I mentioned them. While over a few
> centuries the velars before /i/ do slide, I don't
> see the same thing happening on sloppy
> pronunciation at a single time. Of course, so
> many cases have slid already that there may not
> be much of a sample left to test; at the drop of
> a hat all I could come up with was /kiut/.

/iu/ is a special case in English, though. Note that we already
have tooz-, tyooz-, and chooz- dialects in the word "Tuesday".

> Well, dialects vary on this and I think it may
> arise from particular history, where (unless I've
> got this backward)/ua/ fell to /o/ in certain
> contexts

Mandarin /o/ is underlyingly /w@/.

> I suppose the
> same principle that keeps /iV/ away from the
> middle rows of consonants, might keep /uV/ away
> from the labials. But I would still allow them in
> other places.

That's where I get antsy, having special rules allowing
some consonants but not others before glides.

> OK, though we were talking about syllables here
> — meaning only those in brivla (and the very
> special cases of cmavo)?

No, when I talk of syllables, I mean any kind of syllables with the
exception of the final syllables of cmene, which are very special
in the language.

> Damn, I'll never get all the rules right: there
> are no syllabic consonants in central Lojban, so
> they make no case either way. I stick with my
> answer though (my name again).

It's the final syllable of a cmene.

--
On the Semantic Web, it's too hard to prove John Cowan jcowan@reutershealth.com
you're not a dog. --Bill de hOra http://www.ccil.org/~cowan



posts: 1912


> Nora poses:
> What happens to "saskyprenu"?
> No fix we can see to keep sa from falling off.

skyprenu is not a word, so {sa} can't fall off there.

sa won't fall off from {saxophone} either, because {xophone}
is a non-lojban word. And it won't fall off from {sabinas}
because {binas} is a cmene.

A cmavo can fall off only if what remains is a non-cmene
lojban word or words.
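
The condition can be sketched in Python as follows. This is only an illustration: TOY_CLASSES is a hand-written stand-in for a real morphology classifier, holding just the examples mentioned in this thread, and the function only checks the necessary condition stated above for a single remaining word (it ignores stress and the "or words" case).

```python
# Toy stand-in for a real classifier (assumption: classes as described in this thread).
TOY_CLASSES = {
    "skyprenu": "non-lojban-word",
    "xophone": "non-lojban-word",
    "binas": "cmene",
    "klama": "gismu",
}

def classify(word):
    """Return the morphological class, defaulting to non-lojban-word."""
    return TOY_CLASSES.get(word, "non-lojban-word")

def remainder_allows_fall_off(cmavo, form):
    """True if stripping the leading cmavo leaves a non-cmene Lojban word."""
    if not form.startswith(cmavo):
        return False
    remainder = form[len(cmavo):]
    return classify(remainder) not in ("cmene", "non-lojban-word")
```

Under this toy table, {sa} stays attached in saskyprenu, saxophone and sabinas, while {mi} could fall off from a form like miklama because klama is a gismu.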

The idea of the rule would be to define brivla wihout
making any reference to rafsi. rafsi would come in at a
later stage, when you try to figure out how a word is composed,
but they would be irrelevant to the morphology of words.

mu'o mi'e xorxes








posts: 1912


> I am against this rule. We should not invalidate existing lujvo wordforms,
> since lujvo have higher priority than fu'ivla.

It's true that some lujvo would be affected. These are the ones
I found in jbovlaste under 'b':

bajbakni
bakre'u
baple'i
basme'e
bavla'i
bavlamdei
bavlamjeftu
bavlamke'u
bavlamvanji
bejmu'o
besmamta
bisma'a
bisri'e

That's 13 out of 203, so about 6.4%. Not too many or far too many,
depending on your perspective.

mu'o mi'e xorxes








posts: 2388



> John E Clifford scripsit:
>
> > That's why I mentioned them. While over a
> few
> > centuries the velars before /i/ do slide, I
> don't
> > see the same thing happening on sloppy
> > pronunciation at a single time. Of course,
> so
> > many cases have slid already that there may
> not
> > be much of a sample left to test; at the drop
> of
> > a hat all I could come up with was /kiut/.
>
> /iu/ is a special case in English, though.
> Note that we already
> have tooz-, tyooz-, and chooz- dialects in the
> word "Tuesday".

It is indeed. But note that we do not have
/tcut/ or /cut/ for "cute" -- nor /cudj/ or /xudj/
or whatever for "huge." Still, keeping the rules
simple would say no CIV altogether, and these do
not seem to be what is diminishing the fuhivla
space (if we are worried about its size).

> > Well, dialects vary on this and I think it
> may
> > arise from particular history, where (unless
> I've
> > got this backward)/ua/ fell to /o/ in certain
> > contexts
>
> Mandarin /o/ is underlyingly /w@/.
@ is schwa? Forrest has that giving /u/, and
/ua/ giving /o/ after labials. But that is 30
years old and I don't keep up on Chinese
phonology. But the general point remains: the
simplest rule is to deny CIV.

> > I suppose the
> > same principle that keeps /iV/ away from the
> > middle rows of consonants, might keep /uV/
> away
> > from the labials. But I would still allow
> them in
> > other places.
>
> That's where I get antsy, having special rules
> allowing
> some consonants but not others before glides.
>
> > OK, though we were talking about syllables
> here
> > — meaning only those in brivla (and the very
> > special cases of cmavo)?
>
> No, when I talk of syllables, I mean any kind
> of syllables with the
> exception of the final syllables of cmene,
> which are very special
> in the language.

Aren't cmene final just general syllables except
those that end in vowels. That would be the
simplest rule. What would be the desirable
deviations from this?

> > Damn, I'll never get all the rules right:
> there
> > are no syllabic consonants in central Lojban,
> so
> > they make no case either way. I stick with
> my
> > answer though (my name again).
>
> It's the final syllable of a cmene.

See above.



John E Clifford scripsit:

> Aren't cmene final just general syllables except
> those that end in vowels. That would be the
> simplest rule. What would be the desirable
> deviations from this?

The point is that "parks." may be a good cmene but "parksfrog." is probably not.
I am willing to consider complex codas on final syllables (which necessarily
appear only in cmene), but I firmly believe that non-final syllables should
have simple codas (plain C) or none at all.

--
After fixing the Y2K bug in an application: John Cowan
WELCOME TO <censored> jcowan@reutershealth.com
DATE: MONDAK, JANUARK 1, 1900 http://www.ccil.org/~cowan



posts: 1912


> I am willing to consider complex codas on final syllables (which necessarily
> appear only in cmene), but I firmly believe that non-final syllables should
> have simple codas (plain C) or none at all.

I agree. I would prefer to allow only simple codas in names,
but I guess that's too strict.

How complex a coda should a final cmene syllable be allowed to have?
A restricted set of CC's? Any CC? A restricted set of CCC's? Any CCC?
Presumably no more than CCC. What about consonantal syllables?
Can they take a coda when they are at the end of a cmene? If so,
how heavy?

mu'o mi'e xorxes






posts: 1912


Of 67 names in jbovlaste 60 end with a single consonant.
The 7 that end with more than one consonant are:

danmark
irk
island
mors
nederland
paludizm
xrvatsk

If we allow final -RC coda in names, where R is one of {l, m, n, r},
that takes care of 5 of the 7: danmark, irk, island, mors and nederland.

If we want to allow paludizm, I think we might as well
allow any final CC.

I think we should change xrvatsk to xrvatskas, which doesn't require
any special syllables: xr,vat,skas.
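
The -RC count above can be reproduced with a short Python sketch. It assumes a coda is simply the run of consonant letters after the last vowel, which ignores consonantal syllables entirely; the function names are invented for this example.

```python
VOWELS = set("aeiouy")
SONORANTS = set("lmnr")  # the R in the proposed -RC coda

def final_coda(name):
    """Consonant letters following the last vowel of the name."""
    coda = ""
    for ch in reversed(name):
        if ch in VOWELS:
            break
        coda = ch + coda
    return coda

def fits_rc_rule(name):
    """Accept a single final consonant, or R + consonant with R in l/m/n/r."""
    coda = final_coda(name)
    if len(coda) == 1:
        return True
    return len(coda) == 2 and coda[0] in SONORANTS

NAMES = ["danmark", "irk", "island", "mors", "nederland", "paludizm", "xrvatsk"]
accepted = [n for n in NAMES if fits_rc_rule(n)]
rejected = [n for n in NAMES if not fits_rc_rule(n)]
```

Under these assumptions, accepted holds the five names listed above and rejected holds paludizm (coda "zm") and xrvatsk (coda "tsk").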

mu'o mi'e xorxes








posts: 1912


> If we allow final -RC coda in names, where R is one of {l, m, n, r},
> that takes care of 5 of the 7: danmark, irk, island, mors and nederland.
>
> If we want to allow paludizm, I think we might as well
> allow any final CC.

I just realized that zm is a consonantal syllable, so {pa,lu,di,zm}
does not require special treatment after all.

That only leaves {xrvatsk} as the odd one out.

mu'o mi'e xorxes










posts: 2388



> John E Clifford scripsit:
>
> > Aren't cmene final just general syllables
> except
> > those that end in vowels. That would be the
> > simplest rule. What would be the desirable
> > deviations from this?
>
> The point is that "parks." may be a good cmene
> but "parksfrog." is probably not.
> I am willing to consider complex codas on final
> syllables (which necessarily
> appear only in cmene), but I firmly believe
> that non-final syllables should
> have simple codas (plain C) or none at all.
>
Point taken, though I was thinking of "Clifford"
/kli|efrd/ at the time. The principles of short
easy rules and that of distinctive presentation
of borrowed words are having some conflict. I
suppose simple rules will win.
>
>




posts: 2388


wrote:

>
> --- John Cowan wrote:
> > I am willing to consider complex codas on
> final syllables (which necessarily
> > appear only in cmene), but I firmly believe
> that non-final syllables should
> > have simple codas (plain C) or none at all.
>
> I agree. I would prefer to allow only simple
> codas in names,
> but I guess that's too strict.
>
> How complex a coda should a final cmene
> syllable be allowed to have?
> A restricted set of CC's? Any CC? A restricted
> set of CCC's? Any CCC?
> Presumably no more than CCC. What about
> consonantal syllables?
> Can they take a coda when they are at the end
> of a cmene? If so,
> how heavy?
>

Final cmene syllable codas don't have to be "any
CCC" since most such never occur in ordinary
language. On the other hand, it is a little hard
to tell in advance what is possible. Certainly the
R-stop-sibilant (mirrors of onsets) gets in and,
at the other end, R more or less alone does (but
that depends on what can be a coda in the
previous syllable: how does /parkr/ divide in
Lojban?). What happens with curiosities like /brg/
alone or at the end of some longer bit? And /brks/?



posts: 2388


wrote:

>
> Of 67 names in jbovlaste 60 end with a single
> consonant.
> The 7 that end with more than one consonant
> are:
>
> danmark
> irk
> island
> mors
> nederland
> paludizm
> xrvatsk
>
> If we allow final -RC coda in names, where R is
> one of {l, m, n, r},
> that takes care of 5 of the 7: danmark, irk,
> island, mors and nederland.
>
> If we want to allow paludizm, I think we might
> as well
> allow any final CC.
>
> I think we should change xrvatsk to xrvatskas,
> which doesn't require
> any special syllables: xr,vat,skas.
>
My memory is that /xravatsk/ was presented as a
pretty good version of the original, so it would
be a shame to lose -- unless it made the rules
way too awful. Nice to see the syllabicity of
/z/ coming up; note that in at least
paralinguistic expressions /s/ and /f/ (and I
think the other sibilants and /v/) also can take
on this role.
But remember, I would favor leaving all foreign
elements as unfettered as possible, with cmene
only marginally more restrained and fuhivla only
a bit more than that.



posts: 2388


wrote:

>
> --- Jorge Llambías wrote:
> > If we allow final -RC coda in names, where R
> is one of {l, m, n, r},
> > that takes care of 5 of the 7: danmark, irk,
> island, mors and nederland.
> >
> > If we want to allow paludizm, I think we
> might as well
> > allow any final CC.
>
> I just realized that zm is a consonantal
> syllable, so {pa,lu,di,zm}
> does not require special treatment after all.
>
> That only leaves {xrvatsk} as the odd one out.
>
Oops. I was reading /z/ taking the syllabic
peak, forgetting that /m/ would take that role
first. Never mind.
Of course, my free-form approach to foreigners assumes that
the foreign elements are always set off in some
way from central Lojban. But then, they always
are already -- except for fuhivla, which have to
make a show of passing as a brivla (but the show
does not have to be very great).



posts: 1912


> My memory is that /xravatsk/ was presented as a
> pretty good version of the original, so it would
> be a shame to lose — unless it made the rules
> way too awful.

Croatian Lojbanist Goran Topic used {la xrvatskas}, so
I'm not too concerned to lose {xrvatsk}. But in any case
the rules are meant to give coherent phonotactics to
Lojban words, not to permit as faithful a reproduction
of any word from any language. If the words get too
heterogeneous the language becomes more difficult to
speak, even if each borrowed word or name very faithfully
reproduces its source word.

> But remember, I would favor leaving all foreign
> elements as unfettered as possible, with cmene
> only marginally more restrained and fuhivla only
> a bit more than that.

I suppose it depends on the goal. If the goal is to reproduce
foreign sounds faithfully, then there should be no restrictions,
if the goal is to have something that sounds like a language,
then there should be some coherent phonotactics.

mu'o mi'e xorxes








posts: 2388


wrote:

>
> --- John E Clifford wrote:
> > My memory is that /xravatsk/ was presented as
> a
> > pretty good version of the original, so it
> would
> > be a shame to lose — unless it made the
> rules
> > way too awful.
>
> Croatian Lojbanist Goran Topic used {la
> xrvatskas}, so
> I'm not too concerned to lose {xrvatsk}. But in
> any case
> the rules are meant to give coherent
> phonotactics to
> Lojban words, not to permit as faithful a
> reproduction
> of any word from any language. If the words get
> too
> heterogeneous the language becomes more
> difficult to
> speak, even if each borrowed word or name very
> faithfully
> reproduces its source word.
>
> > But remember, I would favor leaving all
> foreign
> > elements as unfettered as possible, with
> cmene
> > only marginally more restrained and fuhivla
> only
> > a bit more than that.
>
> I suppose it depends on the goal. If the goal
> is to reproduce
> foreign sounds faithfully, then there should be
> no restrictions,
> if the goal is to have something that sounds
> like a language,
> then there should be some coherent
> phonotactics.
>
I think the tricky part is finding a balance.
English hardly sounds less of a language for
having taken in -- especially in names, but in
regular vocabulary as well -- words and
expressions from scores of languages and
preserved at least a semblance of their original
form, even when that involved phonotactics that
were not at all English (maybe even went against
English patterns at the time of introduction but
then led to a change of pattern). Of course,
English has been pretty heterogeneous for a long
time, so that it might be harder to speak of its
patterns and ones that are clear tend to be
toward the larger range of possibilities, so a
more diverse set can come to be included. But
there are some things that press even English but
seem to survive: words with /x/, say, or labials
followed by /uV/, or (buffered, to be sure)
doubly articulated stops that are at least
spelled right. And so on. There are stages for
Lojban, ranging from something that approximates
the original so far as just the basic phonology
goes (trying to smuggle in a theta as such seems
too much to hope for), to names, which have a bit
more structure, to fuhivla that fit into brivla
patterns at least at the beginning and end. The
"purity" of Lojban is purchased at a considerable
price, as you are finding trying to get rules
that cover all the cases or even all the "right"
ones, and the price is sometimes hard to come up
with on the fly, so a free form vent may be
useful.



Jorge Llambías scripsit:

> How complex a coda should a final cmene syllable be allowed to have?
> A restricted set of CC's? Any CC? A restricted set of CCC's? Any CCC?

C or RC, perhaps.

> Presumably no more than CCC. What about consonantal syllables?
> Can they take a coda when they are at the end of a cmene? If so,
> how heavy?

C, I'd say.

--
John Cowan <jcowan@reutershealth.com>
http://www.ccil.org/~cowan http://www.reutershealth.com
Charles li reis, nostre emperesdre magnes,
Set anz totz pleinz ad ested in Espagnes.



I remember a word I coined which is not valid according to the syllable rules,
and I don't think it's been valid for any version of them since xorxes has
been working on them. The word is {mabrnksenartra}, which is in the taxonomy
chart but doesn't have an entry in jbovlaste because I don't know of a good
gloss (it means {snomabru ja mantyctimabru ja cakmabru}, those being the
members of the order Xenarthra). Should we allow this word?

phma
--
GCS/M d- s-: a+ C++ UL++++$ P+ L+++ E- W+++ N+ o? K? w-- O? M- V- Y++
PGP++ t- 5? X? R- !tv b++ DI !D G e++ h+>---- r- y>+++


Pierre Abbat scripsit:

> I remember a word I coined which is not valid according to the syllable rules,
> and I don't think it's been valid for any version of them since xorxes has
> been working on them. The word is {mabrnksenartra}, which is in the taxonomy
> chart but doesn't have an entry in jbovlaste because I don't know of a good
> gloss (it means {snomabru ja mantyctimabru ja cakmabru}, those being the
> members of the order Xenarthra). Should we allow this word?

I would prefer -xenartra or -zenartra; the former represents the usual
spelling, the latter, the usual pronunciation (as well as the Loglan 1 / CLL
rules for Linnaean name preprocessing). Either fits the syllable rules.

See http://www.lojban.org/publications/reference_grammar/chapter4.html#e8d14 ,
or rather the second numbered list following it.

--
I could dance with you till the cows John Cowan
come home. On second thought, I'd http://www.ccil.org/~cowan
rather dance with the cows when you http://www.reutershealth.com
came home. --Rufus T. Firefly jcowan@reutershealth.com


On 5/18/05, Pierre Abbat <phma@phma.hn.org> wrote:

> I remember a word I coined which is not valid according to the syllable rules,
> and I don't think it's been valid for any version of them since xorxes has
> been working on them. The word is {mabrnksenartra}, which is in the taxonomy
> chart but doesn't have an entry in jbovlaste because I don't know of a good
> gloss (it means {snomabru ja mantyctimabru ja cakmabru}, those being the
> members of the order Xenarthra). Should we allow this word?

There's also {tarksako} in jbovlaste that would not be allowed
with the current syllable rules.

I suppose ks, kc, gz, gj, ps, pc, bz, bj could have been allowed
as word initials in analogy with ts, tc, dz, dj. But given that they
are not allowed as word initials, it doesn't seem right to allow
them as syllable onsets. If they were allowed, they would be the
only syllable onsets not allowed word initially. ({'} doesn't count,
because {.} can be regarded as its word initial counterpart.)

Besides {tarksako}, I find four other fu'ivla in jbovlaste that
are not valid with the current rules: {kriofla}, {trueno} and
{spatr-/stagr-leoxari}. (If I didn't miss any others.)
This is because only a single consonant is currently allowed
before i/uV, so krio and true are not valid syllables, and
a nucleus is not allowed to follow another nucleus directly,
so kri,o tru,e le,o are not valid either. These could be fixed
for example as {kriiofla}, {truueno} and {-le'oxari}.
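
The single-consonant restriction described above can be sketched per syllable in Python. This is an illustration only: it checks just the "at most one consonant before i/u + vowel" rule, not the separate rule about adjacent nuclei, and the function name is made up.

```python
VOWELS = set("aeiou")

def onset_ok(syllable):
    """At most one consonant may precede a glide (i or u followed by a vowel)."""
    n = 0
    # Count leading consonant letters; an apostrophe is not counted as one.
    while n < len(syllable) and syllable[n] not in VOWELS and syllable[n] != "'":
        n += 1
    rest = syllable[n:]
    has_glide = len(rest) >= 2 and rest[0] in "iu" and rest[1] in VOWELS
    return not (has_glide and n > 1)
```

So "krio" and "true" are rejected as syllables, while each syllable of {kri,io,fla} and {tru,ue,no} passes.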

mu'o mi'e xorxes


Jorge Llambías scripsit:

> There's also {tarksako} in jbovlaste that would not be allowed
> with the current syllable rules.

+1


Where do you take the latest rules to reside?

> Besides {tarksako}, I find four other fu'ivla in jbovlaste that
> are not valid with the current rules: {kriofla}, {trueno} and
> {spatr-/stagr-leoxari}. (If I didn't miss any others.)
> This is because only a single consonant is currently allowed
> before i/uV, so krio and true are not valid syllables, and
> a nucleus is not allowed to follow another nucleus directly,
> so kri,o tru,e le,o are not valid either. These could be fixed
> for example as {kriiofla}, {truueno} and {-le'oxari}.

+1


--
John Cowan jcowan@reutershealth.com
http://www.reutershealth.com http://www.ccil.org/~cowan
Humpty Dump Dublin squeaks through his norse
Humpty Dump Dublin hath a horrible vorse
But for all his kinks English / And his irismanx brogues
Humpty Dump Dublin's grandada of all rogues. --Cousin James


On 5/20/05, John Cowan <jcowan@reutershealth.com> wrote:
> Jorge Llambías scripsit:
> > There's also {tarksako} in jbovlaste that would not be allowed
> > with the current syllable rules.
>
> +1
>
> Where do you take the latest rules to reside?

The formal rules are here:
<http://www.lojban.org/tiki/tiki-index.php?page=BPFK+Section%3A+PEG+Morphology+Algorithm>

I wrote a description of all possible syllables here:
<http://www.lojban.org/tiki/tiki-index.php?page=Controversial+points+in+the+morphology>
although that doesn't say anything about possible
syllable-syllable combinations, and it doesn't mention
cmene final syllables, which currently allow double consonant
codas when the first consonant is l/m/n/r. (cmene are also
allowed to start with a bare coda.)

mu'o mi'e xorxes


On Thursday 19 May 2005 15:37, Jorge Llambías wrote:

> On 5/18/05, Pierre Abbat <phma@phma.hn.org> wrote:

> > I remember a word I coined which is not valid according to the syllable
> > rules, and I don't think it's been valid for any version of them since
> > xorxes has been working on them. The word is {mabrnksenartra}, which is
> > in the taxonomy chart but doesn't have an entry in jbovlaste because I
> > don't know of a good gloss (it means {snomabru ja mantyctimabru ja
> > cakmabru}, those being the members of the order Xenarthra). Should we
> > allow this word?
>
> There's also {tarksako} in jbovlaste that would not be allowed
> with the current syllable rules.

At one time a syllable was allowed to end with a syllabic consonant and
another consonant (or was it a stop?), so that is syllabified {tark,sa,ko}. I
think that should be restored.

> Besides {tarksako}, I find four other fu'ivla in jbovlaste that
> are not valid with the current rules: {kriofla}, {trueno} and
> {spatr-/stagr-leoxari}. (If I didn't miss any others.)
> This is because only a single consonant is currently allowed
> before i/uV, so krio and true are not valid syllables, and
> a nucleus is not allowed to follow another nucleus directly,
> so kri,o tru,e le,o are not valid either. These could be fixed
> for example as {kriiofla}, {truueno} and {-le'oxari}.

But if "krio" is not a valid syllable, is "krii" either? I think "krio" and
"true" and "le,o" should be allowed. I generally find "ii" and "uu" in brivla
ugly (though I think they should be allowed if "ui" and "iu" are, just rare),
so {kriofla} sounds better than {kriiofla}.

ta'o lo bu'u trueno ca se xrula

mu'omi'e .pier.
--
My monthly periods happen once per year.
-Les Perles de la médecine


On 5/21/05, Pierre Abbat <phma@phma.hn.org> wrote:

> On Thursday 19 May 2005 15:37, Jorge Llambías wrote:
> > These could be fixed
> > for example as {kriiofla}, {truueno} and {-le'oxari}.
>
> But if "krio" is not a valid syllable, is "krii" either?

No, but {kri,io,fla} yes.

> I think "krio" and
> "true" and "le,o" should be allowed. I generally find "ii" and "uu" in brivla
> ugly (though I think they should be allowed if "ui" and "iu" are, just rare),
> so {kriofla} sounds better than {kriiofla}.

Neither sounds especially bad to me. I find {kri,io,fla} slightly easier
to pronounce than {krio,fla}. The problem with allowing both is that
to many people they sound too similar.

mu'o mi'e xorxes


On Saturday 21 May 2005 15:29, Jorge Llambías wrote:
> Neither sounds especially bad to me. I find {kri,io,fla} slightly easier
> to pronounce than {krio,fla}. The problem with allowing both is that
> to many people they sound too similar.

To me {kri,io,fla} sounds too similar to {kri,o,fla}, not {krio,fla}. Besides,
{iio} is {ii,o} if no commas are explicit. If we should disallow one or the
other, I'd rather disallow the longer one. The following fu'ivla in jbovlaste
have three vowels with no consonant between them: {io'imbe}, {smacrkobaiu},
{tropaiolo}. I've thought of {nimfaia} (of which the misrylatna is a species)
but haven't added it. I also thought of {uaizdo} (from Germanic "waizdo") but
rejected it in favor of {aizdo}. How about disallowing three vowels in a row,
with no consonant or apostrophe, if the first is 'i' or 'u'?

phma
--
le xruki le ginxre xrixruba xu xrula cu xrani?
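
The restriction proposed in the post above (no three vowels in a row, with no consonant or apostrophe, if the first is 'i' or 'u') can be sketched as a simple check. This is only an illustration, not part of the PEG grammar under discussion. The function name is hypothetical, words are expected without explicit commas, and "the first" is assumed to mean the first vowel of a maximal vowel run.

```python
import re

# Maximal runs of vowels; a consonant or an apostrophe ends a run.
VOWEL_RUN = re.compile(r"[aeiou]+")

def violates_three_vowel_rule(word):
    """True if a run of three or more adjacent vowels starts with 'i' or 'u'.

    Assumption: only the first vowel of each maximal run is inspected.
    """
    return any(
        len(run) >= 3 and run[0] in "iu"
        for run in VOWEL_RUN.findall(word)
    )

# Words taken from the thread.
for w in ["kriofla", "kriiofla", "trueno", "truueno", "io'imbe",
          "smacrkobaiu", "tropaiolo", "nimfaia", "uaizdo", "aizdo"]:
    print(w, violates_three_vowel_rule(w))
```

Under these assumptions the check rejects {kriiofla}, {truueno} and {uaizdo} and accepts the other words listed, including {smacrkobaiu}, {tropaiolo} and {nimfaia}, whose three-vowel runs start with 'a'.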


On 5/22/05, Pierre Abbat <phma@phma.hn.org> wrote:

> To me {kri,io,fla} sounds too similar to {kri,o,fla}, not {krio,fla}.

Yes, that's the most similar, because it matches the number of syllables.
But no matter how you pronounce it, {krio,fla} and {kri,o,fla} would be
the same word in Lojban.

> Besides,
> {iio} is {ii,o} if no commas are explicit.

Not with the new rules. {ii,o} would be forbidden, because it would
have two nuclei colliding. {i,io} is acceptable, because the semivowel
in {io} is an onset, not part of the nucleus.

> If we should disallow one or the
> other, I'd rather disallow the longer one.

But what would the full rules be?

> The following fu'ivla in jbovlaste
> have three vowels with no consonant between them: {io'imbe}, {smacrkobaiu},
> {tropaiolo}.

All allowed: {io,'im,be}, {sma,cr,ko,ba,iu}, {tro,pa,io,lo}. No
nuclear collisions.

> I've thought of {nimfaia} (of which the misrylatna is a species)

{nim,fa,ia} is allowed too.

> but haven't added it. I also thought of {uaizdo} (from Germanic "waizdo") but
> rejected it in favor of {aizdo}.

{uai,zdo} is also allowed.

> How about disallowing three vowels in a row,
> with no consonant or apostrophe, if the first is 'i' or 'u'?

That would be a rather different system, I'd have to think about it.

mu'o mi'e xorxes
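
The judgements in the post above can be reproduced with a small sketch of the rules as described in this thread: a semivowel i/u before a vowel is an onset, a nucleus is a single vowel or one of ai/au/ei/oi, a nucleus may not directly follow another nucleus, and only a single consonant may precede an i/u onset. This is not the actual PEG grammar. The names split_run and vowel_syllables are hypothetical, and y, syllabic consonants, codas and consonant-cluster rules are all ignored.

```python
import re

DIPHTHONGS = {"ai", "au", "ei", "oi"}
# Consonants directly before a vowel run, then the run itself.
RUN = re.compile(r"([bcdfgjklmnprstvxz]*)([aeiou]+)")

def split_run(run, glide_ok_first):
    """Split a vowel run into syllables, or return None if none is legal.

    glide_ok_first: whether the run's first vowel may act as an i/u onset
    (assumed false when more than one consonant precedes it).
    """
    def go(i, first):
        if i == len(run):
            return []
        for glide_len in (1, 0):
            if glide_len == 1:
                # A glide is i/u directly followed by another vowel.
                if run[i] not in "iu" or i + 1 >= len(run):
                    continue
                if first and not glide_ok_first:
                    continue
            elif not first:
                # A bare nucleus right after a nucleus is a collision.
                continue
            j = i + glide_len
            for nuc_len in (2, 1):
                nucleus = run[j:j + nuc_len]
                if len(nucleus) < nuc_len:
                    continue
                if nuc_len == 2 and nucleus not in DIPHTHONGS:
                    continue
                rest = go(j + nuc_len, False)
                if rest is not None:
                    return [run[i:j + nuc_len]] + rest
        return None
    return go(0, True)

def vowel_syllables(word):
    """Per-run splits for a word, or None if some run has no legal split."""
    result = []
    for m in RUN.finditer(word.replace(",", "")):
        split = split_run(m.group(2), len(m.group(1)) <= 1)
        if split is None:
            return None
        result.append(split)
    return result

print(vowel_syllables("kriofla"))    # None: {krio} and {kri,o} both fail
print(vowel_syllables("kriiofla"))   # [['i', 'io'], ['a']]
print(vowel_syllables("uaizdo"))     # [['uai'], ['o']]
print(vowel_syllables("tropaiolo"))  # [['o'], ['a', 'io'], ['o']]
```

The search tries the glide reading before the bare-nucleus reading and backtracks when the rest of the run cannot be split, which is how {iio} comes out as {i,io} rather than the forbidden {ii,o}.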


On Sunday 22 May 2005 15:37, Jorge Llambías wrote:

> On 5/22/05, Pierre Abbat <phma@phma.hn.org> wrote:

> > The following fu'ivla in jbovlaste
> > have three vowels with no consonant between them: {io'imbe},
> > {smacrkobaiu}, {tropaiolo}.
>
> All allowed: {io,'im,be}, {sma,cr,ko,ba,iu}, {tro,pa,io,lo}. No
> nuclear collisions.

I wasn't saying there's anything wrong with those words, just looking for
words with three vowels in a row.

> > I've thought of {nimfaia} (of which the misrylatna is a species)
>
> {nim,fa,ia} is allowed too.
>
> > but haven't added it. I also thought of {uaizdo} (from Germanic "waizdo")
> > but rejected it in favor of {aizdo}.
>
> {uai,zdo} is also allowed.

Is it {ua,i,zdo} or {u,ai,zdo}? There are no triphthongs in Lojban.

phma
--
My monthly periods happen once per year.
-Les Perles de la médecine


On 5/24/05, Pierre Abbat <phma@phma.hn.org> wrote:

> On Sunday 22 May 2005 15:37, Jorge Llambías wrote:
> > {uai,zdo} is also allowed.
>
> Is it {ua,i,zdo} or {u,ai,zdo}? There are no triphthongs in Lojban.

It's {uai,zdo}: onset {u} and nucleus {ai}, a triphthong.

The proposed morphology allows any possible onset with any
possible nucleus and any possible coda. Then there are
inter-syllabic restrictions imposed, but no intra-syllabic ones.

mu'o mi'e xorxes