WikiDiscuss


PEG Morphology Algorithm

On Tue, 21 Dec 2004 wikidiscuss@lojban.org wrote:

> Re: PEG Morphology Algorithm — design
> The grammar in its current state does four separable things:

> 1. It partitions the input stream into words.
...
> 4. It categorizes brivla into gismu, lujvo and fu'ivla.

I believe it is possible that these two tasks are not separable. In
any case, the current approach of the morphology part does it in a way
consistent with the traditional (not fully operationalized) method of
determining which words are of what kind.

Basically, a fu'ivla is any word that fits the definition of a brivla
(consonant cluster in first five letters, not counting y or '), but is
neither a gismu nor a lujvo. So fu'ivla form a very open-ended set of
words. When cmavo precede a fu'ivla, there are some potential
ambiguities that we have to handle. This is done via the so-called
"slinku'i test", which is explained at:

http://www.lojban.org/tiki/tiki-index.php?page=slinku%27i
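
To make the definition above concrete, here is a rough Python sketch of classification by elimination. It is not the PEG grammar itself. The function names are made up for illustration, the gismu check looks only at the CVCCV/CCVCV letter shape (ignoring the restrictions on which consonant pairs are allowed), and the lujvo test is left as a predicate supplied by the caller, since that is the hard part. "Not counting y or '" is read here as dropping those letters before looking for two adjacent consonants, which is an assumption.

```python
LOJBAN_CONSONANTS = frozenset("bcdfgjklmnprstvxz")
LOJBAN_VOWELS = frozenset("aeiou")


def has_early_cluster(word):
    """True if two consonants are adjacent within the first five letters,
    after dropping y and ' (assumed reading of "not counting")."""
    letters = [ch for ch in word.lower() if ch not in "y'"]
    head = letters[:5]
    return any(
        a in LOJBAN_CONSONANTS and b in LOJBAN_CONSONANTS
        for a, b in zip(head, head[1:])
    )


def looks_like_gismu(word):
    """Shape-only check: CVCCV or CCVCV. Simplified on purpose."""
    shape = "".join(
        "C" if ch in LOJBAN_CONSONANTS else "V" if ch in LOJBAN_VOWELS else "?"
        for ch in word.lower()
    )
    return shape in ("CVCCV", "CCVCV")


def classify(word, is_lujvo):
    """Classify by elimination; is_lujvo is a hypothetical caller-supplied test."""
    if not has_early_cluster(word):
        return "not a brivla"
    if looks_like_gismu(word):
        return "gismu"
    if is_lujvo(word):
        return "lujvo"
    return "fu'ivla"
```

With a lujvo test that always says no, classify("klama", ...) gives "gismu" and classify("spageti", ...) gives "fu'ivla". The fu'ivla branch is simply what is left over, which is why the set is so open-ended.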

In order to do the slinku'i test, we have to know what a lujvo is like. To
know what a lujvo is like, we have to know what a rafsi is like. Final
rafsi can be gismu, so we have to match against that, too. So, just to
separate words consistently in the face of fu'ivla, we have to implement
all of these concepts. So I believe further modularization is not
possible.
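
The chain of dependencies in that argument can be written down as a small graph. This is only an illustration of the reasoning in Python, not something taken from the grammar; the node names and the transitive_dependencies helper are invented for the example.

```python
# Each concept maps to the concepts it directly needs, per the argument above.
DEPENDS_ON = {
    "word segmentation": ["slinku'i test"],
    "slinku'i test": ["lujvo"],
    "lujvo": ["rafsi"],
    "rafsi": ["gismu"],  # final rafsi can be gismu
    "gismu": [],
}


def transitive_dependencies(concept, graph=DEPENDS_ON):
    """Return everything the given concept needs, directly or indirectly."""
    seen = []
    stack = list(graph[concept])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(graph[current])
    return seen
```

Asking for transitive_dependencies("word segmentation") pulls in the slinku'i test, lujvo, rafsi and gismu, which is the sense in which task 1 cannot be split off from task 4.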

-- 
Arnt Richard Johansen                                http://arj.nvg.org/
«When I go down into the sewer, it is to clean it up - when Zola visits the
same place, it is to bathe!» --Henrik Ibsen