WikiDiscuss


PEG Morphology Algorithm

On Tuesday 21 December 2004 19:47, Robin Lee Powell wrote:
> On Tue, Dec 21, 2004 at 04:27:01PM -0800, wikidiscuss@lojban.org wrote:
> > Jorge, this is just marvelous work — I'm in awe. (I'm also
> > envious of the amount of free time you appear to have. :-)
> > However, I have a concern about the overall approach you're taking
> > — the high-level design, as it were.
>
> Actually, the high-level design is mine, not his. See:
>
> http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/
>
> > The grammar in its current state does four separable things:
>
> Just because they *can* be separated, doesn't mean they should be.

They are separated in valfendi (except it doesn't do 3).

> > 1. It partitions the input stream into words.
> >
> > 2. It validates the words, rejecting invalid vowel and consonant
> > patterns.
> >
> > 3. It determines the selma'o of a cmavo.
> >
> > 4. It categorizes brivla into gismu, lujvo and fu'ivla.
>
> > For the sake of modularity and reducing point-complexity, I think
> > it would be worth considering splitting the job into its
> > components, and writing separate grammars:
>
> The problem with this is that we could argue for hours over where
> the separations lie. I was vehemently opposed to separating out the
> morphology from the rest of the grammar in the first place, in fact.

The problem with doing it in PEG is that there appears to be no way to check
that a string matches two different parsing expressions (PEs) with both
matching the same number of characters. That's why every selma'o PE ends by
checking for a space or consonant, even though "cmavo" has already checked for
that.
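
To illustrate the point, here is a rough sketch in Python, not taken from the
actual grammar. The combinators (char_in, lit, seq, many, choice, and_pred),
the simplified "cmavo" shape and the toy KU rule are all assumptions made up
for this example. An and-predicate only reports success or failure and throws
away how far its operand matched, so "&cmavo" followed by "ku" happily accepts
the first two letters of "ku'o". Re-checking the boundary after "ku" is what
rules that out.

```python
# Toy PEG matchers: each takes (text, pos) and returns the new position on
# success, or None on failure. Every name here is hypothetical.
CONSONANTS = "bcdfgjklmnprstvxz"
VOWELS = "aeiou"

def char_in(chars):
    def match(text, pos):
        return pos + 1 if pos < len(text) and text[pos] in chars else None
    return match

def lit(s):
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def seq(*parts):
    def match(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match

def many(part):
    # Zero or more repetitions, greedy.
    def match(text, pos):
        while True:
            nxt = part(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return match

def choice(*alts):
    # Ordered choice: the first alternative that succeeds wins.
    def match(text, pos):
        for alt in alts:
            result = alt(text, pos)
            if result is not None:
                return result
        return None
    return match

def and_pred(part):
    # &part: succeeds without consuming; the matched length is discarded.
    def match(text, pos):
        return pos if part(text, pos) is not None else None
    return match

def end_of_input(text, pos):
    return pos if pos == len(text) else None

consonant = char_in(CONSONANTS)
vowel = char_in(VOWELS)

# A word ends before a space or consonant, or at end of input (simplified).
boundary = choice(and_pred(char_in(" " + CONSONANTS)), end_of_input)

# Simplified cmavo shape: C V ('V)* followed by a boundary.
cmavo = seq(consonant, vowel, many(seq(lit("'"), vowel)), boundary)

# Naive selma'o rule: "is a cmavo here" and "starts with ku".
ku_naive = seq(and_pred(cmavo), lit("ku"))

# Rule with the boundary check repeated after the literal.
ku_checked = seq(and_pred(cmavo), lit("ku"), boundary)

if __name__ == "__main__":
    for word in ("ku", "ku mi", "ku'o"):
        print(repr(word), ku_naive(word, 0), ku_checked(word, 0))
```

In this sketch ku_naive accepts "ku'o" because nothing ties the two characters
consumed by the literal to the four characters that cmavo matched inside the
predicate. ku_checked rejects it because the apostrophe after "ku" is not a
boundary.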

phma
--
li fi'u vu'u fi'u fi'u du li pa