WikiDiscuss

PEG Morphology Algorithm

On Wednesday 22 December 2004 19:14, Robin Lee Powell wrote:
> On Wed, Dec 22, 2004 at 04:12:08PM -0800, Robin Lee Powell wrote:
> > On Wed, Dec 22, 2004 at 06:51:36PM -0500, Pierre Abbat wrote:
> > > If camxes and valfendi give different output on an invalid
> > > string
> >
> > snip
> >
> > > that is not necessarily a bug.
> >
> > That's a good point, but in many of these cases I'm going to need
> > you guys to tell me what is and is not a valid string.
>
> For example, in this case one of you thinks it's invalid, the other
> does not:
>
> *** Sentence: muSTElaVIson 1
> MISMATCH!
> valfendi: >muSTE< -la VIson.
> pegbased: -mu (STEla) VIson.
>
> Morphologically invalid, I mean. Both cases are grammatically
> invalid.
>
> I'm pretty sure camxes is wrong on this one.

It's invalid as an encoding of {mu stela vison} because the cmene is preceded
by a brivla without a pause between them. It's invalid as an encoding of
{muste la vison} because the accent is on the wrong syllable.
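
The stress half of that argument can be sketched in Python. This is only an
illustration, not how valfendi or camxes work. It assumes capitals mark the
stressed syllable, as in the test output quoted above, counts syllables by
vowel nuclei (ai, ei, oi, au count as one), and ignores y, syllabic
consonants and on-glides. The name has_penultimate_stress is made up for the
example.

```python
import re

# Vowel nuclei; the four falling diphthongs count as a single nucleus.
# Simplification: y, syllabic consonants and on-glides are not handled.
NUCLEUS = re.compile(r"ai|ei|oi|au|[aeiou]", re.IGNORECASE)

def has_penultimate_stress(word):
    """True if the only capitalised (stressed) nucleus is the second to last."""
    nuclei = NUCLEUS.findall(word)
    stressed = [i for i, n in enumerate(nuclei) if any(c.isupper() for c in n)]
    return len(nuclei) >= 2 and stressed == [len(nuclei) - 2]

print(has_penultimate_stress("muSTE"))  # False: stress is on the last syllable
print(has_penultimate_stress("STEla"))  # True
```

So {muSTE} fails as a brivla, while {STEla} passes the stress test but then
runs into the missing pause before the cmene.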

{kybuladjan} is invalid because {ky} needs a pause after it. Both lexers,
however, lex this as {ky bu la djan} (or so xorxes claims for camxes). The
official rules state that the pause must be between the Cy and the next word
that isn't Cy, but I figured out that it can be between the Cy and the next
word that contains CVV, CV'V, or CCV, so I say {kybu.ladjan}.
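
To make the two versions of the pause rule concrete, here is a rough Python
sketch. It is not taken from either lexer: it assumes the text has already
been split into words, writes a pause as a separate "." item, and treats the
end of the text as a pause. The names is_cy, has_trigger and pause_ok are
invented for the example.

```python
import re

C = "bcdfgjklmnprstvxz"
V = "aeiou"

def is_cy(word):
    """A lerfu word of the form consonant + y, such as ky."""
    return re.fullmatch(rf"[{C}]y", word) is not None

def has_trigger(word):
    """True if the word contains CVV, CV'V or CCV anywhere."""
    return re.search(rf"[{C}][{V}]'?[{V}]|[{C}][{C}][{V}]", word) is not None

def pause_ok(words, relaxed=False):
    """Check that every Cy word gets its pause in time.

    relaxed=False: the pause must come before the next word that isn't Cy.
    relaxed=True:  the pause must come before the next word that contains
                   CVV, CV'V or CCV.
    """
    for i, w in enumerate(words):
        if not is_cy(w):
            continue
        for nxt in words[i + 1:]:
            if nxt == ".":
                break  # pause found in time
            limit = has_trigger(nxt) if relaxed else not is_cy(nxt)
            if limit:
                return False  # reached the limiting word with no pause
    return True

print(pause_ok(["ky", "bu", "la", "djan"], relaxed=True))       # False
print(pause_ok(["ky", "bu", ".", "la", "djan"], relaxed=True))  # True
print(pause_ok(["ky", "bu", ".", "la", "djan"]))                # False
print(pause_ok(["ky", ".", "bu", "la", "djan"]))                # True
```

Under the relaxed reading {kybu.ladjan} passes because neither {bu} nor {la}
contains CVV, CV'V or CCV, so the pause still arrives before {djan}.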

{kymoi}, {kybumoi}, {kybumlatu}, {lekymoi}, {lekybumoi}, and {lekybumlatu} are
further phrases that lack the required pause after the lervla. valfendi
thinks they all contain a brivla, but errors out trying to identify it, except
for {ky bu mlatu}.

phma
--
S Fa1>+/- !TM M-- K H T-- t? AT++ SY Te- SC- FO- D P !Tz E++ L