Word frequency lists

 
The following is about Rob Speer's frequency lists, which have
fallen off the 'net. Some of them have been recovered and attached
here.

The word frequency lists as of 2003/4/30, stored on a separate server.

These frequency lists are drawn from a corpus containing the contents of the lojban.org/texts directory, most of this Wiki's texts in Lojban, as many IRC logs as I could find, the texts on CVS, and a large portion of the jbosnu archives. I spent some time weeding out most of the English text, and tried to avoid picking up metalinguistic discussion (a word frequency list based on the main mailing list showed that lujvo is one of the most commonly used words).

The corpus, in a .tar.gz archive.
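
For readers who want to rebuild a comparable list from a corpus like this one, the counting step can be sketched in a few lines of Python. This is only an illustrative sketch, not the script that produced the original lists. The tokenization rule (letters, apostrophes and commas form a word; periods and other punctuation separate words), the lowercasing, and the function names word_frequencies and format_frequency_list are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a word is a run of letters, apostrophes, and commas.
# Periods (Lojban pause marks) and other punctuation act as separators.
WORD_RE = re.compile(r"[a-zA-Z',]+")


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count."""
    # Assumption: case is ignored, and stray leading or trailing
    # apostrophes and commas are not part of the word.
    words = (w.strip("',").lower() for w in WORD_RE.findall(text))
    return Counter(w for w in words if w)


def format_frequency_list(counts):
    """Render counts as 'count word' lines, most frequent first."""
    # Sort by descending count, then alphabetically so ties are stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{n} {w}" for w, n in ordered)


if __name__ == "__main__":
    sample = "coi rodo .i mi tavla do .i do tavla mi"
    print(format_frequency_list(word_frequencies(sample)))
```

A real run would read every file extracted from the archive and add the per-file counters together. Filtering out English text and metalinguistic discussion, as described above, would still have to be done separately.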

 
mi'e rab.spir

  • Utterance templates by frequency: a sorted list of "sentence templates" excerpted from IRC. It shows which sequences of selma'o/word types are most common.

Created by rab.spir. Last Modification: Sunday 16 of November, 2008 11:22:01 GMT by arj.