Meeting with Polderland 14.11.2006


  • Peter Beinema
  • Sjur Moshagen


  • since last time
  • questions and answers

Since last time


  • generated a new lexicon for North Sámi, ready to roll out the speller Alpha
  • creating a new type of hyphenator (integrating the speller into the hyphenator)
  • migrating the lexicon generator + test kit to Apple


  • included and constrained the use of derivations - the generated data exploded to roughly 20 GB :-)
  • more work on the PLX data generator


  • finishing the new hyphenation engine
  • drop the next speller version (sme and smj) and the first hyphenator; this will be the official Alpha drop, and should be accompanied by an e-mail stating this.
  • try to provide the first batch of PLX encoded data

Alpha version


Both sme and smj. The sme version will use the latest 20 GB lexicon, if possible.


Will use only the limited data delivered to Polderland, and use the fallback algorithm for all words not in the lexicon. It will provide a nice test case for the fallback algorithm :-)
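
The lookup order described above (lexicon first, fallback algorithm for everything else) can be sketched roughly as follows. This is only an illustration: the function names, the lexicon layout (word mapped to break positions) and the fallback rule itself (break before a single consonant between two vowels) are assumptions made for the example, not Polderland's actual algorithm or data format.

```python
# Hypothetical vowel set, for illustration only.
VOWELS = set("aeiouyáåäöø")


def fallback_hyphenate(word):
    """Assumed fallback rule: break before a single consonant between two vowels."""
    lower = word.lower()
    points = []
    for i in range(1, len(lower) - 1):
        is_consonant = lower[i] not in VOWELS
        if is_consonant and lower[i - 1] in VOWELS and lower[i + 1] in VOWELS:
            points.append(i)
    return points


def hyphenate(word, lexicon):
    """Use break positions from the lexicon if the word is listed, else the fallback."""
    points = lexicon.get(word.lower())
    if points is None:
        points = fallback_hyphenate(word)
    parts = []
    start = 0
    for p in points:
        parts.append(word[start:p])
        start = p
    parts.append(word[start:])
    return "-".join(parts)


# Toy lexicon: word -> list of break positions (assumed layout).
lexicon = {"giella": [4]}
print(hyphenate("giella", lexicon))  # found in the lexicon: giel-la
print(hyphenate("beana", lexicon))   # not in the lexicon, fallback rule: bea-na
```

With a small lexicon, most words take the fallback path, which is why a limited data set exercises the fallback algorithm so thoroughly.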

Possible issues


The section on compounding rules is specific to Dutch, but similar rules could be made for Sámi as well, if needed. The Divvun project will try to make use of the existing machinery before extensions are considered.

Next meeting

Next Tuesday (21.11.) at the usual time.


  • continue to improve hyphenation (Sjur and Thomas)
  • continue with speller data generation/conversion (Tomi)
  • get language codes to work with Mac Office 2004 (and check Mac Office 2007) (Polderland)
  • provide the Divvun project with the PLX format specification (Polderland)
    • done
  • deliver Alpha versions (Polderland)
  • try to find the proper compiler version for Adobe InDesign (an old version will probably do)