Meeting with Polderland 31.10.2006


  • Peter Beinema
  • Thomas Omma
  • Sjur Moshagen


  • since last time
  • questions and answers

Since last time


Internal discussion on providing tools:

  • We can't find the tools in the contract, but we see the advantages of a short development loop on your side.
  • The tools would be for a Unix/Linux environment; we'll look into porting them to Mac:
    • "makelex": lexicon generator
    • "PSC_test": testbed for the Unix/Linux environment (plus a stand-alone speller for Unix/Linux)
  • We have to disable general-purpose usability and make the tools Sámi-specific (and possibly time-limited) before handing them over to you.

Hyphenator: will be based on speller data (look-up hyphenation instead of pattern hyphenation)
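To illustrate the difference, look-up hyphenation reads the break points for each word from a stored lexicon instead of deriving them from general patterns. The following is only a minimal sketch of that idea: the lexicon entries, the `-` break marker, and the `hyphenate` function are hypothetical placeholders, not Polderland's data format or API.

```python
# Hypothetical lexicon mapping each word to its hyphenated form.
# Real entries would be derived from the speller data; these are placeholders.
HYPHENATION_LEXICON = {
    "example": "ex-am-ple",
    "lexicon": "lex-i-con",
}


def hyphenate(word, lexicon=HYPHENATION_LEXICON):
    """Return the stored hyphenation, or the word unchanged if it is not listed."""
    return lexicon.get(word, word)


print(hyphenate("example"))
print(hyphenate("unknown"))
```

In this sketch a word missing from the lexicon is simply left unhyphenated, whereas a pattern-based hyphenator would still propose break points for it.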


  • made several linguistic updates and corrections; our generated word list now exceeds the file size limit of our computer (Linux, 2 GB limit).

Next: New drop of the speller as soon as the Divvun gang can deliver updated data files.

Possible issues

Speller behaviour with "non-alphabetic" chars

Example: Storasjõ (should be Storasjö)

The underlining does not include the last character, "õ", which seems counterintuitive.
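One possible explanation, which these notes do not confirm, is that the tokenizer only treats characters from a fixed set as part of a word, so a character outside that set ends the token before it. The sketch below shows that behaviour under this assumption; `TOKEN_CHARS` and `tokenize` are illustrative names, and the character set is invented for the example.

```python
import string

# Assumed token set: ASCII letters plus a few diacritics, deliberately without "õ".
TOKEN_CHARS = set(string.ascii_letters) | set("äöåÄÖÅ")


def tokenize(text, token_chars=TOKEN_CHARS):
    """Return (start, end, token) spans for maximal runs of token characters."""
    spans = []
    start = None
    for i, ch in enumerate(text):
        if ch in token_chars:
            if start is None:
                start = i  # a new token begins here
        elif start is not None:
            spans.append((start, i, text[start:i]))  # token ends before ch
            start = None
    if start is not None:
        spans.append((start, len(text), text[start:]))
    return spans


# With the restricted set, the token stops before the final "õ".
print(tokenize("Storasjõ"))
# With "õ" added to the set, the whole word is a single token.
print(tokenize("Storasjõ", TOKEN_CHARS | {"õ"}))
```

If the speller underlines exactly the token span, a restricted set like this would leave the final character outside the underline, and extending the set would bring it inside.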

Next meeting

Next Tuesday (7.11.) at the usual time.


  • add all Latin characters, including diacritics, to the string token set (PLD)
  • continue to improve hyphenation (Sjur and Thomas)
    • working on it
  • continue with speller data generation/conversion (Tomi)
    • still continues
    • Divvun could provide partial data in PLX format for testing and feedback purposes
  • get language codes to work with Mac Office 2004 (and check Mac Office 2007) (Polderland)
  • deliver Lule Sámi hyphenation test data (Sjur and Børre)
    • the data is hyphenated, but it was not added to the download area.
  • send the Skolt Sámi characters to Polderland (Sjur)