meeting_2006-09-26
Meeting with Polderland 26.9.2006
Participants:
- Peter Beinema
- Thomas Omma
- Sjur Moshagen
- Tomi Pieski
Agenda
- since last time
- hyphenation data
- questions and answers
Since last time
Peter sent phonetic speller correction rules, and Thomas has been
Binary version of the speller for Mac not yet available, the old version has
Test results: the regression testing didn't cope well with the rather large
Test findings up to now: all words tested are OK, except for:
- words containing colons (":"), such as "ABC-company:ai", "ABC-company:aid", ...
- words containing asterisk characters ("*"), such as "Juvvelan*gorsa"
  - this is a typo, the only one, it is already fixed
- multi-word expressions, such as "Beakka Hánno gieddila...":
Hyphenation
Notation: # = word boundary; ^ = possible hyphenation point; - = hard hyphen
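The three markers can be read mechanically. The following is a minimal sketch only, assuming the markers are embedded directly in the word string. The function name `parse_hyphenation` and the position of "^" in the example are hypothetical illustrations, not taken from the Divvun or Polderland data.

```python
def parse_hyphenation(marked):
    """Split a marked-up string into words plus hyphenation offsets.

    Assumed notation (from the notes): '#' = word boundary,
    '^' = possible hyphenation point, '-' = hard hyphen.
    """
    words = []
    for raw in marked.split("#"):
        if not raw:
            continue  # skip empty pieces around leading/trailing '#'
        plain = []  # characters of the word without '^' markers
        soft = []   # offsets in the plain word where a break is allowed
        hard = []   # offsets of hard hyphens in the plain word
        for ch in raw:
            if ch == "^":
                soft.append(len(plain))
            else:
                if ch == "-":
                    hard.append(len(plain))
                plain.append(ch)
        words.append({"word": "".join(plain), "soft": soft, "hard": hard})
    return words


# Hypothetical marked-up input; the '^' placement is an assumption.
result = parse_hyphenation("#TV-stuol^la#")
print(result)
```

The hard hyphen stays in the plain word, while the "^" markers are removed and kept only as offsets, so the same entry can feed both a speller and a hyphenator.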
Hyphenation at Divvun has improved, but there are still issues. Will be
Tasks since last time
- send correction/phonetic rules to Thomas (Peter)
- done
- review the processed data sets (Thomas)
- no need to do this, as the delivered data isn't split any more.
- make speller lexicon data for Polderland, with POS, compounding properties and
- started
- try to send a first binary to Sjur (Peter)
- we'll wait for the Mac version
- check whether hyphen is used when compounding abbreviations (TV-stuolla)
- yes, hyphen
TODO:
- continue to improve hyphenation (Sjur and Thomas)
- continue with speller data generation/conversion (Tomi)