meeting_2007-11-20

Meeting with Polderland 20.11.2007

Participants:

  • Peter Beinema
  • Sjur Moshagen

Agenda

  • Since last time
  • Possible issues
  • Next meeting
  • Todo items for the next meeting

Since last time

Polderland

  • inDesign spellers work except for custom dictionaries (FN)
    • some work left, delayed till after BOE fix (multipart compounds)
  • BOE modifications completed, testing was finished yesterday evening
    • if no errors were found drop will be today
      • (probably combined with analysing errors in more than compound part)
      • and MS bug (misspell change + add)
  • Win installer/extractor:
    • series of experiments done, but no positive results
      • Package Builder appears to be a very good tool,
      • but calling installshield in extracted temp environ results in silence
  • WA is working on installer bugs
    • Currently Ill, expected back on thursday.
    • Intermediate results: some of the bugs can not be changed by us
  • Hyphenator bugs fixed and dropped (except for Indesign versions)
  • Compounding bug 461 investigated, not a bug
    • suggestions can be explained by phonetic rules and penalty settings
  • Compounding bug 522: not general 4 part-compound problem, but related to definition of goahtte - probaly lexical issue, am currently digging in lex
  • Bug 568: not a bug. Token ending in hyphen: 1st attempt is to see it as 1 lexical word, 2nd is to see it as "word without hyphen" + "-"
  • 576: appears to be real bug. Under investigation.

Planned drop today:

  • BOE change
    • L & R flags then invalid, replace with BO and OE respectively
    • support for the H flag removed, can be replaced with L + hard hyphen at the end of the word
  • MS bug fix
  • multiple errors in multiple parts of dynamic compound

Next drops:

  • hyphenator for InDesign
  • speller for InDesign (hopefully this week)

Open issues after these drops:

  • Windows installer issues
  • 576 – heajus vs heajos– (hyphens replaced with n-dash to please forrest)
    • heajos– should be accepted (with hyphen only) – it is
    • heajus should be accepted – it is
    • heajos should NOT be accepted – *** it IS accepted
    • heajos–Oslo should be accepted – *** it is NOT accepted, but the exact same string is suggested, and when inserted (corrected) gets a red underline immediately
    • heajus–Oslo should NOT be accepted – is NOT accepted
    • tested with "Davvisámi, public beta 2, 2007–11–16"
  • 568 – it is a "bug" from the user´s perspective
    • if lex contains "xxx– BO" and "xxx I" xxx– will be recognized as "xxx" + "–"
    • if lex contains "xxx– BO" and NO "xxx I", xxx– will NOT be accepted by the present speller as "xxx" + "–"
    • this implies that Divvun has "xxx I" somewhere in the lexicon, which is a bug

Re bug 568:

beai-vi--       NALX
beai-vi NIR
beai-ve--       NIAL
beai-ve NAL

"beaive" shold not be accepted, whereas beaivi is ok. Actually, the PLX entries seems to be ok. Divvun will have to investigate further.

Divvun

  • new speller with the PL hyphenation fix included released
  • still more PLX conversion fixes
  • first crude version of hyphenation test bench
  • linguistic work now finished on both languages - still some testing and bugfixing

Bug status

Solved/cancelled

Div Pld Description
545 xxx Bad hyphenation in compounds - fence post errors? There seems to be an error in the Polderland code => fixed, being tested

To be solved by Divvun

Div Pld Description
461 448 Spelling errors with editing distance 1 from lexicalised words do not get correct suggestions => explanation not good enough, investigate more samples *** still investigating, low priority => investigated, it is all caused by phonetic rules interaction. Possible solutions: either remove phonetic rules that conflict with the wanted correction (which will have the positive side effect of speeding up the speller), or make the rules more specific, by extending the rule context (which will probably require more rules, which will make the speller slower), or lower the penalty for substitution to 6 or so (but with very unpredictable results on the overall performance)
568 xxx Speller accepts X-flagged forms without I flag, also hyphenated words not in lexicon

To be solved in next drop

Div Pld Description
522 xxx Strange compounding fenomena
524 xxx Multi-part compounds not always accepted
557 xxx Missing suggestion when multiple errors across compound boundary

Under investigation by Polderland

Div Pld Description
473 452 Windows installer does not autostart after download
516 4?? vista installation from zip fails => depends on the autostart download issue
554 xxx Uninstallation under Windows Vista fails
561 xxx Installer language choice doesn´t work when upgrading
562 xxx Win upgrade does not upgrade
563 xxx Dutch text in installer upgrade dialog
564 xxx Win installer asking for non-existing disk
565 xxx Clicking "Back" in installer doesn´t word as expected
576 xxx Speller does not accept the correct string, but do suggest the same as input.

Priority list

  1. multipart compounding fix
  2. Windows installer fixes
  3. other bug fixes
  4. MS-reported speller fix (not Sami-specific)
  5. indesign speller

Issues

Speed

Lingsoft spellers are much faster than the Polderland ones, although LS also allows free compounding. Also the Windows version is slow on suggestions for long words.

Discussed, see above, and bug 461.

Schedule

  • November 20: multipart compound fix
  • November 20: other bug fixes
  • November 22: Divvun code freeze
  • November 20: Indesign speller
  • December 11-13: public release (one of those days)

Divvun would like feedback on copyright note on CD and CD cover.

Next meeting

Next week (27.11.) at 09.30 Dutch/Norwegian time.

TODO:

  • PLD pass information on self-extracting ziptool with autostart option + instructions on how to use it (not in working order yet) - pending
  • PLD provide information on InDesign language grouping
  • PLD preferred order of results: see above
  • PLD bug fixes with deliveries (see above)