meeting_2007-10-09

Meeting with Polderland 9.10.2007

Participants:

  • Peter Beinema
  • Sjur Moshagen

Agenda

  • Since last time
  • Possible issues
  • Next meeting
  • Todo items for the next meeting

Since last time

Polderland

  • improved mac installer for office (sp+hy)*(North+Lule) dropped
  • dropped: Lule Sami hyphenator for Mac InDesign; language code problem fixed
  • fixing bugs: some bugs pending
  • compounding behaviour:
    • 3/4/5-part compound appear to work in general, but not in some specific cases
      • investigating costs of modification of PLX flags. See some ideas below.
  • working on zip utility for windows installer
  • Speller behaviour: only suggests if errors are limited to 1 part of compound.
    • undesired (or should be user setting),
    • will be changed in future drop
  • work has started on Indesign Spellers for North + Lule Sami
  • work has started on command line hyphenator for Mac

Ideas for PLX flag modifications:

L + R possibilities are limited
  L: anything but rightmost
  R: anything but leftmost
considered:
  B: leftmost only
  E: rightmost only
  ? (not M, in any case): anywhere but leftmost or rightmost

Development costs are probably prohibitive, though, first estimate is over 40 hrs of work. It would not be a ~Sami-specific change, but was not planned.

Divvun

  • released public beta 2 of MS Office tools: with smj hyphenator and improved Mac installer
  • has tested the smj hyphenator in MS Office
  • has tested the InDesign hyphenators
  • will release the InDesign hyphenator to beta testers the coming days
  • major remaining bugs relate to limits in the PLX formalism:
    • adj+noun compounding do not follow the same rules as noun+noun or adj+adj (which means we overgenerate in some cases, and undergenerate in others)
      • will be further investigated by the Divvun team
    • N-part compounds either overgenerate in a bad way, or severely undergenerate
  • added Acro case ending casing to Bugzilla (INTERREG: i vs INTERREG: I)

Bug status

Solved/cancelled

Div Pld Description
455 446 Mac uninstaller doesn't work if run by a non-admin => solved
449 445 suopmasápmelaš-type compounds accepted by the speller => limits of tagging /compounding mechanism reached
480 455 Speller suggestions not identical in context menu and dialog => MS word behaviour: sorting of same-penalty suggestions; => no plans to repair this one
521 xxx Mac installer only works for admin accounts =>

To be solved in next drop

Div Pld Description
473 452 Windows installer does not autostart after download => discuss solutions
516 4?? vista installation from zip fails => probably solved by self-extracting zip; not functioning yet

Under investigation by Polderland

Div Pld Description
461 448 Spelling errors with editing distance 1 from lexicalised words do not get correct suggestions => explanation not good enough, investigate more samples *** still investigating
522 xxx Strange compounding fenomena
528 xxx Acro words get upper-case case endings
533 xxx Office 2007 gives North Sámi suggs in context menu when language is Lule => Divvun will check with the Public beta 2.

To be investigated by Divvun/Users

Div Pld Description
447 443 Windows speller doesn't install on terminal server => SD IT guys will have a look once more

Issue discussion

473 - Windows installer

Alternatives to solve it:

  • make an executable installer file - requires InstallShield, meaning that either Polderland will have to make each installation package, or that the Divvun project will have to get an InstallShield license
  • make self-extracting zip files for download, with autostart options of the extracted objects; some alternative zippers for this:

For both cases different types of protection software (firewalls, anti-virus, etc) might block either download or execution of the installer. But default InstallShield installers behave in the same (or a similar) way, so that should not be any different.

Casing of suggestions

Example of acro words:

INTERREG
Inflected: INTERREG:s (ie lower-case case endings in normal usage)

user types: INTERREG:ii

Suggestion: INTERREG:I
Expected: INTERREG:i (= lexicon entry)

lexicon has: INTERREG:i

The case conversion of the suggestions happens after lexicon lookup, and there is no knowledge anymore of the lexical form of the suggestions, ie, no knowledge that the word is upper case, but the case ending is lower.

Adj+noun compounds

A possible solution would be to give all nominals the same PLX code. Since the codes are generated, and not used for anything but control the speller behaviour, adhering to the POS "meaning" of the PLX codes is no point in itself. If this can solve the problem, I see no reason not to let N mean "nominal" (ie nouns and adjectives).

Mac uninstallation

The dropped installer is not the latest code. The latest code is using osascript, whereas the dropped version is using a perl module, MacPerl. The installer should re re-dropped with the correct uninstallation code.

Alternatively: compile the applescript into an application (using osacompile); prevents pop-up of Terminal window. A double-clickable app would look less unfamiliar to the unsuspecting user.

Rough schedule

Polderland didn't meet their self-imposed goal of delivering at the beginning of September, will try around mid September instead.

  • Mid September: planned final drop from Polderland
    • did drop, new drops still expected: new Windows installer, etc
  • November 1: Divvun code freeze
  • November 15: Indesign speller
  • December 11: public release

Next meeting

Next week (15.10.) at 13.00 Dutch/Norwegian time.

TODO:

  • PLD pass information on self-extracting ziptool with autostart option + instructions on how to use it (not in working order yet) - pending
  • PLD redrop Mac installer, correct version
  • PLD provide information on InDesign language groouping
  • Divvun investigate further the A+N compounds
  • Divvun check bug 533 with latest release