meeting_2007-10-15

Meeting with Polderland 15.10.2007

Participants:

  • Peter Beinema
  • Sjur Moshagen

Agenda

  • Since last time
  • Possible issues
  • Next meeting
  • Todo items for the next meeting

Since last time

Polderland

  • fixing bugs: some bugs pending
  • compounding behaviour:
    • 3/4/5-part compound appear to work in general, but not in some specific cases
      • insufficient Mac disk space, will get new disk this afternoon
      • investigating costs of modification of PLX flags. See some ideas below.
  • working on zip utility for windows installer
  • Speller behaviour: only suggests if errors are limited to 1 part of compound.
    • undesired (or should be user setting),
    • will be changed in future drop
  • work has started on Indesign Spellers for North + Lule Sami
    • speller starts to work, skeleton version recognises words
    • (= real calls, pre-programmed responses)
    • now adding the real functionality; relativey straightforward
  • work on command line hyphenator for Mac nearly completed

Ideas for PLX flag modifications:

L + R possibilities are limited
  L: anything but rightmost
  R: anything but leftmost
considered:
  B: leftmost only
  E: rightmost only
  ? (not M, in any case): anywhere but leftmost or rightmost

Development costs are probably prohibitive, though, first estimate is over 40 hrs of work. It would not be a ~Sami-specific change, but was not planned.

Divvun

  • released the InDesign hyphenator to beta testers
  • major remaining bugs relate to limits in the PLX formalism:
    • adj+noun compounding do not follow the same rules as noun+noun or adj+adj
      • this was solved by encoding all adjectives as nouns in the PLX data set
    • N-part compounds either overgenerate in a bad way, or severely undergenerate
      • severely limits functionality; according to Divvun not in spec.
      • should be fixed (maybe later than originally planned final drop)
  • added Acro case ending casing to Bugzilla (INTERREG: i vs INTERREG: I)
    • hope to solve this one by adding phon rules for lower-casing suggestions
      • won't work, since the uppercasing is done after the phon rules are applied
  • improved Mac uninstaller - it is now a double-clickable application, and it removes all installed files, not only the proofing tools in the MS folder

Bug status

Solved/cancelled

None since last meeting.

To be solved in next drop

Div Pld Description
473 452 Windows installer does not autostart after download => discuss solutions
516 4?? vista installation from zip fails => probably solved by self-extracting zip; not functioning yet

Under investigation by Polderland

Div Pld Description
461 448 Spelling errors with editing distance 1 from lexicalised words do not get correct suggestions => explanation not good enough, investigate more samples *** still investigating
522 xxx Strange compounding fenomena
528 xxx Acro words get upper-case case endings
533 xxx Office 2007 gives North Sámi suggs in context menu when language is Lule => Divvun will check with the Public beta 2.

To be investigated by Divvun/Users

Div Pld Description
447 443 Windows speller doesn't install on terminal server => SD IT guys will have a look once more

Issue discussion

473 - Windows installer

Alternatives to solve it:

  • make an executable installer file - requires InstallShield, meaning that either Polderland will have to make each installation package, or that the Divvun project will have to get an InstallShield license
  • make self-extracting zip files for download, with autostart options of the extracted objects; some alternative zippers for this:

For both cases different types of protection software (firewalls, anti-virus, etc) might block either download or execution of the installer. But default InstallShield installers behave in the same (or a similar) way, so that should not be any different.

Casing of suggestions

Example of acro words:

INTERREG
Inflected: INTERREG:s (ie lower-case case endings in normal usage)

user types: INTERREG:ii

Suggestion: INTERREG:I
Expected: INTERREG:i (= lexicon entry)

lexicon has: INTERREG:i

The case conversion of the suggestions happens after lexicon lookup, and there is no knowledge anymore of the lexical form of the suggestions, ie, no knowledge that the word is upper case, but the case ending is lower.

Rough schedule

Polderland didn't meet their self-imposed goal of delivering at the beginning of September, will try around mid September instead.

  • Mid September: planned final drop from Polderland
    • did drop, new drops still expected: new Windows installer, etc
  • November 1: Divvun code freeze
  • November 15: Indesign speller
  • December 11: public release

Next meeting

Next week (23.10.) at 09.30 Dutch/Norwegian time.

TODO:

  • PLD discuss limits in N-part compound functionality with mgmt team, get back to Divvun ASAP
  • PLD pass information on self-extracting ziptool with autostart option + instructions on how to use it (not in working order yet) - pending
  • PLD provide information on InDesign language grouping
  • Divvun check bug 533 with latest release