meeting_2007-10-02
Meeting with Polderland 2.10.2007
Participants:
- Peter Beinema
- Sjur Moshagen
Agenda
- Since last time
- Possible issues
- Next meeting
- Todo items for the next meeting
Since last time
Polderland
- dropped: new version of MS windows installer, incl. Lule Sami hyphenation
- pending, working on new installer (bugs 455 and 421):
Mac installer for Office, including Lule Sami Hyphenator - dropped: North Sami hyphenator for Mac InDesign
- pending: Lule Sami hyphenator for Mac InDesign:
problem with language identification when combined with North Sami hyphenator - fixing bugs: some bugs pending (461, 521, 522)
- compounding behaviour:
- 3/4/5-part compound appear to work in general,
but not in some specific cases- investigating costs of modification of PLX flags. See some ideas below.
- investigating costs of modification of PLX flags. See some ideas below.
- 3/4/5-part compound appear to work in general,
- mac speller: installation issue (admin rights)?
- zip utility for windows: not functional yet
- first part works, calling REAL language-dependent installer appears to crash
- have to do some debugging there
- first part works, calling REAL language-dependent installer appears to crash
- Speller behaviour: only suggests if errors are limited to 1 part of compound.
- undesired (or should be user setting),
- will be changed in future drop
- undesired (or should be user setting),
Ideas for PLX flag modifications:
L + R possibilities are limited L: anything but rightmost R: anything but leftmost considered: B: leftmost only E: rightmost only ? (not M, in any case): anywhere but leftmost or rightmost
Development costs are probably prohibitive, though, first estimate is over 40
Divvun
- testing, bug hunting and fixing during the whole week in Tromsø, major
linguistic bug fixes. The quality is approaching release quality for North Sámi, except for N-part compounds. Lule Sámi ok, but not as good as North, mainly due to lack of corpus texts - major remaining bugs relate to limits in the PLX formalism:
- adj+noun compounding do not follow the same rules as noun+noun or adj+adj
(which means we overgenerate in some cases, and undergenerate in others)- will be further investigated by the Divvun team
- will be further investigated by the Divvun team
- N-part compounds either overgenerate in a bad way, or severely undergenerate
- adj+noun compounding do not follow the same rules as noun+noun or adj+adj
Bug status
Solved/cancelled
| Div | Pld | Description |
|---|---|---|
| 402 | 439 | word + hyphen + comma is not recognised ("xxx-,") => solved |
| 419 | 440 | 3-part compounds not recognised => solved => solved |
| 448 | 444 | Upper-case words get suggestions with Initial Case => solved |
| 449 | 445 | suopmasápmelaš-type compounds accepted by the speller => limits of tagging /compounding mechanism reached |
To be solved in next drop
| Div | Pld | Description |
|---|---|---|
| 455 | 446 | Mac uninstaller doesn't work if run by a non-admin => solved |
| 473 | 452 | Windows installer does not autostart after download => discuss solutions |
| 516 | 4?? | vista installation from zip fails => probably solved by self-extracting zip; not functioning yet |
Under investigation by Polderland
| Div | Pld | Description |
|---|---|---|
| 461 | 448 | Spelling errors with editing distance 1 from lexicalised words do not get correct suggestions => explanation not good enough, investigate more samples *** still investigating |
| 480 | 455 | Speller suggestions not identical in context menu and dialog => MS word behaviour: sorting of same-penalty suggestions; => no plans to repair this one |
| 521 | xxx | Mac installer only works for admin accounts => |
| 522 | xxx | Strange compounding fenomena |
To be investigated by Divvun/Users
| Div | Pld | Description |
|---|---|---|
| 447 | 443 | Windows speller doesn't install on terminal server => SD IT guys will have a look once more |
Issue discussion
473 - Windows installer
Alternatives to solve it:
- make an executable installer file - requires InstallShield, meaning that
either Polderland will have to make each installation package, or that the Divvun project will have to get an InstallShield license - make self-extracting zip files for download, with autostart options of the
extracted objects; some alternative zippers for this:
For both cases different types of protection software (firewalls, anti-virus,
Speed
With more than 2-part compounds now available, suggestion speed is sometimes very slow, in the range of 10 seconds.
Casing of suggestions
Example of acro words:
INTERREG Inflected: INTERREG:s (ie lower-case case endings in normal usage) user types: INTERREG:ii Suggestion: INTERREG:I Expected: INTERREG:i (= lexicon entry) lexicon has: INTERREG:i
Rough schedule
Polderland didn't meet their self-imposed goal of delivering at the beginning of
- Mid September: planned final drop from Polderland
- November 1: Divvun code freeze
- November 15: Indesign speller
- December 11: public release
Next meeting
Next week (9.10.) at the usual time.
TODO:
-
PLD drop first version of indesign Hyphenator - done
-
PLD pass information on self-extracting ziptool with autostart option
+ instructions on how to use it (not in working order yet) - pending -
PLD Proposal for command line hyphenator: 12/9
ready to be sent by commercial dept - sent -
PLD do speed tests: Mac/PPC Mac/c2duo Win, Office 11 Office 12
(compounding especially) - interpreter issue (Rosetta): much faster on much slower systems when not interpreted, factor of 20+ Office 2008 has native intel code, will be much faster PLD code is universal binary, but depends for speed on calling method -
PLD provide information on InDesign language groouping
-
Divvun investigate further the A+N compounds
- Divvun add Acro case ending casing to Bugzilla (INTERREG: i)

