meeting_2007-01-16
Meeting with Polderland 16.1.2007
Participants:
- Peter Beinema
- Sjur Moshagen
Agenda
- Since last time
- Possible issues
- Next meeting
Since last time
Polderland:
- no response from MS on language codes for Mac yet, will poll them again
- mklex: beta, awaiting test results
Divvun:
PLX conversion now progressing fine again, all major POSes, + closed POSes are
Possible issues
The big lexicon file (25+ Gb) contains large portions of words starting with:
- - (hyphen; 0x2d; 3 Gb)
- e (0x65; 8 Gb)
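A breakdown like the one above can be produced by tallying the size of each entry by its first character. The following is only a minimal sketch: it assumes the lexicon is a UTF-8 text file with one entry per line, which the notes do not state, and the function name bytes_by_initial and the sample entries are hypothetical.

```python
from collections import Counter


def bytes_by_initial(lines):
    """Sum the UTF-8 size of each line, grouped by the line's first character.

    Assumption: one lexicon entry per line (not confirmed by the notes).
    """
    totals = Counter()
    for line in lines:
        entry = line.rstrip("\n")
        if not entry:
            continue  # skip empty lines
        # Count the bytes of the whole line, including its newline.
        totals[entry[0]] += len(line.encode("utf-8"))
    return totals


# Hypothetical sample entries, for illustration only.
sample = ["-abc\n", "eabc\n", "eab\u00e1\n", "gabc\n"]
totals = bytes_by_initial(sample)
for initial, size in totals.most_common():
    print(repr(initial), size)
```

For the real file, the same function could be given an open file handle (for example one opened with encoding="utf-8"), so that the file is read line by line rather than loaded into memory at once.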
Could it be "earon-" or something similar?
It is "eahpá-", which corresponds to "un-" in English. It used to be
Name "prefixes"
Some nouns are common as prefixes to names, mainly words for North,
- davvi-Norgga -> Davvi-Norgga (= North(ern) Norway)
That is:
name-prefix + name => upper case + hyphen
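The rule can be sketched roughly as follows. This is an illustration only, not the PLX or mklex handling. The set NAME_PREFIXES and the function capitalize_name_prefix are hypothetical; only "davvi" is taken from the notes. An upper-case initial after the hyphen is used here as a stand-in for "is a name", whereas the real lexicon would need names to be marked explicitly.

```python
# Hypothetical list of name prefixes; only "davvi" comes from the notes.
NAME_PREFIXES = {"davvi"}


def capitalize_name_prefix(word):
    """Apply: name-prefix + name => upper case + hyphen.

    Assumption: the part after the hyphen counts as a name if it starts
    with an upper-case letter.
    """
    prefix, sep, rest = word.partition("-")
    if sep and prefix.lower() in NAME_PREFIXES and rest[:1].isupper():
        return prefix[:1].upper() + prefix[1:] + "-" + rest
    return word  # not a name-prefix construction: leave unchanged


print(capitalize_name_prefix("davvi-Norgga"))
```

Words that lack a hyphen, a known prefix, or an upper-case second part are returned unchanged.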
Thus, to correctly handle these cases, we need to identify names as different from other nouns, such that we can direct the uppercased and hyphenated
Hyphen as prefix
In constructions of coordinated compounds with common first part (YX and YZ =>
Next meeting
Next Tuesday (23.1.) at the usual time.
TODO:
- get back on linguistic issue regarding proper nouns vs. common nouns
- get back on linguistic issue re. hyphen as prefix
- check if North Sámi hyphenation can be disabled when processing Lule Sámi (PLD)
- make complete PLX data set (Tomi)
- progressing
- progressing
- get language codes to work with Mac Office 2004 (and check Mac Office 2007)
- deliver mklex + hyphen script
- try to find proper compiler version for Adobe Indesign (old version will
- try to get an answer to the language codes in MS Office for Mac question from
- investigate the initial "e" group of words (8 Gb) (Sjur)
- done