Using Voikko with HFST

Make or update spellers for Voikko+HFST

Compiling the spellers

The resulting spellers are available both for OpenOffice/LibreOffice, and for command-line tools. The command-line tools and the test bench require that you also build and install libvoikko.

For the languages in the gt directory:

cd $GTHOME/gt/
make GTLANG=sme hfst

For the languages in the langs directory:

(Languages whose analysers work to varying degrees: South Saami, Lule Saami, Komi Zyrian, Erzya Mordvin, Meadow Mari, Faroese, Greenlandic)

  • Download and install LibreOffice 3.6.2 (the newest version at the time of writing). It knows about several Finno-Ugric languages, e.g. Komi Zyrian, Komi Permyak, Erzya Mordvin and Meadow Mari.
  • If you already have this LibreOffice version, make sure the Voikko extension from 2010 is not installed. If it is, uninstall it via the menu Tools > Extension Manager > Voikko (Remove).
  • Download and install the Voikko LibreOffice plugin with HFST support, available from: divvun.no/static_files/voikko-prealpha-20120125.oxt

This is enough to enable hfst-based spell-checkers in LibreOffice (tested).

To build and test:

1. Compile your language with hfst support (here: kpv as example)

    cd $GTHOME/langs/kpv
    ./configure --with-hfst
    make
    sudo make install

2. In LibreOffice, open Preferences > Language Settings > Voikko and check that the newly installed language is listed (and thus known to Voikko+LibreOffice).
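
The build step above can be wrapped in a small shell function. This is only a minimal sketch, not part of the Divvun infrastructure: the function name build_hfst_speller and the DRY_RUN variable are hypothetical, and it assumes that $GTHOME points at a checkout containing langs/<language-code>.

```shell
#!/bin/sh
# Sketch: build and install an HFST speller for one language in $GTHOME/langs.
# build_hfst_speller and DRY_RUN are illustrative names, not Divvun tools.

build_hfst_speller() {
    lang="$1"
    if [ -z "$lang" ]; then
        echo "usage: build_hfst_speller <language-code>" >&2
        return 2
    fi
    if [ -z "$GTHOME" ]; then
        echo "GTHOME is not set" >&2
        return 1
    fi
    dir="$GTHOME/langs/$lang"

    # With DRY_RUN=1, only print the commands that would be run.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "cd $dir"
        echo "./configure --with-hfst"
        echo "make"
        echo "sudo make install"
        return 0
    fi

    # Run in a subshell so the caller's working directory is unchanged;
    # the && chain stops at the first failing step.
    (
        cd "$dir" &&
        ./configure --with-hfst &&
        make &&
        sudo make install
    )
}
```

Calling it as DRY_RUN=1 build_hfst_speller sma prints the four commands without running them, which is a cheap way to check the path before starting a full build.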

Known problem:

At the moment (26.10.2012), the kpv speller does not work (although it did the day before). To verify that the pipeline works, install e.g. sma instead.

For other languages not in the langs directory:

make -f Makefile.hfst

Using the spellers in OpenOffice / LibreOffice

This can be done by simply installing an extension. The steps are as follows:

  • Install the extension in OpenOffice / LibreOffice
  • Compile your hfst spellers as described at the top of this page
  • Ensure they are active in OpenOffice / LibreOffice (see above)

That's it! Our own transducers directly applied as spellers!

CAVEATS!!!

This is PRE-Alpha quality!!!

Both hfst as a part of Voikko and our hfst transducers are far from finished! There may be problems, and there are known errors. All linguistic problems should be reported to Bugzilla (or fixed directly); all technical errors with Voikko, HFST or the OOo extension should be reported to Sjur or to the libvoikko mailing list (http://lists.puimula.org/listinfo/libvoikko).

Known errors:

  • The OOo extension works ONLY on Snow Leopard, i.e. Mac OS X 10.6. There is *no* support for 10.5 (not yet, at least)
  • The hfst speller is *descriptive*, not normative, so a good deal of non-normative forms will be accepted
  • The hfst speller has a very rudimentary suggestion mechanism: most often there will be no suggestions at all, and when there are, the correct one is almost never the first suggestion

Known improvements compared to our Hunspell spellers:

  • compounds work! (but they overgenerate, since the transducer is descriptive)

Running the test bench with voikko+hfst

This requires libvoikko. The commands are:

cd $GTHOME/gt
make <TESTTYPE> GTLANG=sme TESTTOOL=vkhfst

where <TESTTYPE> is one of:

  • regression-test
  • typos-test
  • wordtype-test
  • baseform-test
  • correct-test

or spelltest to run them all.
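
When it is useful to see which individual test type fails, the targets listed above can also be run one by one. The following is a minimal sketch under assumptions: run_speller_tests and DRY_RUN are hypothetical names, and the function only wraps the make command shown above.

```shell
#!/bin/sh
# Sketch: run each test type separately against the vkhfst test tool.
# run_speller_tests and DRY_RUN are illustrative names, not Divvun tools.

TEST_TYPES="regression-test typos-test wordtype-test baseform-test correct-test"

run_speller_tests() {
    gtlang="${1:-sme}"
    failed=""
    for t in $TEST_TYPES; do
        # With DRY_RUN=1, only print the command that would be run.
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "make $t GTLANG=$gtlang TESTTOOL=vkhfst"
            continue
        fi
        # Keep going after a failure so that every test type gets a run.
        ( cd "$GTHOME/gt" && make "$t" GTLANG="$gtlang" TESTTOOL=vkhfst ) \
            || failed="$failed $t"
    done
    if [ -n "$failed" ]; then
        echo "failed:$failed" >&2
        return 1
    fi
}
```

Unlike a single make invocation, this collects the names of the failing test types and reports them together at the end.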

Using voikko on the command line

This also requires libvoikko. Just type:

voikkospell -s -d se

See man voikkospell for more details and options.
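
For batch use, the output of voikkospell can be filtered with standard tools. The sketch below assumes that each result line starts with "C: " (accepted word), "W: " (rejected word) or "S: " (suggestion); please check this against man voikkospell for your version. The function name rejected_words and the file name wordlist.txt are hypothetical.

```shell
#!/bin/sh
# Sketch: reduce voikkospell output to the rejected words only.
# Assumption (check `man voikkospell`): result lines start with
# "C: " (accepted), "W: " (rejected) or "S: " (suggestion).

rejected_words() {
    # Read voikkospell output on stdin; print the words marked "W: ".
    sed -n 's/^W: //p'
}

# Intended use, with one word per line in the hypothetical wordlist.txt:
#   voikkospell -s -d se < wordlist.txt | rejected_words
```

Since the function only reads standard input, it can be tried on saved voikkospell output before being connected to a live speller.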

System-wide speller

Not yet functional!

Eventually, we will get VoikkoSpellService with support for HFST on our computers, which will integrate the Voikko speller into the system-wide speller for all languages we compile and install. At the moment, however, only a single language can be installed, and only in a specific location.