Using flag diacritics for faster compilation
lexd is a bit slow to compile dictionaries that make use of prefixes with large lexicons (or similar).
To speed this up, you can enable compiling to flag diacritics.
Implementing the change
First, edit your Makefile.am:
- remove the autogen.bin and automorf.bin lines from the TARGETS_COMMON list at the top
- add -f as a switch to the lexd commands
- add -F as a switch to the hfst-compose command in the automorf.hfst line
Be sure to rerun ./autogen.sh when you're done.
Also make sure to do all of this the same way in your multilingual pair.
The changes will look something like this:
 TARGETS_COMMON = \
 	$(LANG1).automorf.hfst \
-	$(LANG1).automorf.bin \
 	$(LANG1).automorf.att.gz \
 	$(LANG1).autogen.hfst \
-	$(LANG1).autogen.bin \
 	$(LANG1).autogen.att.gz \
 	$(LANG1).autopgen.bin \
 	$(LANG1).rlx.bin \
 	$(LANG1).zhfst

 .deps/$(LANG1).RL.lexd.hfst: .deps/$(LANG1).RL.lexd
-	lexd $< .deps/$(LANG1).RL.lexd.att
+	lexd -f $< .deps/$(LANG1).RL.lexd.att
 	hfst-txt2fst .deps/$(LANG1).RL.lexd.att -o $@

 .deps/$(LANG1).LR.lexd.hfst: .deps/$(LANG1).LR.lexd
-	lexd $< .deps/$(LANG1).LR.lexd.att
+	lexd -f $< .deps/$(LANG1).LR.lexd.att
 	hfst-txt2fst .deps/$(LANG1).LR.lexd.att -o $@

 $(LANG1).automorf.hfst: .deps/$(LANG1).LR.hfst .deps/$(LANG1).spellrelax.hfst
-	hfst-compose -1 $< -2 .deps/$(LANG1).spellrelax.hfst | hfst-invert | hfst-fst2fst -O -o $@
+	hfst-compose -F -1 $< -2 .deps/$(LANG1).spellrelax.hfst | hfst-invert | hfst-fst2fst -O -o $@
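Once the Makefile.am has been edited, it can be useful to check that no lexd call was missed. The following is only a rough sketch: the function name lexd_without_f is made up for illustration, and it assumes the recipes look like the ones above (leading whitespace, then lexd, with -f as the first argument).

```shell
# Sketch: list recipe lines that call lexd without the -f switch.
# Assumes recipe lines start with whitespace followed by "lexd",
# and that -f, when present, comes directly after the command name.
lexd_without_f() {
    makefile="${1:-Makefile.am}"
    grep -nE '^[[:space:]]+lexd[[:space:]]' "$makefile" \
        | grep -vE 'lexd[[:space:]]+-f[[:space:]]' || true
}

# Usage (prints nothing once every lexd call carries -f):
#   lexd_without_f Makefile.am
```

Any line it prints is a lexd command that still needs the switch. The hfst-compose change is left for manual review, since only the command in the automorf.hfst rule should get -F.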
Additional steps
Updating modes.xml
You'll want to change all the instances of lt-proc and auto*.bin in your modes.xml file to hfst-proc and auto*.hfst, respectively.
Make sure to do the same in your multilingual pair.
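To find the places that need changing, something like the following sketch may help. The function name find_lt_proc_refs is hypothetical, and the pattern simply mirrors the lt-proc and auto*.bin wording above. Note that the TARGETS_COMMON list shown earlier still builds autopgen.bin, so matches involving that file may need to stay as they are; treat that as an assumption to check against your own pair.

```shell
# Sketch: list lines in a modes file that still mention lt-proc
# or an auto*.bin file, with line numbers, for manual editing.
find_lt_proc_refs() {
    modes="${1:-modes.xml}"
    grep -nE 'lt-proc|auto[a-z]*\.bin' "$modes" || true
}

# Usage:
#   find_lt_proc_refs modes.xml
```

The function only reports candidates and does not rewrite the file, because the program name and its file usually sit on separate lines and need to be changed together.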
Running hfst-expand and hfst-lookup
You'll also need to add -X obey-flags to hfst-expand or hfst-lookup any time you run those commands.
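One way to avoid forgetting the switch is to wrap the two commands in small shell functions. This is a sketch only: the wrapper names and the file names in the usage comments are hypothetical, and the wrappers just pass every other argument through unchanged.

```shell
# Sketch: wrappers that always pass -X obey-flags.
# All remaining arguments (options, transducer file) are forwarded as given.
lookup_with_flags() {
    hfst-lookup -X obey-flags "$@"
}

expand_with_flags() {
    hfst-expand -X obey-flags "$@"
}

# Usage (file names are placeholders for your own transducers):
#   echo "word" | lookup_with_flags xyz.automorf.hfst
#   expand_with_flags .deps/xyz.LR.lexd.hfst | head
```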
Running tests
In theory, you shouldn't have to do anything extra, but it's a good idea to run regression tests after implementing these changes.