Apertium-quality

From LING073

Apertium-quality is a set of evaluation tools developed for use with Apertium. The full documentation is available on the Apertium wiki: apertium:Apertium quality.

Some useful tools included are aq-morftest and aq-covtest.

Installation

Note: these instructions are old and may be out of date.

Ideally the virtual machines have these tools installed by default, but they may not.

First you need to get and install mwtools:

sudo apt-get install python3-setuptools
cd ~/Source
git clone https://github.com/jonorthwash/mwtools.git # or use git@github.com:jonorthwash/mwtools.git
cd mwtools/python3
sudo python3 setup.py install

Then you need to get and install apertium-quality:

cd ~/Source
svn checkout http://svn.code.sf.net/p/apertium/svn/trunk/apertium-tools/apertium-quality/
cd apertium-quality
./autogen.sh
make
sudo make install

To check the installation, run any of the tools with the -h flag. Each one should print short documentation on how to use it.
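
Before trying -h, it can help to confirm that the tools actually ended up on your PATH. The following is a minimal sketch, assuming only the two tool names mentioned on this page (aq-morftest and aq-covtest); check_tool is a hypothetical helper name used for illustration and is not part of apertium-quality.

```shell
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

# Tool names taken from this page; add others as needed.
for tool in aq-morftest aq-covtest; do
    check_tool "$tool"
done
```

If a tool is reported as not found, the install step probably did not complete, or the install location is not on your PATH.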

Usage

Here's an example of running aq-covtest on a corpus file with a compiled analyser (here xyz.automorf.bin), and the sort of output you might get:

$ aq-covtest ../ling073-xyz-corpus/xyz.corpus.basic.txt xyz.automorf.bin 
Number of tokenised words in the corpus: 891
Coverage: 92.14%
Top unknown words in the corpus:
4	 PCUS
3	 b
3	 Passera
2	 Sancti
2	 varesotto
2	 Portuense
1	 Q
1	 aderisce
1	 Sojuza
1	 Iohannis
1	 of
1	 seguitò
1	 Kommunističeskaja
1	 PCU
1	 perestrojka
1	 png
1	 Brìnzio
1	 POSDR
1	 Brinš
1	 Shetlands
Translation time: 0.07985591888427734 seconds
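
The coverage figure can be sanity-checked by hand. The sketch below assumes that coverage is the share of tokens that receive at least one analysis. The count of 821 known tokens is not printed by the tool; it is inferred here as the only whole number consistent with 891 tokens and 92.14%.

```shell
total=891    # tokens reported by aq-covtest in the output above
known=821    # assumption: inferred from the reported percentage, not tool output

# Coverage as a percentage, rounded to two decimal places.
coverage=$(awk -v k="$known" -v t="$total" 'BEGIN { printf "%.2f", 100 * k / t }')

echo "Coverage: ${coverage}%"
echo "Unknown tokens: $((total - known))"
```

Under that assumption about 70 tokens are unknown. The list of top unknown words above accounts for only 30 of them, so the rest are presumably words that did not make it into the list.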