Misc tools

From LING073
Revision as of 11:39, 20 February 2018 by Jwashin1 (talk | contribs)

You can install morphTests2yaml as follows:

cd ~/ling073
git clone git@github.swarthmore.edu:Ling073-sp18/tools.git
cd tools
make install

You can scrape {{morphTest}}s from any page on this wiki like this:

morphTests2yaml "Language/Grammar" -l xyz

Put the resulting YAML file in a tests/ directory in your transducer repo and commit it.
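The last step might look something like the sketch below. It assumes the scraped tests have already been saved to a file; the helper name add_morph_tests, the file name xyz.yaml, the repo path in the usage note, and the commit message are all made up for illustration and are not part of the tools repo.

```shell
# Hypothetical helper (not part of the tools repo): move a scraped YAML
# file into the tests/ directory of a transducer repo and commit it.
# Usage: add_morph_tests <path-to-repo> <path-to-yaml-file>
add_morph_tests() {
    repo="$1"
    file="$2"
    name="$(basename "$file")"
    mkdir -p "$repo/tests" &&
    mv "$file" "$repo/tests/$name" &&
    git -C "$repo" add "tests/$name" &&
    git -C "$repo" commit -q -m "Add morphTests scraped from the wiki"
}
```

For example, `add_morph_tests ~/ling073/ling073-xyz xyz.yaml` (both names are placeholders) would leave the file at tests/xyz.yaml in that repo with a single new commit.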

Using the script

Here's an example of how to run aq-covtest and the sort of output you might get:

$ aq-covtest ita.corpus.txt ita.automorf.bin 
Number of tokenised words in the corpus: 891
Coverage: 92.14%
Top unknown words in the corpus:
4	 PCUS
3	 b
3	 Passera
2	 Sancti
2	 varesotto
2	 Portuense
1	 Q
1	 aderisce
1	 Sojuza
1	 Iohannis
1	 of
1	 seguitò
1	 Kommunističeskaja
1	 PCU
1	 perestrojka
1	 png
1	 Brìnzio
1	 POSDR
1	 Brinš
1	 Shetlands
Translation time: 0.07985591888427734 seconds
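
To make the numbers above easier to read: coverage is the share of tokens that received an analysis, and the list is a frequency count of the tokens that did not. The sketch below is not how aq-covtest is implemented; coverage_pct and top_unknown are hypothetical helpers, and the figure of 70 unknown tokens is inferred from the reported 92.14% of 891 tokens, not printed by the tool.

```shell
# Hypothetical helper: coverage as a percentage, given the total number of
# tokens and the number of tokens with no analysis. Assumes total > 0.
coverage_pct() {
    awk -v total="$1" -v unknown="$2" \
        'BEGIN { printf "%.2f%%\n", (total - unknown) / total * 100 }'
}

# Hypothetical helper: frequency list of unknown tokens read from stdin
# (one token per line), most frequent first, laid out like the list above.
top_unknown() {
    sort | uniq -c | sort -rn | awk '{ print $1 "\t " $2 }'
}

# 891 tokens with an assumed 70 unknown gives the coverage shown above.
coverage_pct 891 70
```

Feeding a list of unknown tokens to top_unknown, for instance `printf 'b\nPCUS\nb\n' | top_unknown`, puts the most frequent token first.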

2017 version of this page