Difference between revisions of "Kaingang and Portuguese/Structural Transfer"

From LING073

Revision as of 19:53, 16 April 2019

Pre-evaluation

Statistics about input files
-------------------------------------------------------
Number of words in reference: 63
Number of words in test: 63 
Number of unknown words (marked with a star) in test: 26
Percentage of unknown words: 41.27 %

Results when removing unknown-word marks (stars)
-------------------------------------------------------
Edit distance: 38
Word error rate (WER): 60.32 %
Number of position-independent correct words: 25
Position-independent word error rate (PER): 60.32 %

Results when unknown-word marks (stars) are not removed
-------------------------------------------------------
Edit distance: 63
Word Error Rate (WER): 100.00 %
Number of position-independent correct words: 0
Position-independent word error rate (PER): 100.00 %

Statistics about the translation of unknown words
-------------------------------------------------------
Number of unknown words which were free rides: 25
Percentage of unknown words that were free rides: 96.15 %
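
The percentages in the report above follow from the counts printed next to them. The Python sketch below is not the evaluation script itself; it only re-derives the figures, and the helper names `percent` and `word_edit_distance` are made up for illustration. It assumes WER is the word-level edit distance divided by the reference length, and that PER reduces to (reference length minus position-independent correct words) divided by reference length, which should hold here because the reference and the test are both 63 words long.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def word_edit_distance(reference, hypothesis):
    """Word-level Levenshtein distance between two lists of tokens."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Counts copied from the report above (stars removed).
ref_words = 63       # words in reference (and in test)
unknown = 26         # unknown words in test
edit_distance = 38   # word-level edit distance
pi_correct = 25      # position-independent correct words
free_rides = 25      # unknown words that were free rides

unknown_pct = percent(unknown, ref_words)          # 41.27
wer = percent(edit_distance, ref_words)            # 60.32
per = percent(ref_words - pi_correct, ref_words)   # 60.32
free_ride_pct = percent(free_rides, unknown)       # 96.15
```

With the stars left in, every unknown-marked or mistranslated word counts as an error, so an edit distance of 63 over 63 reference words gives the 100.00 % reported in the second block.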

  • WER: 60.32%
  • PER: 60.32%
  • Coverage:
$ aq-covtest ling073-kgp-por-corpus/kgp.tests.txt ling073-kgp-por/kgp-por.automorf.bin
Number of tokenised words in the corpus: 82
Coverage: 64.63%
Top unknown words in the corpus:
3	 vỹ
2	 fi
1	 kafã
1	 Nũgnũj
1	 kur
1	 ẽgno
1	 tũg
1	 São
1	 Pau
1	 o
1	 rã
1	 jur
1	 tá
1	 Téj
1	 ki
1	 panh
1	 kãfór
1	 kyrũ
1	 jãmré
1	 ũn
Translation time: 0.0028295516967773438 seconds
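
Coverage here is the share of corpus tokens that the analyser recognises. As a rough illustration (not the `aq-covtest` implementation), the sketch below counts known tokens in analyser output written in Apertium stream format, where an unrecognised token appears as `^form/*form$`. The function name `coverage`, the sample string, and the tags in it are invented for the example, and escaped characters in the stream are ignored for simplicity.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text):
    """Return (known, total, percent) for a string of analysed tokens."""
    units = LEXICAL_UNIT.findall(analysed_text)
    total = len(units)
    # An analysis that starts with '*' marks a token the analyser did not know.
    known = sum(1 for _, analyses in units if not analyses.startswith("*"))
    pct = round(100.0 * known / total, 2) if total else 0.0
    return known, total, pct


# Invented sample: two unknown tokens and two placeholder known tokens.
sample = "^known1/known1<n>$ ^vỹ/*vỹ$ ^fi/*fi$ ^known2/known2<v>$"
known, total, pct = coverage(sample)

# The reported 64.63% over 82 tokens is consistent with 53 known tokens;
# 53 is inferred from the percentage, not printed in the output above.
reported = round(100.0 * 53 / 82, 2)
```

Adding entries for frequent unknown forms such as `vỹ` and `fi` to the analyser would raise this figure, which is why the top-unknowns list is printed alongside it.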


Examples for implementation

Sentence:

Tagger output:

Biltrans output:

Chunker output:

Interchunk output:

Postchunk output:

kgp-por output:

Post-evaluation

  • WER:
  • PER:
  • Coverage: