Difference between revisions of "Berik and English"
(→Final Numbers) |
(→Final Numbers) |
||
Line 71: | Line 71: | ||
* Precision: 65.88785% | * Precision: 65.88785% | ||
* Recall: 88.67925% | * Recall: 88.67925% | ||
+ | * Large Corpus | ||
+ | ** Word count: 365010 | ||
+ | ** Coverage: 55.34% | ||
+ | * Stems in transducer: 400 | ||
[[Category:Sp17_TranslationPairs]] | [[Category:Sp17_TranslationPairs]] |
Revision as of 16:03, 24 April 2018
Resources for machine translation between Berik and English.
Contents
bkl -> eng evaluation
Statistics about input files
Number of words in reference: 55
Number of words in test: 40
Number of unknown words (marked with a star) in test: 15
Percentage of unknown words: 37.50 %
Edit distance: 54
Word error rate (WER): 98.18 %
Number of position-independent correct words: 1
Position-independent word error rate (PER): 98.18 %
Results when unknown-word marks (stars) are not removed
Edit distance: 55
Word Error Rate (WER): 100.00 %
Number of position-independent correct words: 0
Position-independent word error rate (PER): 100.00 %
Statistics about the translation of unknown words
Number of unknown words which were free rides: 1
Percentage of unknown words that were free rides: 6.67 %
Final Evaluation
Initial Precision & Recall
- Precision: 100.00000%
- Recall: 78.03738%
Adding Stems
136 stems were added.
Disambiguation
4 new disam rules were added to deal with "ane" which can be "and" or "many".
Structural Transfer
- Added articles in copular phrases.
- "Ai taneyan."
- "I am not child." -> "I am not a child."
- Added rules for positive copular phrases.
- "Je bwernabar namwer."
- "He sick now." -> "He is sick now."
- Added verbs and tense marking.
- "Gwirmir wini as damtafa."
- "Tomorrow woman #prpers #see." -> "Tomorrow woman #prpers will see."
- Added prepositions for instrumental case.
- "Je twena ginem tana."
- "He pig #arrow #kill." -> "He pig killed with an arrow."
- Corrected word order of instransitive clauses with instrumentals.
- "Korano atem difnant."
- "chief #canoe #come." -> "chief came with a canoe."
Final Numbers
- Precision: 65.88785%
- Recall: 88.67925%
- Large Corpus
- Word count: 365010
- Coverage: 55.34%
- Stems in transducer: 400