Berik and English
Resources for machine translation between Berik and English.
bkl -> eng evaluation
Statistics about input files
Number of words in reference: 55
Number of words in test: 40
Number of unknown words (marked with a star) in test: 15
Percentage of unknown words: 37.50 %
Edit distance: 54
Word error rate (WER): 98.18 %
Number of position-independent correct words: 1
Position-independent word error rate (PER): 98.18 %
Results when unknown-word marks (stars) are not removed
Edit distance: 55
Word Error Rate (WER): 100.00 %
Number of position-independent correct words: 0
Position-independent word error rate (PER): 100.00 %
Statistics about the translation of unknown words
Number of unknown words which were free rides: 1
Percentage of unknown words that were free rides: 6.67 %
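The percentages above follow from the raw counts, and a minimal sketch can reproduce them. It assumes the common definitions: WER is word-level edit distance divided by the number of reference words, and PER is one minus the fraction of position-independent correct words, with a penalty when the test is longer than the reference. The function names are illustrative and are not taken from the evaluation tool. The formulas agree with the figures reported here, but the tool's exact implementation is not confirmed.

```python
def word_edit_distance(reference, hypothesis):
    """Word-level Levenshtein distance between two whitespace-tokenised strings."""
    ref = reference.split()
    hyp = hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(edit_distance, reference_words):
    """Word error rate in percent (assumed definition: distance / reference length)."""
    return 100.0 * edit_distance / reference_words


def per(correct_words, reference_words, test_words):
    """Position-independent error rate in percent (assumed common definition)."""
    surplus = max(0, test_words - reference_words)
    return 100.0 * (1 - (correct_words - surplus) / reference_words)


def percentage(part, whole):
    """Plain percentage, used for the unknown-word and free-ride figures."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"WER: {wer(54, 55):.2f} %")
    print(f"PER: {per(1, 55, 40):.2f} %")
    print(f"Unknown words: {percentage(15, 40):.2f} %")
    print(f"Free rides: {percentage(1, 15):.2f} %")
```

As a small usage example, `word_edit_distance("He is sick now.", "He sick now.")` counts one missing word against a four-word reference, which gives a WER of 25 %.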
Initial Precision & Recall
- Precision: 100.00000%
- Recall: 78.03738%
136 stems were added.
4 new disambiguation rules were added to deal with "ane", which can be "and" or "many".
- Added articles in copular phrases.
  - "Ai taneyan."
  - "I am not child." -> "I am not a child."
- Added rules for positive copular phrases.
  - "Je bwernabar namwer."
  - "He sick now." -> "He is sick now."
- Added verbs and tense marking.
  - "Gwirmir wini as damtafa."
  - "Tomorrow woman #prpers #see." -> "Tomorrow woman #prpers will see."
- Added prepositions for instrumental case.
  - "Je twena ginem tana."
  - "He pig #arrow #kill." -> "He pig killed with an arrow."
- Corrected word order of intransitive clauses with instrumentals.
  - "Korano atem difnant."
  - "chief #canoe #come." -> "chief came with a canoe."
Precision & Recall after the changes
- Precision: 65.88785%
- Recall: 88.67925%
- Large Corpus
  - Word count: 365010
  - Coverage: 55.34%
  - Stems in transducer: 400
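Coverage here is read as the share of corpus tokens that the analyser recognises. The sketch below assumes unknown tokens carry a leading star, as in the evaluation output above. The `coverage` function and the sample tokens are hypothetical and are not part of any tool.

```python
def coverage(analysed_tokens):
    """Percentage of tokens not marked as unknown (assumed marker: leading '*')."""
    if not analysed_tokens:
        return 0.0
    known = sum(1 for tok in analysed_tokens if not tok.startswith("*"))
    return 100.0 * known / len(analysed_tokens)


def implied_known_tokens(word_count, coverage_percent):
    """Back out the approximate number of recognised tokens from a coverage figure."""
    return round(word_count * coverage_percent / 100)


if __name__ == "__main__":
    # Hypothetical sample: two known tokens, two starred (unknown) tokens.
    sample = ["je", "*xxx", "tana", "*yyy"]
    print(f"Sample coverage: {coverage(sample):.2f} %")
    # Figures from the large corpus reported above.
    print("Implied known tokens:", implied_known_tokens(365010, 55.34))
```

With the reported word count and coverage, the implied number of recognised tokens is only an estimate, since the coverage figure is rounded to two decimals.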