Berik and English

From LING073
Revision as of 15:50, 24 April 2018 by Dswanso1 (Talk | contribs) (Disambiguation)

Resources for machine translation between Berik and English.

bkl -> eng evaluation

Statistics about input files


Number of words in reference: 55

Number of words in test: 40

Number of unknown words (marked with a star) in test: 15

Percentage of unknown words: 37.50 %


Edit distance: 54

Word error rate (WER): 98.18 %

Number of position-independent correct words: 1

Position-independent word error rate (PER): 98.18 %
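The figures above can be checked by hand: 54 edits against a 55-word reference gives 54 / 55 = 98.18 % WER. The sketch below shows one way such scores might be computed. It is not the evaluation script's actual code; the function names are hypothetical, and the PER formula (errors = longer length minus position-independent matches, divided by reference length) is an assumed common definition that happens to reproduce the reported 98.18 %.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate in percent (assumed definition)."""
    # Words count as correct regardless of where they appear.
    correct = sum((Counter(ref) & Counter(hyp)).values())
    errors = max(len(ref), len(hyp)) - correct
    return 100.0 * errors / len(ref)


# Toy example using a sentence pair from this page.
reference = "he is sick now".split()
hypothesis = "he sick now".split()
print(wer(reference, hypothesis), per(reference, hypothesis))

# Arithmetic check against the reported counts (55 reference words).
print(round(100.0 * 54 / 55, 2))        # WER from edit distance 54
print(round(100.0 * (55 - 1) / 55, 2))  # PER from 1 correct word
```

Both scores are normalised by the reference length, which is why a 40-word test against a 55-word reference can still approach 100 %.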

Results when unknown-word marks (stars) are not removed


Edit distance: 55

Word Error Rate (WER): 100.00 %

Number of position-independent correct words: 0

Position-independent word error rate (PER): 100.00 %

Statistics about the translation of unknown words


Number of unknown words which were free rides: 1

Percentage of unknown words that were free rides: 6.67 %

Final Evaluation

Initial Precision & Recall

  • Precision: 100.00000%
  • Recall: 78.03738%
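The page does not say how these two figures were computed. Assuming the standard definitions (precision = correct items / items produced, recall = correct items / items expected), a minimal sketch looks like this. The function name and the placeholder items are hypothetical and are not taken from the Berik data.

```python
def precision_recall(gold, predicted):
    """Return (precision, recall) in percent for two sets of items."""
    correct = len(gold & predicted)
    precision = 100.0 * correct / len(predicted) if predicted else 0.0
    recall = 100.0 * correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical placeholder items, not real Berik analyses.
gold = {"item_1", "item_2", "item_3", "item_4"}
predicted = {"item_1", "item_2", "item_3"}

precision, recall = precision_recall(gold, predicted)
print(f"Precision: {precision:.5f}%")
print(f"Recall: {recall:.5f}%")
```

The toy data shows the same pattern as the reported numbers: everything produced is correct (100 % precision), but some expected items are missing (recall below 100 %).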

Adding Stems

136 stems were added.

Disambiguation

Four new disambiguation rules were added to handle "ane", which can mean either "and" or "many".

Structural Transfer

  • Added articles in copular phrases.
    • "Ai taneyan."
    • "I am not child." -> "I am not a child."
  • Added rules for positive copular phrases.
    • "Je bwernabar namwer."
    • "He sick now." -> "He is sick now."
  • Added verbs and tense marking.
    • "Gwirmir wini as damtafa."
    • "Tomorrow woman #prpers #see." -> "Tomorrow woman #prpers will see."
  • Added prepositions for instrumental case.
    • "Je twena ginem tana."
    • "He pig #arrow #kill." -> "He pig killed with an arrow."
  • Corrected word order of intransitive clauses with instrumentals.
    • "Korano atem difnant."
    • "chief #canoe #come." -> "chief came with a canoe."

Final Numbers

Will go here later!