Difference between revisions of "Magahi and English"

From LING073
 
(Final Evaluation)
 
(17 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
Resources for machine translation between English and [https://wikis.swarthmore.edu/ling073/Magahi Magahi]
 
[[Category:Sp21_TranslationPairs]][[Category:English]][[Category:Magahi]]
 
 +
 +
= External resources =
 +
* [https://github.swarthmore.edu/Ling073-sp21/ling073-mag-eng Language Pair Repository]
 +
** [https://wikis.swarthmore.edu/ling073/Magahi_to_English_Evaluation Evaluation of Translator]
 +
* [https://github.swarthmore.edu/Ling073-sp21/ling073-mag Magahi Transducer Repository]
 +
* [https://github.com/apertium/apertium-eng English Transducer Repository]
 +
* [https://wikis.swarthmore.edu/ling073/Magahi_and_English/Contrastive_Grammar Contrastive Grammar]
 +
* Corpus Repositories
 +
** [https://github.swarthmore.edu/Ling073-sp21/ling073-mag-eng-corpus Parallel Corpus]
 +
** [https://github.swarthmore.edu/Ling073-sp21/ling073-mag-corpus Just Magahi]
 +
 +
= Additions =
 +
* 106 new words
 +
* 3 new morphological disambiguation rules
 +
* 3 new transfer rules: objects, adverbs, pronouns
 +
 +
= Final Evaluation =
 +
* Precision and Recall
 +
** Precision: 96.58120%
 +
** Recall: 83.91089%
 +
* Coverage: 47.103343307228314662%
 +
* Words in large corpus: 3,678,603
 +
* Stems in transducer: 359
 +
* WER over longer: 95.79%
 +
* PER over longer: 80.06%
 +
* Proportion of stems translated correctly: 80.16%
 +
* Trimmed coverage
 +
** longer: 31.049624650363566%
 +
** large: 80.37135278514589%
 +
* Number of tokens in
 +
** longer: 321
 +
** large: 2,661,822

Latest revision as of 12:07, 10 May 2021

Resources for machine translation between English and Magahi

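External resources

  • Language Pair Repository: https://github.swarthmore.edu/Ling073-sp21/ling073-mag-eng
    • Evaluation of Translator: https://wikis.swarthmore.edu/ling073/Magahi_to_English_Evaluation
  • Magahi Transducer Repository: https://github.swarthmore.edu/Ling073-sp21/ling073-mag
  • English Transducer Repository: https://github.com/apertium/apertium-eng
  • Contrastive Grammar: https://wikis.swarthmore.edu/ling073/Magahi_and_English/Contrastive_Grammar
  • Corpus Repositories
    • Parallel Corpus: https://github.swarthmore.edu/Ling073-sp21/ling073-mag-eng-corpus
    • Just Magahi: https://github.swarthmore.edu/Ling073-sp21/ling073-mag-corpus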

Additions

  • 106 new words
  • 3 new morphological disambiguation rules
  • 3 new transfer rules: objects, adverbs, pronouns

Final Evaluation

  • Precision and Recall
    • Precision: 96.58120%
    • Recall: 83.91089%
  • Coverage: 47.103343307228314662%
  • Words in large corpus: 3,678,603
  • Stems in transducer: 359
  • WER over longer: 95.79%
  • PER over longer: 80.06%
  • Proportion of stems translated correctly: 80.16%
  • Trimmed coverage
    • longer: 31.049624650363566%
    • large: 80.37135278514589%
  • Number of tokens in
    • longer: 321
    • large: 2,661,822
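
The page reports WER and PER without saying how they were computed. As a rough illustration only, the sketch below assumes the common textbook definitions: WER as the word-level edit distance divided by the reference length, and PER as the same idea with word order ignored. The function names and the whitespace tokenization are assumptions made for this example, not the tooling actually used for the figures above.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate: like WER, but ignoring word order."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Words shared by both sides, counted with multiplicity.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    # Extra hypothesis words are penalized so over-generation is not free.
    return 1 - (matches - max(0, len(hyp) - len(ref))) / len(ref)


# Toy (non-Magahi) example: same words, two of them swapped.
print(wer("a b c d", "b a c d"))  # 0.5
print(per("a b c d", "b a c d"))  # 0.0
```

Under these definitions PER ignores reordering, which is consistent with the PER reported above being lower than the WER.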