Difference between revisions of "Nuosu and Mandarin"

From LING073
(External Resources)
(Resources for machine translation between Nuosu and English)
 
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
= Resources for machine translation between Nuosu and English =
 
  
=== External Resources  ===
+
=== Developed Resources  ===
 
*[https://github.com/AgnesWang000/Nuosu-Mandarin-RBMT Nuosu-Mandarin Machine Translation Repository]
 
*[https://github.swarthmore.edu/Ling073-sp22/apertium-zho Mandarin(zho) Transducer Repository]
+
*[https://github.com/AgnesWang000/Apertium-zho Mandarin(zho) Transducer Repository]
*[https://github.com/AgnesWang000/Nuosu-Transducer Nuosu(iii) Transducer]
+
*[https://github.com/AgnesWang000/Nuosu-Transducer Nuosu(iii) Transducer Repository]
 +
*[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Contrastive_grammar Contrastive Grammar Documentation]
 +
*[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Lexical_selection Lexical Selection Documentation]
 +
*[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Structural_transfer Structural Transfer Documentation]
 +
*[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Final_Project#Links_Public_Git_Repo_and_Poster Final Project: State-of-the-art MT and Transducer]
 +
 
 +
==Polished RBMT System==
 +
 
 +
=== Nuosu Transducer ===
 +
* Over : bible2.2022.1.31.txt (~249502 characters)
 +
** Precision: 97.6%
 +
** Recall: 86.46%
 +
** Coverage over large corpus: 75.43%
 +
** Stems in transducer: 300+
 +
 
 +
=== MT iii->zho ===
 +
* Over : iii.test.txt (30+ sentences)
 +
** Word Error Rate (WER):  38.6%
 +
** Position-independent word error rate (PER):  22.81%
 +
** Coverage: 78.5%
  
 
== Additions ==
 
Line 21: Line 40:
 
===Adding Stems===
 
 
* Added 100 new stems
 
 
==Polished RBMT System==
 
 
=== Nuosu Transducer ===
 
* Precision: %
 
* Recall: %
 
* Coverage over large corpus: 182179 / 243953 (~0.74677909269408451628) remaining unknown forms: 61774
 
* Stems in transducer: 276
 
 
 
=== MT iii->zho ===
 
* Over : iii.longer.text
 
** Word Error Rate (WER):  %
 
** Position-independent word error rate (PER):  %
 
** Percentage of unknown words:  %
 
** Number of position-independent correct words:
 
** Coverage:
 
 
 
 
 
 
  
  
 
[[Category: Nuosu]] [[Category: sp22_TranslationPairs]]
 

Latest revision as of 21:23, 12 May 2022

Resources for machine translation between Nuosu and Mandarin

Developed Resources

Polished RBMT System

Nuosu Transducer

  • Over: bible2.2022.1.31.txt (~249502 characters)
    • Precision: 97.6%
    • Recall: 86.46%
    • Coverage over large corpus: 75.43%
    • Stems in transducer: 300+
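
As a rough illustration of how figures like these are usually derived, the sketch below computes coverage, precision, and recall from raw counts using the standard definitions. The function names and the precision/recall counts in the usage lines are hypothetical. The coverage counts (182179 analysed forms out of 243953) are the ones recorded in an earlier revision of this page. The evaluation scripts actually used for the numbers above are not specified here, so this is a sketch under those assumptions rather than the project's own tooling.

```python
def coverage(known_forms: int, total_forms: int) -> float:
    """Fraction of corpus forms that receive at least one analysis."""
    if total_forms <= 0:
        raise ValueError("total_forms must be positive")
    return known_forms / total_forms


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Standard precision and recall from analysis counts.

    true_pos:  analyses produced by the transducer that are in the gold standard
    false_pos: analyses produced by the transducer that are not in the gold standard
    false_neg: gold-standard analyses the transducer failed to produce
    """
    produced = true_pos + false_pos
    expected = true_pos + false_neg
    precision = true_pos / produced if produced else 0.0
    recall = true_pos / expected if expected else 0.0
    return precision, recall


if __name__ == "__main__":
    # Counts taken from an earlier revision of this page.
    print(f"coverage: {coverage(182179, 243953):.2%}")

    # Hypothetical counts, for illustration only.
    p, r = precision_recall(true_pos=8, false_pos=2, false_neg=2)
    print(f"precision: {p:.2%}, recall: {r:.2%}")
```

High precision with lower recall, as reported above, would mean that most analyses the transducer produces are correct while some expected analyses are still missing.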

MT iii->zho

  • Over: iii.test.txt (30+ sentences)
    • Word Error Rate (WER): 38.6%
    • Position-independent word error rate (PER): 22.81%
    • Coverage: 78.5%
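
To make the two error rates concrete, here is a minimal sketch of WER and PER for a single sentence pair. It assumes both the reference and the system output are already tokenized and separated by whitespace (which for Mandarin requires a segmentation step not shown here), and it uses one common formulation of PER based on bag-of-words overlap. The function names are hypothetical, and the scores above may have been produced by a different tool with slightly different conventions.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def position_independent_error_rate(reference: str, hypothesis: str) -> float:
    """Like WER, but word order is ignored: only the multiset of words matters."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    # Number of tokens shared by both sides, counted with multiplicity.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    errors = max(len(ref), len(hyp)) - matches
    return errors / len(ref)


if __name__ == "__main__":
    # Toy tokens, not taken from iii.test.txt.
    reference = "a b c d"
    hypothesis = "b a c d"
    print(word_error_rate(reference, hypothesis))
    print(position_independent_error_rate(reference, hypothesis))
```

Because PER ignores word order, it is never higher than WER for the same sentence pair. A large gap between the two, as in the scores above, suggests that many remaining errors involve word order rather than word choice.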

Additions

Disambiguation

  • Added a rule that distinguishes adverbs from main verbs (e.g. ꃅ)
  • Added a rule that distinguishes exhaustion particles from main verbs (e.g. ꌐ)
  • Added a rule that distinguishes end-of-sentence particles from main verbs (e.g. ꄉ)

Structural Transfer

  • Added one more sentence for each of the five rules, bringing the total to 10 sentences
  • Added a rule that inserts an adjective-specific marker when translating into Mandarin
  • Added a rule that specifies the explicit topic marker in Nuosu
  • Changed the rule that specifies the organization of NPs and ClPs

Adding Stems

  • Added 100 new stems