Difference between revisions of "Nuosu and Mandarin"
From LING073
(→External Resources) |
(→Resources for machine translation between Nuosu and English) |
||
(10 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
= Resources for machine translation between Nuosu and English = | = Resources for machine translation between Nuosu and English = | ||
− | === | + | === Developed Resources === |
*[https://github.com/AgnesWang000/Nuosu-Mandarin-RBMT Nuosu-Mandarin Machine Translation Repository] | *[https://github.com/AgnesWang000/Nuosu-Mandarin-RBMT Nuosu-Mandarin Machine Translation Repository] | ||
− | *[https://github. | + | *[https://github.com/AgnesWang000/Apertium-zho Mandarin(zho) Transducer Repository] |
− | *[https://github.com/AgnesWang000/Nuosu-Transducer Nuosu(iii) Transducer] | + | *[https://github.com/AgnesWang000/Nuosu-Transducer Nuosu(iii) Transducer Repository] |
+ | *[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Contrastive_grammar Contrastive Grammar Documentation] | ||
+ | *[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Lexical_selection Lexical Selection Documentation] | ||
+ | *[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Structural_transfer Structural Transfer Documentation] | ||
+ | *[https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Final_Project#Links_Public_Git_Repo_and_Poster Final Project: State-of-the-art MT and Transducer] | ||
+ | |||
+ | ==Polished RBMT System== | ||
+ | |||
+ | === Nuosu Transducer === | ||
+ | * Over : bible2.2022.1.31.txt (~249502 characters) | ||
+ | ** Precision: 97.6% | ||
+ | ** Recall: 86.46% | ||
+ | ** Coverage over large corpus: 75.43% | ||
+ | ** Stems in transducer: 300+ | ||
+ | |||
+ | === MT iii->zho === | ||
+ | * Over : iii.test.txt (30+ sentences) | ||
+ | ** Word Error Rate (WER): 38.6% | ||
+ | ** Position-independent word error rate (PER): 22.81% | ||
+ | ** Coverage: 78.5% | ||
== Additions == | == Additions == | ||
Line 21: | Line 40: | ||
===Adding Stems=== | ===Adding Stems=== | ||
* Added 100 new stems | * Added 100 new stems | ||
[[Category: Nuosu]] [[Category: sp22_TranslationPairs]] | [[Category: Nuosu]] [[Category: sp22_TranslationPairs]] |
Latest revision as of 21:23, 12 May 2022
Resources for machine translation between Nuosu and Mandarin
Developed Resources
- Nuosu-Mandarin Machine Translation Repository
- Mandarin(zho) Transducer Repository
- Nuosu(iii) Transducer Repository
- Contrastive Grammar Documentation
- Lexical Selection Documentation
- Structural Transfer Documentation
- Final Project: State-of-the-art MT and Transducer
Polished RBMT System
Nuosu Transducer
- Evaluated over: bible2.2022.1.31.txt (~249,502 characters)
  - Precision: 97.6%
  - Recall: 86.46%
  - Coverage over large corpus: 75.43%
  - Stems in transducer: 300+
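For readers unfamiliar with these figures, the following is a minimal sketch of how precision, recall, and coverage are commonly computed for a morphological transducer. It is not the evaluation script used for this project. The function names, the `(form, analysis)` pair representation, and the placeholder forms and tags are all assumptions made for illustration.

```python
def precision_recall(gold, predicted):
    """Compare two sets of (surface form, analysis) pairs.

    precision = correct predicted pairs / all predicted pairs
    recall    = correct predicted pairs / all gold pairs
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def coverage(analyses_per_token):
    """Fraction of corpus tokens that receive at least one analysis.

    `analyses_per_token` holds one list of analyses per token;
    an empty list means the transducer did not recognise the token.
    """
    if not analyses_per_token:
        return 0.0
    analysed = sum(1 for analyses in analyses_per_token if analyses)
    return analysed / len(analyses_per_token)


# Hypothetical data: ASCII placeholders stand in for Nuosu forms and tags.
gold = {("w1", "w1<n>"), ("w2", "w2<v>"), ("w3", "w3<adv>"), ("w4", "w4<n>")}
predicted = {("w1", "w1<n>"), ("w2", "w2<v>"), ("w2", "w2<n>")}
p, r = precision_recall(gold, predicted)

corpus = [["w1<n>"], [], ["w2<v>", "w2<n>"], ["w3<adv>"]]
c = coverage(corpus)

print(f"precision={p:.2%} recall={r:.2%} coverage={c:.2%}")
```

Under this reading, a high precision with a lower recall, as reported above, would mean that the analyses the transducer produces are mostly correct but that some expected analyses are still missing.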
MT iii->zho
- Evaluated over: iii.test.txt (30+ sentences)
  - Word Error Rate (WER): 38.6%
  - Position-independent word error rate (PER): 22.81%
  - Coverage: 78.5%
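The two error rates above differ only in whether word order counts. The sketch below shows one common formulation of each and is not the project's actual scoring tool. WER is the word-level edit distance divided by the reference length. PER compares the two sentences as bags of words. The function names are hypothetical, and the inputs are assumed to be already segmented into tokens, which for Mandarin requires a separate segmentation step.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: edit distance relative to the reference length."""
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate: word order is ignored."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Hypothetical token sequences: same words, two of them swapped.
reference = ["a", "b", "c", "d"]
hypothesis = ["b", "a", "c", "d"]
print(wer(reference, hypothesis), per(reference, hypothesis))
```

With this formulation, a PER noticeably lower than the WER, as in the figures above, suggests that a share of the errors come from word order rather than from wrong word choices.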
Additions
Disambiguation
- Added a rule that distinguishes adverbs from main verbs (e.g. ꃅ)
- Added a rule that distinguishes exhaustion particles from main verbs (e.g. ꌐ)
- Added a rule that distinguishes end-of-sentence particles from main verbs (e.g. ꄉ)
Structural Transfer
- Added one more sentence for each of the five rules, bringing the total to 10 sentences
- Added a rule that inserts an adjective-specific marker when translating to Mandarin
- Added a rule that specifies the explicit topic marker in Nuosu
- Changed the rule that specifies the organization of NPs and ClPs
Adding Stems
- Added 100 new stems