Difference between revisions of "Nuosu and Mandarin"
From LING073
Latest revision as of 21:23, 12 May 2022
Resources for machine translation between Nuosu and Mandarin
Developed Resources
- Nuosu-Mandarin Machine Translation Repository
- Mandarin(zho) Transducer Repository
- Nuosu(iii) Transducer Repository
- Contrastive Grammar Documentation
- Lexical Selection Documentation
- Structural Transfer Documentation
- Final Project: State-of-the-art MT and Transducer
Polished RBMT System
Nuosu Transducer
- Over: bible2.2022.1.31.txt (~249502 characters)
  - Precision: 97.6%
  - Recall: 86.46%
  - Coverage over large corpus: 75.43%
  - Stems in transducer: 300+
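For readers unfamiliar with these figures, the sketch below shows one common way to compute precision, recall, and coverage for a morphological transducer. It is not the evaluation script used for the numbers above. It assumes a hand-annotated gold set mapping each surface form to its correct analyses, and the function names, the toy forms, and the tag strings are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Maps a surface form to the set of analyses proposed for it.
Analyses = Dict[str, Set[str]]


def precision_recall(gold: Analyses, predicted: Analyses) -> Tuple[float, float]:
    """Compare transducer analyses against a gold standard, form by form."""
    tp = fp = fn = 0
    for form, gold_set in gold.items():
        pred_set = predicted.get(form, set())
        tp += len(gold_set & pred_set)   # correct analyses the transducer produced
        fp += len(pred_set - gold_set)   # analyses produced that are not in the gold set
        fn += len(gold_set - pred_set)   # gold analyses the transducer missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def coverage(tokens: List[str], predicted: Analyses) -> float:
    """Share of corpus tokens that receive at least one analysis."""
    if not tokens:
        return 0.0
    analysed = sum(1 for token in tokens if predicted.get(token))
    return analysed / len(tokens)


# Toy data with placeholder forms and tags (hypothetical, not real Nuosu analyses).
gold = {"a": {"a<n>"}, "b": {"b<v>", "b<adv>"}}
predicted = {"a": {"a<n>"}, "b": {"b<v>", "b<n>"}}
p, r = precision_recall(gold, predicted)
c = coverage(["a", "b", "c", "a"], predicted)
```

Under this definition, coverage only checks that a token gets some analysis, not that the analysis is right, which is why it is reported separately from precision and recall.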
MT iii->zho
- Over: iii.test.txt (30+ sentences)
  - Word Error Rate (WER): 38.6%
  - Position-independent word error rate (PER): 22.81%
  - Coverage: 78.5%
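The two error rates above can be illustrated with a minimal sketch. WER is taken here as the word-level edit distance divided by the reference length. PER compares the two sentences as bags of words, so reordering is not penalised. This follows a common textbook definition and may differ in detail from the tool that produced the figures above. How the Mandarin output is split into tokens (words or characters) is also an assumption, and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: List[str], hyp: List[str]) -> float:
    """Word error rate: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def per(ref: List[str], hyp: List[str]) -> float:
    """Position-independent error rate: word order is ignored."""
    if not ref:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Placeholder tokens: the hypothesis swaps two words and gets one word wrong.
reference = "a b c d".split()
hypothesis = "b a c e".split()
print(wer(reference, hypothesis))  # 0.75
print(per(reference, hypothesis))  # 0.25
```

Because PER ignores word order, it is never higher than WER under these definitions. A large gap between the two, as in the figures above, suggests that many of the remaining errors are reordering errors rather than wrong word choices.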
Additions
Disambiguation
- Added a rule that distinguishes adverbs from main verbs (e.g. ꃅ)
- Added a rule that distinguishes exhaustion particles from main verbs (e.g. ꌐ)
- Added a rule that distinguishes end-of-sentence particles from main verbs (e.g. ꄉ)
Structural Transfer
- Added one more sentence for each of the five rules, bringing the total to 10 sentences
- Added a rule that adds an adjective-specific marker when translating into Mandarin
- Added a rule that specifies the explicit topic marker in Nuosu
- Changed the rule that specifies the organization of NPs and ClPs
Adding Stems
- Added 100 new stems