Miskito and English
From LING073
Latest revision as of 03:20, 23 May 2021
Note: Resources for machine translation between Miskito (https://wikis.swarthmore.edu/ling073/Miskito) and English (https://wikis.swarthmore.edu/ling073/English)
Developed Resources
- Miq-Eng lexical transfer repo: https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng
- Miq transducer repo: https://github.swarthmore.edu/Ling073-sp21/ling073-miq
- apertium-eng repo: https://github.com/apertium/apertium-eng
- Corpus repo: https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng-corpus
Initial Miq → Eng Evaluation
Corpus Coverage
Monolingual coverage: 52 / 116 (~0.44827586206896551724)
Bilingual coverage: 37 / 118 (~0.31355932203389830508)
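The coverage figures above are ratios of analysed tokens to total tokens. As a rough sketch of how such a ratio can be computed, the snippet below counts lexical units in Apertium stream format (`^surface/analysis$`) and treats any unit whose analysis starts with `*` as unknown. The function name `coverage` and the sample string are hypothetical, and the script actually used for this page may count tokens differently (for example, by excluding punctuation).

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text: str) -> tuple:
    """Return (known, total, ratio) for a string of analysed lexical units."""
    known = 0
    total = 0
    for _surface, analyses in LEXICAL_UNIT.findall(analysed_text):
        total += 1
        # Apertium marks a form with no analysis by prefixing it with '*'.
        if not analyses.startswith("*"):
            known += 1
    ratio = known / total if total else 0.0
    return known, total, ratio


# Hypothetical analyser output: three known units and one unknown unit.
sample = "^Yang/Yang<prn><p1>$ ^wal/wal<post>$ ^xyz/*xyz$^./.<sent>$"
print(coverage(sample))
```

Under the same definition, the monolingual figure above is simply 52 / 116, which rounds to 0.4483.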
Sentence Analysis
Sentence 1
Yang wal wîna kalila piras.
- Intended: I don’t eat meat and chicken.
- ^Yang<prn><p1>/I<prn><p1><sg>$ ^wal<post>/with<pr>$ ^wîna<n>/meat<n>$ ^kalila<n>/chicken<n>$ ^piaia<v><pres><neg>/eat<vblex><pres><neg>$^.<sent>/.<sent>$
- #I with #meat #chicken #eat
Sentence 2
Yang siknis brisna mango.
- Intended: I’m allergic to mangos.
- ^Yang<prn><p1>/I<prn><p1><sg>$ ^siknis<n>/allergy<n>$ ^briaia<v><pres><p1>/have<vbhaver><pres><p1>$ ^mango<n>/mango<n>$^.<sent>/.<sent>$
- #I #allergy #have #mango
Sentence 3
Dia prais brisma cuartocana timia kumi?
- Intended: How much is a room?
- ^Dia<det><itg>/What<det><itg>$ ^prais<n>/cost<n>$ ^briaia<v><pres><p2>/have<vbhaver><pres><p2>$ ^cuartocana<n>/room<n>$ ^timia<n>/night<n>$ ^kumi<adj>/one<adj>$^?<sent>/?<sent>$^.<sent>/.<sent>$
- #What #cost #have #room #one #night?
Sentence 4
- Intended: Is there a place to sleep?
- Lexical transfer: *Manuapo *ansara #place *barsara *yapaya?
- Miskito: Manuapo ansara pleiski barsara yapaya?
Sentence 5
- Intended: Is there a place to wash clothes?
- Lexical transfer: *Nara *pleis *kum *apo *cuala *tuskaya?
- Miskito: Nara pleis kum apo cuala tuskaya?
Sentence 6
- Intended: Tomorrow, I will stay at this lovely hotel.
- Lexical transfer: #I *yauhka *takash *keisna *naha *hotel *pain.
- Miskito: Yang yauhka takash keisna naha hotel pain.
Sentence 7
- Intended: Is there a place to rent bikes?
- Lexical transfer: *Pleeskum *barsekei *baisikil *rin *monaya?
- Miskito: Pleeskum barsekei baisikil rin monaya?
Sentence 8
- Intended: What are you doing tonight?
- Lexical transfer: *Dia *dow *keisma *na #night.
- Miskito: Dia dow keisma na timia.
Sentence 9
- Intended: What do you think of the government?
- Lexical transfer: #You *dia *lukisma *naho *gabament?
- Miskito: Man dia lukisma naho gabament?
Sentence 10
- Intended: Today is Friday.
- Lexical transfer: *Naiwa *lika *praidi.
- Miskito: Naiwa lika praidi.
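The outputs above use the usual Apertium debugging prefixes: `*` marks a word the analyser did not recognise, `@` marks a lemma missing from the bilingual dictionary, and `#` marks a lexical unit the generator could not produce a surface form for. A minimal sketch for tallying these markers per sentence follows; the function name `count_markers` and the label strings are illustrative choices, not part of any Apertium tool.

```python
from collections import Counter

# Standard Apertium output prefixes and an illustrative label for each.
MARKERS = {
    "*": "unknown",       # no morphological analysis
    "@": "untranslated",  # no bilingual dictionary entry
    "#": "ungenerated",   # generation failed
}


def count_markers(output: str) -> Counter:
    """Count whitespace-separated tokens by their leading debug marker."""
    counts = Counter()
    for token in output.split():
        counts[MARKERS.get(token[0], "ok")] += 1
    return counts


# Sentence 4 from the analysis above.
print(count_markers("*Manuapo *ansara #place *barsara *yapaya?"))
```

For Sentence 4 this reports four unknown tokens and one generation failure, which matches a visual count of the markers.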
Final Evaluation
Additions
- 200+ stems added
- 2 new lexical selection rules
  - sangni
  - taya
- 3 new disambiguation rules in the miq transducer for possessive nouns/verbs and object pronouns
- 1 new transfer rule for possessive nouns
- Also debugged the plural-noun transfer rule
Evaluations
- Precision: 100.00000%
- Recall: 87.00787%
- Coverage over large corpus: ~70.08%
- Tokens in corpus: 13,819
- Stems in transducer: 344
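A precision of 100% with a recall below 100% means that every analysis the transducer produced was correct, but some expected analyses were missing. The sketch below illustrates this under the assumption that evaluation compares a set of gold-standard (form, analysis) pairs against the pairs the transducer produces; the pairs shown are hypothetical examples and not taken from the actual test data.

```python
def precision_recall(expected: set, produced: set) -> tuple:
    """Return (precision, recall) for produced pairs against expected pairs."""
    true_positives = len(expected & produced)
    precision = true_positives / len(produced) if produced else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical gold-standard pairs and transducer output.
expected = {
    ("yang", "yang<prn><p1>"),
    ("wal", "wal<post>"),
    ("wîna", "wîna<n>"),
    ("kalila", "kalila<n>"),
}
produced = {
    ("yang", "yang<prn><p1>"),
    ("wal", "wal<post>"),
    ("wîna", "wîna<n>"),
}

print(precision_recall(expected, produced))
```

Here nothing incorrect is produced, so precision is 1.0, while one expected pair is missing, so recall is 0.75.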
MT miq → eng
Tests run on miq.longer.txt from our bilingual repo
- Word error rate (WER): 91.11 %
- Position-independent word error rate (PER): 73.33 %
- Percentage of unknown words: 3.53 %
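For readers unfamiliar with these metrics, here is a minimal sketch of one common way to compute them. WER is the word-level edit distance between the system output and a reference translation, divided by the reference length. PER ignores word order and compares the two sentences as bags of words. The function names and the example sentences are illustrative, and the evaluation tool that produced the figures above may use a slightly different PER definition or tokenisation.

```python
from collections import Counter


def word_edit_distance(ref: list, hyp: list) -> int:
    """Levenshtein distance over word lists (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    return word_edit_distance(ref, hyp) / len(ref)


def per(reference: str, hypothesis: str) -> float:
    """Position-independent error rate, using bag-of-words overlap."""
    ref, hyp = reference.split(), hypothesis.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Hypothetical reference and system output, loosely based on Sentence 1.
reference = "I do not eat meat and chicken"
hypothesis = "I with meat chicken eat"
print(f"WER: {wer(reference, hypothesis):.2%}")
print(f"PER: {per(reference, hypothesis):.2%}")
```

Because PER disregards word order, it is never higher than WER under these definitions, which is consistent with the gap between the two figures reported here.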
Trimmed coverage over miq.longer.txt
- coverage: 95 / 98 (~0.96938775510204081633)
MT eng → miq
Tests run on eng.longer.txt from our bilingual repo
- Word error rate (WER): 100.00 %
- Position-independent word error rate (PER): 96.55 %
- Percentage of unknown words: 46.67 %
Trimmed coverage over eng.longer.txt
- coverage: 67 / 109 (~0.61467889908256880734)